AI Evals w/ Alex Gu: Evaluating AI Systems on Mathematical and Coding Tasks

Hosted by Vals AI & alphaXiv
Registration
Welcome! To join the event, please register below.
About Event

🔬 AI Evals on AlphaXiv

🗓 Wednesday, November 5th, 2025 · 11 AM PT

🎙 Featuring Alex Gu

💬 Moderated Discussion + Q&A

AI Evals Series: Evaluating AI Systems on Mathematical and Coding Tasks

We’re excited to host Alex Gu, a PhD student at MIT whose research focuses on evaluating and improving AI systems on programming and mathematical reasoning. In this session, Alex will share insights from his work on widely used benchmarks and tools, such as LiveCodeBench, LeanDojo, IneqMath, CruxEval, and more. He’ll also discuss how these evaluations inform our understanding of AI capabilities and explore the future of training and assessing AI models on math and code tasks.

This event is virtual. The Zoom link will be shared upon registration. The talk will later be uploaded to AlphaXiv’s YouTube channel.

Hosted by: alphaXiv x Vals AI

AI Evals: join the community

Location
Zoom