AI Evals w/ Alex Gu: Evaluating AI Systems on Mathematical and Coding Tasks
About Event
AI Evals on AlphaXiv
Wednesday, November 5th, 2025 · 11AM PT
Featuring Alex Gu
Moderated Discussion + Q&A
AI Evals Series: Evaluating AI Systems on Mathematical and Coding Tasks
βWeβre excited to host Alex Gu, a PhD student at MIT whose research focuses on evaluating and improving AI systems on programming and both formal and informal mathematical reasoning. Alex has been involved in creating widely-used benchmarks and tools, such as LiveCodeBench, BigCodeBench, CRUXEval, LeanDojo, IneqMath, and more. In this session, Alex will share insights on how evaluations can inform our perspective on AI capabilities and explore today's challenges of AI models on math and code tasks.
This event is virtual. The Zoom link will be shared upon registration. The talk will later be uploaded to AlphaXiv's YouTube channel.
Hosted by: alphaXiv x Vals AI
AI Evals: join the community
