AI Evals w/ Shashwat Goel: Measuring AI progress requires rethinking evaluations
About Event
🔬 AI Evals on alphaXiv
🗓 Thursday, October 2nd, 2025 · 11 AM PT
🎙 Featuring Shashwat Goel
💬 Casual Talk + Open Discussion
We are excited to host Shashwat Goel, who will discuss how AI evaluations need to change in tandem with LLM capabilities. He will present his work on generative and long-horizon evaluations in the era of reasoning agents, and share his perspectives on the new capability evaluations needed on the path toward generally intelligent agents. Shashwat is a PhD student co-advised by Jonas Geiping and Douwe Kiela through the ELLIS program at the Max Planck Institute for Intelligent Systems.
🎥 Zoom: https://stanford.zoom.us/j/91936389736?pwd=GI5Kibcjl6UaN9IQOLShBthyaiOIbL.1&from=addon
Hosted by: alphaXiv x Vals AI
