53 Going

AI Evals w/ Ben Rank and Hardik Bhatnagar

Hosted by Vals AI & alphaXiv
Zoom
Registration
Welcome! To join the event, please register below.
About Event

🔬 AI Evals on AlphaXiv

🗓 Thursday, February 5th · 11AM PT

🎙 Featuring Ben Rank and Hardik Bhatnagar

💬 Talk + Q&A

AI Evals Series: Can AI agents autonomously post-train LLMs?

We're excited to host Ben Rank and Hardik Bhatnagar to discuss PostTrainBench, a benchmark measuring whether CLI agents can autonomously post-train LLMs.

Ben Rank is a PhD student at the Max Planck Institute, advised by Maksym Andriushchenko. His research focuses on AI safety and risks from automated AI research.

Hardik Bhatnagar is a PhD student at the Tübingen AI Center, advised by Matthias Bethge and Maksym Andriushchenko. He works on understanding LLM capabilities and failure modes, with a focus on long-horizon evaluation and safety.

They will present PostTrainBench and discuss their thoughts on the evals space more broadly.

This event is virtual. The Zoom link will be shared upon registration. The talk will later be uploaded to AlphaXiv’s YouTube channel.

Hosted by: alphaXiv x Vals AI

AI Evals: join the community