AI Evals w/ Ben Rank and Hardik Bhatnagar

Hosted by Vals AI & alphaXiv

Zoom

About Event

🔬 AI Evals on alphaXiv

🗓 Thursday, February 5th · 11AM PT

🎙 Featuring Ben Rank and Hardik Bhatnagar

💬 Talk + Q&A

AI Evals Series: Can AI agents autonomously post-train LLMs?

We're excited to host Ben Rank and Hardik Bhatnagar to discuss PostTrainBench, a benchmark measuring whether CLI agents can autonomously post-train LLMs.

Ben Rank is a PhD student at the Max Planck Institute, advised by Maksym Andriushchenko. His research focuses on AI safety and risks from automated AI research.

Hardik Bhatnagar is a PhD student at the Tübingen AI Center, advised by Matthias Bethge and Maksym Andriushchenko. He works on understanding LLM capabilities and failure modes, with a focus on long-horizon evaluation and safety.

They will present PostTrainBench and discuss their thoughts on the evals space more broadly.

This event is virtual. The Zoom link will be shared upon registration. The talk will later be uploaded to alphaXiv’s YouTube channel.

Hosted by: alphaXiv x Vals AI
