

Terminal-Bench with Mike Merrill (Anthropic)
Join us for our third reading group event (dinner included!) in SF at the Gradient office with Mike Merrill, who works on evals at Anthropic!
In our previous two reading groups, researchers from Physical Intelligence and Periodic Labs presented their work on vision-language-action (VLA) models and AI for materials science.
For this session, we'll be discussing a breakthrough project in AI agent benchmarking:
Terminal-Bench: A Benchmark for AI Agents in Terminal Environments (link)
We'll bring together ~25-30 people from different research backgrounds to go through the key sections of the project, discuss the results, and understand the main research outcomes.
Food and drinks will be provided as well! Make sure to sign up soon, since this event fills up extremely fast.