Presented by
Frontier Syndicate

Bay Area Frontier Research Club #8 | Stanford University (dinner + paper discussion)

Stanford, CA
About Event

Self-Improving Agents, Multi-Agent Constitution Learning, Research-Grounded AI & Benchmarking Science Agents. Frontier Research Talks + Rigorous Q&A.

Vignesh Baskaran (Hexo Labs) on SIA, Hexo's open-source self-improving agent framework, and what it looks like in production; Rushil Thareja (MBZUAI) on MAC — built with Google DeepMind — a multi-agent system that learns to write and refine its own rulebook; Kalpit Dixit (Paper Lantern, ex-AWS Bedrock) on what changes when AI agents are grounded in 2M+ research papers; and a lightning talk from Steven Dillmann (Stanford PhD, AI for Science) introducing Terminal-Bench Science, a benchmark for evaluating AI agents on real-world computational workflows across the natural sciences.

As always, presentations are kept brief so the room can do what it does best — push hard on the work.

The Bay Area Frontier Research Club is a curated forum for rigorous discussion on how AI is reshaping the scientific research process. We convene experimental researchers, computational scientists, and research engineers across domains to examine concrete work—papers, methods, and workflows—covering literature synthesis, hypothesis generation, experimental design, simulation, analysis, and reproducibility.

For each session, we curate 2–4 papers selected for rigor and discussion value. Presentations are intentionally brief so the majority of time is reserved for questions and critique: assumptions, evaluation methodology, failure modes, and what would constitute convincing evidence. Papers and supporting materials are shared in advance to ensure a high-baseline conversation.

Agenda

5:30pm: Doors open
5:30pm – 6:30pm: Networking + light dinner
6:30pm – 8:00pm: Research presentations + discussion
8:00pm – 8:30pm: Networking

Presenters & topics

Talk #1: Self-Improving Agents

Vignesh Baskaran is co-founder and CTO of Hexo Labs, building the research-infrastructure layer for the next generation of agent systems. SIA is Hexo's self-improving agent framework — going open-source on the night — a system for agents that learn from their own experiments, evaluate their own progress, and refine their methods over time.

SIA sets the throughline for the night: what it actually looks like to build agents that improve themselves in production. The two talks that follow extend the arc — into how those agents govern themselves, and into how they ground themselves in the world's research. Vignesh gives the room a full look at the framework — the architecture, what it has enabled so far, and where it goes next.


Talk #2 (Lightning Talk): Terminal-Bench Science

Steven Dillmann is a PhD student at Stanford University working on AI for Science, co-advised by Sanmi Koyejo (Stanford CS) and Risa Wechsler (Stanford Physics). His recent work includes the Evo 2 genomic foundation model (bioRxiv 2025) and first-authored research on representation learning for time-domain high-energy astrophysics, published in MNRAS. He is affiliated with the Stanford AI Lab, Stanford ICME, KIPAC, and SLAC, and holds an MPhil in Data Intensive Science from Cambridge and an MEng in Aerospace Engineering from Imperial College London. Prior research appointments include Harvard, NASA JPL, ESA, and DLR.

AI agents are evolving from coding assistants to scientific collaborators, yet no rigorous benchmark exists to measure whether they can reliably execute the complex computational workflows that natural scientists carry out in practice. Terminal-Bench Science (TB-Science) is a real-world, cross-disciplinary, open, verifiable benchmark for evaluating AI agents on computational workflows across the natural sciences. Building on the Terminal-Bench methodology, TB-Science targets 100+ expert-authored computational tasks running in containerized environments with deterministic verification, spanning the life, physical, earth, and mathematical sciences. Steven is presenting at the contribution-call stage and is looking for collaborators across the natural sciences to author tasks.

REVIEW THE PRE-READ HERE


Talk #3: Multi-Agent Constitution Learning

Rushil Thareja is a PhD researcher at MBZUAI advised by Prof. Nils Lukas and Prof. Praneeth Vepakomma, focused on private and secure AI agents, with first-authored work at ICLR and ACL. He is also the founding engineer of Agentoid, where he leads work on agentic retrieval.

MAC — built in collaboration with Google DeepMind — tackles a deceptively hard question: can an AI system learn to write its own rulebook? Constitutional AI governs model behavior with a set of natural-language rules — but those rules are normally hand-written by experts, and existing prompt optimizers produce long, brittle prompts that are hard to audit. MAC instead uses a network of specialized agents that iteratively propose, test, accept, and reject explicit, auditable rule updates, keeping only the changes that measurably improve performance. The result outperforms recent prompt-optimization methods by 50%+, matches supervised fine-tuning and GRPO without any parameter updates, and produces rule sets that stay human-readable and auditable.
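For readers new to the idea, the propose, test, accept, reject loop described above can be illustrated with a toy sketch. This is not MAC's implementation: the keyword "rules", the `score` metric, the random `propose` step, and the example data are all hypothetical stand-ins for the LLM agents and benchmarks the paper uses. The only thing it shares with the description is the acceptance rule, keeping a rule update only when it measurably improves the score.

```python
import random

def score(rules, examples):
    """Hypothetical metric: fraction of examples the rule set labels correctly.
    A text is flagged if it contains any rule keyword."""
    correct = 0
    for text, label in examples:
        flagged = any(word in text for word in rules)
        correct += (flagged == label)
    return correct / len(examples)

def propose(rules, vocabulary, rng):
    """Stand-in for a proposer agent: toggle one candidate rule on or off."""
    candidate = set(rules)
    word = rng.choice(vocabulary)
    if word in candidate:
        candidate.remove(word)
    else:
        candidate.add(word)
    return candidate

def learn_rules(examples, vocabulary, steps=200, seed=0):
    """Greedy loop: accept an update only if it strictly improves the score."""
    rng = random.Random(seed)
    rules = set()
    best = score(rules, examples)
    history = [best]
    for _ in range(steps):
        candidate = propose(rules, vocabulary, rng)
        candidate_score = score(candidate, examples)
        if candidate_score > best:      # accept: measurable improvement
            rules, best = candidate, candidate_score
        history.append(best)            # rejected updates leave the rules unchanged
    return rules, best, history

# Toy data (assumed for illustration): flag messages that leak credentials.
EXAMPLES = [
    ("my password is hunter2", True),
    ("the weather is nice", False),
    ("share your ssn please", True),
    ("lunch at noon", False),
]
VOCABULARY = ["password", "ssn", "weather", "lunch"]

if __name__ == "__main__":
    rules, best, _ = learn_rules(EXAMPLES, VOCABULARY)
    print(sorted(rules), best)
```

Because every accepted change is a single explicit rule edit, the final rule set stays short and readable, which is the auditability property the talk description emphasizes.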

REVIEW THE PRE-READ HERE


Talk #4: Research-Grounded AI Agents

Kalpit Dixit is the founder of Paper Lantern. Previously, he was a Senior Applied Scientist at AWS Bedrock, where he led teams delivering trillions of pretraining tokens, RAG capabilities, and multiple AI products into GA. Prior to AWS, he was a high-frequency options trader at Optiver (NL). He holds an MS from Stanford and a BS+MS from IIT Bombay. Paper Lantern creates technology that gives AI agents structured access to the world's research literature: its MCP server distills 2M+ research papers, with methods, tradeoffs, benchmarks, and implementation guidance flowing directly into the reasoning loop of all popular coding and chat agents.

The case for grounding agents in the literature is open-sourced and falsifiable. Paper Lantern's Autoresearch work shows large improvements in LLM pretraining: public results show a 10% reduction in cost and a 3.2% drop in loss, with the same gains scaling at 100x pretraining compute. The agent stops reaching for the standard playbook and starts from the research frontier. Wednesday's presentation also covers unpublished internal content.

REVIEW THE PRE-READ HERE


Want to present your work?

If you have a research paper you’d like to discuss at one of our next sessions, please submit it for consideration.

SUBMIT YOUR PAPER HERE.


Who should attend

  • Experimental researchers

  • Computational scientists across domains (bio/chem/materials/climate/neuro/physics)

  • Research engineers + lab automation people

  • Folks building tools for literature review, experiment planning, robotics, simulation, or scientific data

If you’ve ever wished research moved faster, you belong here.

Capacity is limited.

We will take photos and short video clips for event recap and promotion. By attending, you consent to being photographed and recorded, and to the use of those images and clips by the organizers on social media and other event marketing channels.


Last Session Recap — Google Ventures, May 20

Our seventh session packed 75+ researchers, founders, and investors into Google Ventures for four frontier talks and Q&A:

  • Bonnie Li (Google DeepMind) — scaling RL compute for LLMs

  • Kanishk Gandhi (Stanford) — the cognitive behaviors that let models self-improve

  • Erica Zhang (Stanford / Jump Trading) — evaluating LLMs when the domain isn't verifiable

  • Vignesh Baskaran (Hexo Labs) — the first public preview of SIA, a self-improving agent framework

Presentation recordings are going up on our YouTube channel, @FrontierResearchClub.


Hosted by

Frontier Syndicate is a private venture circle connecting frontier tech researchers, builders, and investors through curated convenings and early-stage capital. Across the Bay Area, we host a recurring series of research forums, builder nights, and intimate investor dinners — and back exceptional companies emerging from the labs, communities, and technical networks we convene.

Hexo Labs is a neolab for recursive self intelligence, building open agent systems that help scientific discovery take shape.
Hexo is working to make scientific discovery faster, more inspectable, and more capable of translating breakthrough ideas into real-world impact.

On May 21, Hexo will release SIA, its open-source self-improving agent framework: a system for agents that learn from experiments, evaluate their own progress, and refine their methods over time.

BASES (Business Association of Stanford Entrepreneurial Students) is one of the world's largest and most established student-run entrepreneurship organizations. Founded in 1996, it serves as the hub for student entrepreneurship at Stanford University, bridging the gap between academia, innovation, and industry.


Location Details

Jordan Hall (Psychology), Building 420, Room 041 — basement level.

Finding the room: Room 041 is in the basement, in the wing where the Psych building connects to the Math building (Building 380) in an L-shape. Easiest entry is through the courtyard. We will be outside guiding you until 6:30pm.

If you enter through Building 380 — take the elevator or stairs down to the basement, turn right, and Room 041 is straight ahead. You can also enter from the Main Quad / Psych side and head to the basement.

Parking: End of Palm Drive, around the Oval (free after 6:00pm; meter via ParkMobile before then). Overflow: along Palm Drive near Roth Way (free after 4:00pm).
