

Beyond Evals: Post-Training Agents in the Sandbox Era
The Lowdown 👋
Agent workflows have made huge progress on evals and benchmarks, but getting them to work reliably in production is still a mess. What passes an eval often breaks under real-world conditions: missing context, edge cases, latency constraints, permissions, and unpredictable user behavior.
This event focuses on what comes after evals: post-training in sandboxed environments. Instead of relying purely on static benchmarks, we look at how to build “training gyms” where agents can interact with realistic systems, generate traces, and iteratively improve. The goal is to close the loop between evals and post-training, so that agents don’t just score well but actually perform in production.
We’ll cover how teams are designing these sandbox environments, generating and validating training data, and aligning evals with real deployment conditions. We’ll also dig into practical questions like:
how to reduce over-reliance on frontier models
how to specialize agents with smaller or open models
how to pressure test for failures before rollout
Speakers will share perspectives from both research and applied systems, including work from SkyRL and Snowflake. Expect concrete examples, tradeoffs, and lessons learned from building agent systems that need to work outside of demos.
The Humans at the Mic 🎤
Karthik Ganesan, AI Research Scientist, Snowflake (Moderator)
Panelists to be announced soon!
The Rundown 🗓️
5:00 PM: Welcome & Light Bites - Grab a drink, enjoy light refreshments, and connect with fellow attendees before the program begins.
5:30 PM: Panel Discussion with Q&A - A conversation about agent post-training, driven by audience questions!
6:30 PM: Community & Drinks - Continue the conversation with our panelists & guests. Light refreshments will be provided.
Is This Your Crowd? 👀
This event is designed for people actively building with AI. If you’re an AI/ML engineer, a researcher, or someone working on applied AI systems, you’ll find the discussion directly relevant. We also welcome founders, developers, and technical operators exploring how to take agent workflows from prototype to production.
Whether you’re iterating on evals, deploying agents in real-world environments, or thinking about post-training strategies, this is a room for practitioners focused on what actually works beyond the demo.
How to Get Here (and Get In) 🚗
Venue | Hosted at the Silicon Valley AI Hub at the Snowflake offices in Menlo Park
Location | 135 Constitution Dr., 8th Floor, Menlo Park, CA 94025
Parking | Free, convenient self-parking is available in the garage directly adjacent to the 135 Building
Arrival Guide | For seamless entry and navigation, please download our arrival guide
About the SVAI Hub ✨
The SVAI Hub is a dedicated community for the AI ecosystem to learn, collaborate, and push the boundaries of what’s possible. From hands-on workshops to expert fireside chats, we bring together startups, developers, and executives for high-signal programming in the heart of Silicon Valley. Our mission is simple: no pitches, no pretense, just AI builders focused on technical depth and a thriving community.