

Arize Builders Meetup - NYC - Boosting Claude Code performance with prompt learning
Your CLAUDE.md file is more powerful than you think. In this talk, we'll walk through how we applied Prompt Learning — an RL-inspired prompt optimization technique — to improve Claude Code's performance on SWE-Bench Lite by up to 11%, purely by optimizing the system prompt instructions. No fine-tuning, no new tools, no architecture changes. Just better prompts, driven by real performance data and LLM-as-a-judge feedback.
We'll cover the full optimization loop (rollouts → LLM evals → meta-prompting), show results for both general coding improvement and repo-specific specialization, and share practical takeaways you can apply to your own coding agent workflows today.
Agenda
6:00 - 6:30 PM | Check-in & Networking
6:30 - 7:00 PM | Laurie Voss, Arize AI - Boosting Claude Code performance with prompt learning
7:00 - 7:20 PM | Aydrian Howard, Auth0 - Trust, but Verify: Identity and Observability for AI Agents: AI agents that take real actions (reading email, making purchases, querying private documents) need more than a system prompt to be safe. In this session, we'll look at how identity management and observability tooling work together to make AI agents trustworthy: ensuring agents only act within authorized boundaries, and giving you full visibility into every decision they make.
7:20 - 7:50 PM | Paul Butler, Modal AI - Sandboxes Hot Takes: Paul will discuss some of the surprising takeaways from four years of working on agent sandboxes.
7:50 - 8:30 PM | Networking
Food and drinks will be provided. Space is limited—register soon to secure your spot!