Presented by

The world’s largest AI community. Uniting 200k+ pioneers across 100+ global forums. Building the human layer for the AI era.

🧠 The AI Collective, MOX SF | Beyond Benchmarks: From Leaderboards to Real ROI—Selecting LLMs That Fit Your Use Case

Name: 🧠 The AI Collective, MOX SF | Beyond Benchmarks: From Leaderboards to Real ROI—Selecting LLMs That Fit Your Use Case
Start: 2025-12-10T18:00:00.000-08:00
End: 2025-12-10T20:30:00.000-08:00
Location: San Francisco, California

About Event

Are you choosing your LLMs based on benchmark leaderboards? Has that approach failed you? There is a reason.

Every week, new models top the charts on SWE-Bench, MMLU, and HumanEval. But here's the uncomfortable truth: these benchmarks weren't designed for your use case, your data, your workflows, or your success criteria.

Join us for a provocative discussion on why the industry's most celebrated benchmarks may mislead enterprises into poor model selection decisions—and what you should be doing instead.

Your Hosts:

Sinan Ozdemir: serial entrepreneur, author and advisor

Kevin Miao: Senior ML research engineer, Apple

What You'll Learn:

• The Disconnect: Why public benchmarks optimize for academic problems, not business outcomes

• Hidden Costs: How choosing models based on leaderboard rankings leads to failed implementations and wasted resources

• The Custom Benchmark Imperative: Why every organization needs evaluation frameworks tailored to their specific domain, data, and objectives

• Practical Framework: Guidance for building internal benchmarks that actually predict model performance in production

• Real Case Studies: Examples from organizations that shifted from public benchmarks to custom evaluation—and the dramatic differences they discovered

Who Should Attend:

Engineering leaders, AI/ML practitioners, product managers, and technical decision-makers responsible for selecting and implementing LLMs in enterprise environments.

As always, food and drink, engaging conversation, and incredible company will all be provided!

Sponsored by: revela.io

Revela is an AI dev shop working exclusively with the next generation of high-impact SF startups. They're deeply technical and obsessed with your success. Revela embeds within your team to bridge the gap from MVP to production-grade AI through model fine-tuning, pruning, and robust infrastructure.

Please be advised: Unfortunately, space is very limited at these community events and we cannot always accept everyone we would like to. If you are not accepted to this event, please keep applying! We appreciate your application tremendously and look forward to seeing you at a future event very soon!

The AI Collective is a global non-profit community uniting 100,000+ pioneers – founders, researchers, operators, and investors – exploring the frontier of AI in major tech hubs worldwide. Through events, workshops, and community-led research, we empower the AI ecosystem to collaboratively steer AI’s future toward trust, openness, and human flourishing.

All attendees and organizers at events affiliated with The AI Collective agree to our privacy policy and are subject to our code of conduct.

Location

San Francisco, California
