Presented by
Future AGI
AI Engineering and Optimization Platform
Inference Performance as a Competitive Advantage

Zoom
Registration
Approval Required
Your registration is subject to host approval.
Welcome! To join the event, please register below.
About Event

Overview

This session provides an in-depth introduction to FriendliAI and explores how optimized AI inference can become a strategic differentiator for businesses deploying generative AI at scale. Attendees will learn why inference performance is critical in production AI systems, where inference typically consumes 80-90% of GPU resources, and discover practical techniques for achieving faster response times, lower costs, and seamless scalability.

Session Agenda:

  1. Introduction to FriendliAI and the AI Inference Landscape

  2. Why Inference Performance Matters: Speed, Cost, and Scale

  3. Demonstration of the FriendliAI Suite

  4. Real-World Use Cases and Customer Success Stories

  5. Q&A and Discussion

Key Takeaways / Learning Outcomes

Attendees will walk away with:

  • A clear understanding of why inference optimization is critical for production AI applications

  • Knowledge of techniques that can reduce inference costs by up to 90% while delivering faster response times

  • Insights into how continuous batching, speculative decoding, and smart caching accelerate LLM serving

  • Practical guidance on deploying high-performance inference infrastructure at scale

  • Understanding of how to turn inference performance into a competitive business advantage

Who should join?

This session is designed for ML/AI Engineers, MLOps practitioners, and technical teams building and deploying generative AI applications in production environments.

Speakers

Speaker 1: Yunmo Koo (Founding Engineer, FriendliAI)

Speaker 2: Alex Campos (GTM Leader, FriendliAI)

Moderator: Rishav Hada (Applied Scientist, Future AGI)

About FriendliAI

FriendliAI is a generative AI infrastructure company founded in 2021, specializing in high-performance LLM inference. Their flagship product, Friendli Inference Engine, delivers up to 90% cost reduction and 2x+ faster inference through proprietary optimizations including continuous batching (pioneered in their OSDI 2022 Orca paper), speculative decoding, and custom GPU kernels.

About Future AGI

Future AGI is a San Francisco-based AI engineering and optimization platform designed to streamline experimentation, evaluation, optimization, and real-time observability. Traditional AI tools often rely on guesswork due to gaps in data generation, error analysis, and feedback loops. Future AGI eliminates this uncertainty by automating the data layer with multi-modal evaluations, agent optimizations, observability, and synthetic data tools, cutting AI development time by up to 95%.

🌐 Follow us on LinkedIn to get the latest updates on events and new launches.
