
Meetup: Multiple Tactics To Lower Your Inference Costs

About Event

Small Language Models (SLMs) are quietly reshaping how production AI systems are built.

While frontier models dominate the headlines, real-world products increasingly rely on smaller, faster, cheaper models for routing, classification, extraction, and decision-making inside live workflows.

This in-person meetup brings together developers, founders, and AI builders who are deploying SLMs in production systems.

We’ll explore:

  • When to use SLMs vs large models

  • Designing multi-model routing architectures

  • Cost, latency, and reliability tradeoffs

  • Fine-tuning vs prompting vs structured outputs

  • Building eval pipelines for production reliability

  • Real-world case studies from teams shipping today

Whether you're building AI assistants, workflow automation tools, vertical SaaS, or infra products, this session will focus on practical implementation — not hype.

Expect technical depth, live architecture discussions, and candid lessons from builders who care about performance, cost, and real deployment constraints.

Come ready to share what you’re building.

Location
West 38th Street
W 38th St, New York, NY 10018, USA