#AIWeekNY: From Invisible Drift to Real Damage: Why Your LLM Is Quietly Breaking

Hosted by Yvette Schmitter & Vlad Shifrin

Virtual

Past Event

About Event

The Problem You're Here to Solve

Your LLM passed every evaluation before deployment. Accuracy benchmarks looked great. Bias testing passed. Compliance validation checked out. You rolled it to production with confidence.

Three months later, customer complaints are up. Your support team is catching more hallucinations. Outputs that used to be consistent now vary wildly. Something changed, but your monitoring tools show green across the board.

Here's what actually happened: Your LLM didn't fail. It drifted.

Not all at once. Not catastrophically. Not in ways your deployment tests could catch. Small shifts in response patterns. Subtle changes in confidence thresholds. Gradual degradation in output quality. Incremental bias creep that compounds over thousands of interactions.

Your monitoring tools were built to catch the LLM falling over. They missed the LLM slowly leaning until it finally tipped.

What You'll Learn

This session destroys the myth that LLM monitoring means watching for crashes. The Shapira research on AI system failures shows what enterprise teams are learning the hard way: catastrophic failures make headlines, but drift destroys value silently.

You'll discover why traditional monitoring fails completely at detecting drift. Your performance dashboards track uptime. Your logging captures errors. Your alerts fire when systems break. None of that catches an LLM that's still running, still responding, still generating outputs, just worse outputs than yesterday.

You'll see the actual drift patterns playing out in production LLMs right now. Response quality degrading by fractions of a percent daily until you've lost 20% accuracy over three months. Bias in outputs shifting so gradually that no single interaction triggers an alert, but the cumulative pattern violates your compliance requirements. Hallucination rates climbing from 2% to 8% while your error logs stay clean because, technically, the LLM never raised an error. It just confidently generated garbage.
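To make the "fractions of a percent daily" arithmetic concrete, here is a minimal sketch. It assumes three months is about 90 days, that the 20% loss is relative and compounds daily, and it uses a hypothetical starting accuracy of 0.95 and a hypothetical day-over-day alert threshold of one point. None of these values come from the session itself.

```python
def daily_loss_rate(total_loss: float, days: int) -> float:
    """Constant daily relative loss that compounds to `total_loss` over `days`."""
    return 1.0 - (1.0 - total_loss) ** (1.0 / days)


def simulate(start: float, rate: float, days: int) -> list[float]:
    """Quality score for day 0..days under a constant daily relative loss."""
    scores = [start]
    for _ in range(days):
        scores.append(scores[-1] * (1.0 - rate))
    return scores


DAYS = 90            # assumption: "three months" is roughly 90 days
START_SCORE = 0.95   # hypothetical accuracy at deployment
ALERT_DROP = 0.01    # hypothetical alert: fire if score drops > 1 point in a day

rate = daily_loss_rate(0.20, DAYS)
scores = simulate(START_SCORE, rate, DAYS)

# Days on which a naive day-over-day alert would have fired.
alerts = [d for d in range(1, len(scores)) if scores[d - 1] - scores[d] > ALERT_DROP]

print(f"daily relative loss: {rate:.4%}")
print(f"score after {DAYS} days: {scores[-1]:.3f}")
print(f"day-over-day alerts fired: {len(alerts)}")
```

Under these assumptions the daily loss is roughly a quarter of a percent, so no single day crosses the alert threshold even though the cumulative loss reaches 20%.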

And you'll learn why detecting small drift before it becomes visible damage is the only viable control strategy.
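One common way to catch small, sustained shifts is a one-sided CUSUM (cumulative sum) detector. The sketch below is an illustration of that general technique, not the method presented in the session. It assumes you already measure a daily hallucination rate on a sample of graded outputs; the linear 2% to 8% series, the slack, and both thresholds are hypothetical values chosen for the example.

```python
from typing import Optional, Sequence


def cusum_first_alarm(
    rates: Sequence[float], baseline: float, slack: float, threshold: float
) -> Optional[int]:
    """Return the first index where the one-sided CUSUM statistic exceeds
    `threshold`, or None if it never does.

    Each step accumulates how far the observation sits above baseline + slack;
    the statistic resets to zero whenever it would go negative.
    """
    s = 0.0
    for day, rate in enumerate(rates):
        s = max(0.0, s + (rate - baseline - slack))
        if s > threshold:
            return day
    return None


def static_first_alarm(rates: Sequence[float], limit: float) -> Optional[int]:
    """Return the first index where a single observation exceeds `limit`."""
    for day, rate in enumerate(rates):
        if rate > limit:
            return day
    return None


# Hypothetical series: hallucination rate rising linearly from 2% to 8% over 90 days.
DAYS = 90
rates = [0.02 + 0.06 * d / DAYS for d in range(DAYS + 1)]

cusum_day = cusum_first_alarm(rates, baseline=0.02, slack=0.005, threshold=0.05)
static_day = static_first_alarm(rates, limit=0.05)  # hypothetical fixed alert level

print(f"CUSUM alarm on day: {cusum_day}")
print(f"Fixed-threshold alarm on day: {static_day}")
```

On this synthetic series the cumulative detector fires well before the fixed threshold does, because it reacts to many small excesses over the baseline rather than waiting for any one day to look bad. Real measurements are noisy, so the slack and threshold would need tuning against your own false-alarm tolerance.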

//

This event is part of #AIWeekNY by Pulse NYC -- a community-led festival celebrating innovation across the AI ecosystem. More at https://pulse.nyc/ai-week/

75 Went