Presented by

MLOps Community London

Welcome to the official MLOps Community London chapter 😎

Join us on our #london channel in the MLOps Community Slack: https://mlops.community/join/

The Hidden Layers of AI Systems: Memory, Retrieval, and Search

Name: The Hidden Layers of AI Systems: Memory, Retrieval, and Search
Start: 2026-06-04T17:30:00.000+01:00
End: 2026-06-04T21:00:00.000+01:00
Location: London, United Kingdom

Event Full

If you’d like, you can join the waitlist.

About Event

Most AI systems don’t fail because of the model alone. They fail because of memory, retrieval, context, and evaluation.

Agents accumulate too much context and lose signal. RAG pipelines degrade when retrieval quality is poor. Search systems return results, but not always the right ones. Coding agents struggle to preserve intent across long-running sessions. And evaluating whether an AI system actually found the right information remains one of the hardest problems in production.

This MLOps Community London meetup, sponsored by Qdrant, explores what it takes to build reliable AI systems beyond the model layer.

We’ll hear from engineers working on real-world AI infrastructure, agent memory, retrieval systems, coding agents, and production AI workflows, followed by a speaker panel, audience Q&A, and time to connect with other builders in London.

Topics We’ll Explore

Agentic Memory Architecture

Why memory matters in production AI systems
Avoiding context bloat in long-running agents
When agents should retrieve, remember, or forget
Practical patterns for persistent memory systems

Retrieval and Context Quality

Why naive retrieval leads to brittle AI systems
How to preserve useful context across tasks and sessions
Vector search, hybrid retrieval, and structured specifications
Designing memory layers that scale beyond a single prompt

Evaluation and Reliability

How to detect drift, fake completion, and degraded context
Why “the agent finished” does not always mean the task is done
Evaluating retrieval quality beyond surface-level metrics
Designing AI systems that are easier to inspect, debug, and trust

AI-Native Engineering Workflows

How coding agents use context, tools, and memory
Building harnesses for long-running agent tasks
Declarative context architecture vs retrieval-heavy workflows
What reliable autonomous engineering systems might look like

Agenda

5:30 – 6:15 PM | Doors Open: Arrival & Networking

Check in, grab food and drinks, and meet other engineers, founders, and builders working on AI systems.

6:15 – 6:25 PM | Welcome & Kickoff

Opening remarks from MLOps Community London, including a special community announcement.

6:25 – 6:50 PM | Grow the Data, Shrink the Bill

Agent efficiency through vector search.

AI agents waste tokens in four common ways: context accumulation, payload inefficiency, generative waste, and failure recovery. Stuffing the context window with everything can work, but it is often 50 to 100x more expensive than targeted retrieval.

This talk explores how vector embeddings can increase information density so an orchestrator works only with retrieved context from a vector store. Ewa will share a benchmark experiment across SWE-Bench, LongMemEval, and FinanceBench, comparing agentic grep, Qdrant BM25 sparse retrieval, and Qdrant dense retrieval, with tokens-to-resolution as the primary metric.
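
To make the cost gap concrete, here is a minimal sketch comparing the size of a stuffed context with a top-k retrieved one. It is not the benchmark from the talk: the whitespace token counter, the word-overlap scorer, and all function names are illustrative assumptions standing in for a real tokenizer and a real vector store.

```python
def count_tokens(text: str) -> int:
    """Crude whitespace token count (assumption: a real tokenizer would differ)."""
    return len(text.split())


def overlap_score(query: str, doc: str) -> int:
    """Toy relevance score: how many distinct query words appear in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def stuffed_context(docs: list[str]) -> str:
    """Context stuffing: send every document to the model."""
    return "\n".join(docs)


def retrieved_context(query: str, docs: list[str], k: int = 2) -> str:
    """Targeted retrieval: send only the k highest-scoring documents."""
    ranked = sorted(docs, key=lambda d: overlap_score(query, d), reverse=True)
    return "\n".join(ranked[:k])


def token_ratio(query: str, docs: list[str], k: int = 2) -> float:
    """How many times more tokens stuffing sends than top-k retrieval."""
    stuffed = count_tokens(stuffed_context(docs))
    retrieved = count_tokens(retrieved_context(query, docs, k))
    return stuffed / retrieved
```

With one relevant five-word note among nine five-word filler notes and k=1, the stuffed context is ten times larger than the retrieved one. The ratio grows with the size of the corpus, which is the intuition behind the multipliers quoted above.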

🎤 Ewa Szyszka, DevRel Engineer at Qdrant

Ewa Szyszka is a DevRel Engineer at Qdrant, focused on technical education, developer community, and applied AI systems. She has worked across GenAI evaluations, multi-agent systems, AI gateways, observability, and RAG, with previous DevRel roles at orq.ai and Kilo Code. She has also organised large technical meetups and built tools across cloud, data, and AI workflows.

6:50 – 7:10 PM | Building Your Own Memory Layer for Coding Agents

Coding agents struggle with long-running context, repository-scale retrieval, and maintaining architectural intent across sessions.

This talk explores practical patterns for designing memory systems using vector search, hybrid retrieval, and structured specifications. Shashi will share lessons from building a spec-driven external memory architecture with Qdrant and SpecMem, along with practical guidance for developers building scalable memory layers for their own AI coding agents.
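
As a rough illustration of hybrid retrieval, the sketch below merges a keyword ranking and a dense-vector ranking with reciprocal rank fusion. The abstract does not say which fusion method the talk uses, so treat this as one common option rather than the speaker's approach. The function name and the constant k=60 are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    with rank starting at 1. Documents are returned by descending fused
    score; ties are broken by id so the output is deterministic.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Fusing on ranks rather than raw scores sidesteps the fact that keyword scores and vector similarities live on different scales.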

🎤 Shashi Jagtap, Founder at Superagentic AI

Shashi Jagtap is the founder of Superagentic AI, where he builds developer tools and frameworks for production-grade agentic AI systems. He previously spent nearly six years at Apple working on Xcode and developer tooling, and has close to 20 years of experience across DevEx, automation, mobile infrastructure, and agent engineering. He also organises the London Agentic AI community.

7:10 – 7:20 PM | Break 🍕🍺

Refuel, connect, and continue the conversation.

7:20 – 7:40 PM | Context, Environment, Knowledge: Three Layers on the Road to Autonomous Engineering

Model evolution is a catalyst, not the cure.

This talk covers memory patterns that actually work for long-running, multi-agent tasks: how to avoid drift, how to catch “fake done”, and how context, environment, and knowledge become building blocks on the road to a software factory.

🎤 Dmytro Yaroshenko, Applied AI Engineer at Factory AI

Dmytro Yaroshenko is an Applied AI Engineer at Factory AI, working on practical AI systems for autonomous software engineering. He previously worked at Contextual AI, Snowflake, and Deutsche Bank, with experience spanning applied AI, data engineering, solution architecture, and strategic enterprise AI adoption. His work focuses on moving AI systems from experimentation into reliable production workflows.

7:40 – 8:00 PM | Stop Retrieving. Start Declaring.

A codebase is already a memory system. Every file, folder structure, and architectural decision is context that agents consistently fail to read.

This talk makes the case that retrieval is often a workaround for missing structure, and introduces the declarative context architecture pattern: scoped context files, task blueprints, and a stateful memory harness that carries state across sessions and agent boundaries.

You’ll leave with a clear framework for when to declare versus when to retrieve, and a harness architecture you can build against immediately.
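
One possible reading of "scoped context files" is sketched below: every directory may carry its own context file, and an agent working on a file loads the ones from its enclosing directories, broadest scope first. The abstract does not describe the actual mechanism, so the file name `CONTEXT.md`, the function name, and the lookup rule are all hypothetical.

```python
from pathlib import Path

CONTEXT_FILENAME = "CONTEXT.md"  # hypothetical file name, not taken from the talk


def scoped_context_files(target: Path, repo_root: Path) -> list[Path]:
    """Return the context files that apply to `target`, repo root first.

    Walks from the target's directory up to `repo_root` and collects any
    context file found along the way, so narrower scopes come last.
    """
    target = target.resolve()
    repo_root = repo_root.resolve()
    if target != repo_root and repo_root not in target.parents:
        raise ValueError(f"{target} is not inside {repo_root}")

    found = []
    directory = target if target.is_dir() else target.parent
    while True:
        candidate = directory / CONTEXT_FILENAME
        if candidate.is_file():
            found.append(candidate)
        if directory == repo_root:
            break
        directory = directory.parent
    return list(reversed(found))
```

Under this reading nothing is ranked or embedded: the folder structure alone decides which context an agent sees.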

🎤 Talha Sheikh, AI Engineer at Checkout.com

Talha Sheikh is an AI Software Engineer at Checkout.com, where he works on scalable AI adoption, platform infrastructure, and production AI strategy. He previously worked as a software engineer at TrueLayer and American Express, building financial systems and backend services. Outside work, he is building Vector, a harness designed to help AI agents finish what they start.

8:00 – 8:35 PM | Panel Discussion + Audience Q&A

Join all speakers for an open discussion on memory, retrieval, search, coding agents, and the hidden infrastructure behind reliable AI systems.

Bring your questions.

8:35 – 9:00 PM | Networking

Continue the conversation with speakers, attendees, and other builders working on production AI systems.

9:00 PM onwards | Social

Optional post-event social nearby. Location TBC.

Who Should Attend

AI / ML engineers
Backend and data engineers
Engineers building RAG or agent systems
Engineers working on search, retrieval, memory, or evaluation
Technical founders and product builders
Anyone interested in production AI infrastructure

Please only register if you plan to attend, as capacity is limited.
