Welcome to the official MLOps Community London chapter 😎
Join us on our #london channel in the MLOps Community Slack: https://mlops.community/join/

The Hidden Layers of AI Systems: Memory, Retrieval, and Search

Register to See Address
London, United Kingdom
About Event

Most AI systems don’t fail because of the model alone. They fail because of memory, retrieval, context, and evaluation.

Agents accumulate too much context and lose signal. RAG pipelines degrade when retrieval quality is poor. Search systems return results, but not always the right ones. Coding agents struggle to preserve intent across long-running sessions. And evaluating whether an AI system actually found the right information remains one of the hardest problems in production.

This MLOps Community London meetup, sponsored by Qdrant, explores what it takes to build reliable AI systems beyond the model layer.

We’ll hear from engineers working on real-world AI infrastructure, agent memory, retrieval systems, coding agents, and production AI workflows, followed by a speaker panel, audience Q&A, and time to connect with other builders in London.

Topics We’ll Explore

Agentic Memory Architecture

  • Why memory matters in production AI systems

  • Avoiding context bloat in long-running agents

  • When agents should retrieve, remember, or forget

  • Practical patterns for persistent memory systems

Retrieval and Context Quality

  • Why naive retrieval leads to brittle AI systems

  • How to preserve useful context across tasks and sessions

  • Vector search, hybrid retrieval, and structured specifications

  • Designing memory layers that scale beyond a single prompt

Evaluation and Reliability

  • How to detect drift, fake completion, and degraded context

  • Why “the agent finished” does not always mean the task is done

  • Evaluating retrieval quality beyond surface-level metrics

  • Designing AI systems that are easier to inspect, debug, and trust

AI-Native Engineering Workflows

  • How coding agents use context, tools, and memory

  • Building harnesses for long-running agent tasks

  • Declarative context architecture vs retrieval-heavy workflows

  • What reliable autonomous engineering systems might look like


Agenda

5:30 – 6:15 PM | Doors Open: Arrival & Networking

Check in, grab food and drinks, and meet other engineers, founders, and builders working on AI systems.

6:15 – 6:25 PM | Welcome & Kickoff

Opening remarks from MLOps Community London, including a special community announcement.

6:25 – 6:50 PM | Grow the Data, Shrink the Bill

Agent efficiency through vector search.

AI agents waste tokens in four common ways: context accumulation, payload inefficiency, generative waste, and failure recovery. Stuffing the context window with everything can work, but it is often 50 to 100x more expensive than targeted retrieval.

This talk explores how vector embeddings can increase information density so an orchestrator works only with retrieved context from a vector store. Ewa will share a benchmark experiment across SWE-Bench, LongMemEval, and FinanceBench, comparing agentic grep, Qdrant BM25 sparse retrieval, and Qdrant dense retrieval, with tokens-to-resolution as the primary metric.
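
To make the cost argument concrete, here is a minimal sketch that compares the size of a "stuff everything" context with a retrieved top-k context. It is not the benchmark from the talk: the documents, the query, the whitespace token count, and the term-overlap scorer are all illustrative assumptions standing in for a real tokenizer and for BM25 or dense retrieval.

```python
from collections import Counter


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for model tokens.
    return len(text.split())


def overlap_score(query: str, doc: str) -> int:
    # Toy lexical score: how many query terms also occur in the document.
    # A real system would use BM25 or embedding similarity instead.
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum(min(q[term], d[term]) for term in q)


def retrieve(query: str, docs: list[str], k: int) -> list[str]:
    # Rank documents by score and keep the k best.
    ranked = sorted(docs, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:k]


def context_cost(query: str, docs: list[str], k: int) -> tuple[int, int]:
    # Returns (tokens if every document is sent, tokens if only top-k are sent).
    full = sum(count_tokens(doc) for doc in docs)
    targeted = sum(count_tokens(doc) for doc in retrieve(query, docs, k))
    return full, targeted


# Hypothetical mini-corpus, for illustration only.
DOCS = [
    "the payment service retries failed card charges three times",
    "the login page uses a session cookie",
    "nightly jobs rebuild the search index",
    "retries in the payment service use exponential backoff",
]
QUERY = "payment service retries"

full_tokens, targeted_tokens = context_cost(QUERY, DOCS, k=2)
```

On this toy corpus the gap is small because there are only four short documents. The ratio grows with corpus size, since the full context scales with the whole corpus while the retrieved context is bounded by k.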

🎤 Ewa Szyszka, DevRel Engineer at Qdrant

Ewa Szyszka is a DevRel Engineer at Qdrant, focused on technical education, developer community, and applied AI systems. She has worked across GenAI evaluations, multi-agent systems, AI gateways, observability, and RAG, with previous DevRel roles at orq.ai and Kilo Code. She has also organised large technical meetups and built tools across cloud, data, and AI workflows.

6:50 – 7:10 PM | Building Your Own Memory Layer for Coding Agents

Coding agents struggle with long-running context, repository-scale retrieval, and maintaining architectural intent across sessions.

This talk explores practical patterns for designing memory systems using vector search, hybrid retrieval, and structured specifications. Shashi will share lessons from building a spec-driven external memory architecture with Qdrant and SpecMem, along with practical guidance for developers building scalable memory layers for their own AI coding agents.
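
One common way to combine vector and keyword results in hybrid retrieval is reciprocal rank fusion (RRF). The sketch below is a generic illustration, not the architecture described in the talk. The document IDs and the two input rankings are hypothetical, and `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Each ranking is a list of document IDs, best first.
    # A document earns 1 / (k + rank) from every ranking it appears in.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense (vector) search and a sparse (keyword) search.
dense_hits = ["spec_auth", "spec_billing", "readme"]
sparse_hits = ["spec_billing", "changelog", "spec_auth"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Documents that rank well in both lists rise to the top, and the fusion needs only ranks, so the scores from the two retrievers never have to be put on a common scale.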

🎤 Shashi Jagtap, Founder at Superagentic AI

Shashi Jagtap is the founder of Superagentic AI, where he builds developer tools and frameworks for production-grade agentic AI systems. He previously spent nearly six years at Apple working on Xcode and developer tooling, and has close to 20 years of experience across DevEx, automation, mobile infrastructure, and agent engineering. He also organises the London Agentic AI community.

7:10 – 7:20 PM | Break 🍕🍺

Refuel, connect, and continue the conversation.

7:20 – 7:40 PM | Context, Environment, Knowledge: Three Layers on the Road to Autonomous Engineering

Model evolution is a catalyst, not the cure.

This talk covers memory patterns that actually work for long-running, multi-agent tasks: how to avoid drift, how to catch “fake done”, and how context, environment, and knowledge become building blocks on the road to a software factory.

🎤 Dmytro Yaroshenko, Applied AI Engineer at Factory AI

Dmytro Yaroshenko is an Applied AI Engineer at Factory AI, working on practical AI systems for autonomous software engineering. He previously worked at Contextual AI, Snowflake, and Deutsche Bank, with experience spanning applied AI, data engineering, solution architecture, and strategic enterprise AI adoption. His work focuses on moving AI systems from experimentation into reliable production workflows.

7:40 – 8:00 PM | Stop Retrieving. Start Declaring.

A codebase is already a memory system. Every file, folder structure, and architectural decision is context that agents consistently fail to read.

This talk makes the case that retrieval is often a workaround for missing structure, and introduces the declarative context architecture pattern: scoped context files, task blueprints, and a stateful memory harness that carries state across sessions and agent boundaries.

You’ll leave with a clear framework for when to declare versus when to retrieve, and a harness architecture you can build against immediately.
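
As a rough illustration of what scoped context files could look like in practice, the sketch below collects every context file on the path from the repository root down to the file an agent is about to edit. This is an assumption about the pattern, not the speaker's implementation. The `CONTEXT.md` filename, the in-memory `FILES` mapping, and the example paths are all hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical filename; real projects may use a different convention.
CONTEXT_FILENAME = "CONTEXT.md"


def scoped_context(target: str, files: dict[str, str]) -> list[str]:
    # Walk from the repository root down to the target's directory and
    # collect each context file found, broadest scope first.
    parts = PurePosixPath(target).parent.parts
    collected = []
    for depth in range(len(parts) + 1):
        candidate = str(PurePosixPath(*parts[:depth], CONTEXT_FILENAME))
        if candidate in files:
            collected.append(files[candidate])
    return collected


# Hypothetical repository contents, keyed by path relative to the repo root.
FILES = {
    "CONTEXT.md": "Monorepo; run tests with make test.",
    "services/payments/CONTEXT.md": "Payments code must be idempotent.",
    "services/search/CONTEXT.md": "Search owns the index schema.",
}

payments_context = scoped_context("services/payments/api.py", FILES)
```

The agent editing `services/payments/api.py` receives the root and payments notes but not the search notes, so the relevant context is selected by directory structure with no retrieval step.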

🎤 Talha Sheikh, AI Engineer at Checkout.com

Talha Sheikh is an AI Software Engineer at Checkout.com, where he works on scalable AI adoption, platform infrastructure, and production AI strategy. He previously worked as a software engineer at TrueLayer and American Express, building financial systems and backend services. Outside work, he is building Vector, a harness designed to help AI agents finish what they start.

8:00 – 8:35 PM | Panel Discussion + Audience Q&A

Join all speakers for an open discussion on memory, retrieval, search, coding agents, and the hidden infrastructure behind reliable AI systems.

Bring your questions.

8:35 – 9:00 PM | Networking

Continue the conversation with speakers, attendees, and other builders working on production AI systems.

9:00 PM onwards | Social

Optional post-event social nearby. Location TBC.


Who Should Attend

  • AI / ML engineers

  • Backend and data engineers

  • Engineers building RAG or agent systems

  • Engineers working on search, retrieval, memory, or evaluation

  • Technical founders and product builders

  • Anyone interested in production AI infrastructure

Please only register if you plan to attend, as capacity is limited.


Sponsored by Qdrant

Qdrant is a high-performance vector database and search engine for building AI applications with advanced retrieval, filtering, and similarity search.

MLOps Community London brings together engineers, researchers, founders, and practitioners building real-world machine learning and AI systems.

Location
Please register to see the exact location of this event.
London, United Kingdom