

Agents, judges, and legacy code.
AI agents are getting better at writing software.
That doesn’t mean the software is getting better.
As AI systems become more autonomous, engineering teams face a different set of problems: unreliable outputs, untestable workflows, evaluation bottlenecks, and legacy systems that weren’t designed for any of this.
At this Future Form session, we’re looking at what happens when agentic systems collide with production engineering reality.
Speakers:
Ning Lu, VP, Aladdin Financial Engineering, BlackRock
Ning will explore the evolving landscape of LLM evaluation. From static benchmarks to LLM-as-judge systems and agent trajectory review, he’ll break down how engineering teams can evaluate systems that don’t behave deterministically.
Katie Roberts, Technical Director & AI Native Specialist, Nearform
Katie will show how AI-native engineering can be applied inside mature brownfield systems. Using patterns such as Strangler Fig, she’ll explore how to isolate risk, generate tests for undocumented code, and evolve legacy architectures without full rewrites.
If you’re working with AI agents, legacy systems, or production engineering workflows, this session is for you.
Pizza, drinks, and practical AI insights included.