

From Reasoning to Real Time: Infrastructure for Modern Multimodal AI
Summary
As models become more agentic, multimodal, and context-hungry, latency and throughput increasingly determine what systems are actually buildable in production. From long-horizon “deep agents” that generate massive reasoning chains to expressive, interactive voice models that demand real-time responsiveness, next-gen AI workflows push inference infrastructure to its limits.
The first talk explores how accelerating deep-agent pipelines turns complex planning and multi-step execution from slow prototypes into deployable systems, while the second demonstrates how low-latency, emotionally rich voice generation unlocks fluid, multi-turn interaction. Together, they show how advances in speed and efficiency are reshaping what ML engineers can ship today.
Schedule
18:00 Doors open
18:30 - 19:00 Going Deeper, Going Faster: How SambaNova Unlocks the Potential of Deep Agents (Kwasi Ankomah)
19:00 - 19:30 Next-Gen Emotional Voice AI in Real Time with Hume AI and SambaNova (Masahiko Nakano)
19:30 - 20:00 TBD (Stefano Massaroli)
20:00 - 21:00 Networking
21:00 Event ends
Talks
Talk 1: Going Deeper, Going Faster: How SambaNova Unlocks the Potential of Deep Agents
Speaker: Kwasi Ankomah
Abstract: The AI industry is moving beyond simple tool-calling agents toward "deep agents": systems capable of complex planning, multi-step execution, and sustained work across extended time horizons. Used in applications like Claude Code, Manus, and Deep Research, deep agents combine planning tools, sub-agent orchestration, file system access, and sophisticated prompts to tackle tasks that shallow agents simply cannot handle. But deep agents are token-hungry. They spawn sub-agents, maintain context across sessions, and generate extensive reasoning chains, which makes inference speed and efficiency critical bottlenecks. This talk explores how SambaNova's fast and efficient compute platform transforms what's possible with deep agents. We'll cover practical patterns for building production-grade deep agents and demonstrate how ultra-fast inference turns theoretical capabilities into real-world applications.
Bio: Kwasi Ankomah is a Lead AI Architect at SambaNova Systems, where he leads solution efforts on generative AI, large language models, and agentic AI applications. With a background spanning the UK’s Financial Conduct Authority and the consulting and financial sectors, he brings deep expertise in applying AI to complex, regulated domains. He holds an MS in Data Science. Kwasi is passionate about AI leadership, diversity in tech, and responsible AI development. He’s a recognized voice in the AI infrastructure space, having appeared on podcasts like “The Neuron” and spoken at events including the AI Summit London, where he has discussed why inference speed is the hidden bottleneck in scaling AI agents.
Talk 2: Next-Gen Emotional Voice AI in Real Time with Hume AI and SambaNova
Speaker: Masahiko Nakano
Abstract: Voice is becoming a key interface for next-generation AI, enabling more natural and emotionally expressive interactions. Hume AI, a New York–based startup, is leading this shift with two advanced speech models: Octave, an emotionally rich text-to-speech system, and EVI, a high-fidelity speech-to-speech model that transforms vocal style while preserving intent. These models support multilingual scenarios, including Japanese, and open new possibilities for enterprise applications. At the same time, voice AI faces a common challenge: interactive, multi-turn voice agents need real-time, low-latency performance. This talk will show how SambaNova’s accelerated platform helps meet these requirements and unlocks the full potential of Hume AI’s models, with a live demo of expressive, real-time voice generation.
Bio: Dr. Masahiko Nakano is a Principal Solutions Engineer at SambaNova, supporting Japanese enterprises in adopting advanced AI systems. He previously worked on digital transformation in the Japanese chemical industry and has a background in quantum computational chemistry.
Talk 3: TBD
Speaker: Stefano Massaroli
Abstract: TBD
Bio: Stefano Massaroli is the Co-founder and President of Radical Numerics Inc. and a Research Scientist at RIKEN’s Deep Learning Theory Team in Tokyo. Previously, he was a Founding Scientist at Liquid AI, where he led the launch and growth of Liquid AI Japan, the company’s first subsidiary, from inception. He also completed a postdoctoral fellowship at Mila, advised by Yoshua Bengio. Stefano co-invented hybrid convolution language models and helped pioneer neural differential equations. He earned his Master’s and PhD from the University of Tokyo.
Supporters
Tokyo AI (TAI) information
Tokyo AI (TAI) is the biggest AI community in Japan, with 2,400+ members mainly based in Tokyo (engineers, researchers, investors, product managers, and corporate innovation managers).
DEEPCORE information
DEEPCORE is a VC firm supporting AI Salon Tokyo. It operates a fund for seed and early-stage startups, as well as KERNEL, a community supporting early-stage entrepreneurs.