Presented by
Tokyo AI (TAI)

From Reasoning to Real Time: Infrastructure for Modern Multimodal AI

Register to See Address
Bunkyo City, Tokyo
Registration
Approval Required
Your registration is subject to host approval.
About Event

Summary

As models become more agentic, multimodal, and context-hungry, latency and throughput increasingly determine what systems are actually buildable in production. From long-horizon “deep agents” that generate massive reasoning chains to expressive, interactive voice models that demand real-time responsiveness, next-gen AI workflows push inference infrastructure to its limits.

The first talk explores how accelerating deep-agent pipelines turns complex planning and multi-step execution from slow prototypes into deployable systems, while the second demonstrates how low-latency, emotionally rich voice generation unlocks fluid, multi-turn interaction. Together, they show how advances in speed and efficiency are reshaping what ML engineers can ship today.

Schedule

18:00 Doors open

18:30 - 19:00 Going Deeper, Going Faster: How SambaNova Unlocks the Potential of Deep Agents (Kwasi Ankomah)

19:00 - 19:30 Next-Gen Emotional Voice AI in Real Time with Hume AI and SambaNova (Masahiko Nakano)

19:30 - 20:00 TBD (Stefano Massaroli)

20:00 - 21:00 Networking

21:00 Event ends

Talks

Talk 1: Going Deeper, Going Faster: How SambaNova Unlocks the Potential of Deep Agents

Speaker: Kwasi Ankomah

Abstract: The AI industry is moving beyond simple tool-calling agents toward "deep agents": systems capable of complex planning, multi-step execution, and sustained work across extended time horizons. Used in applications like Claude Code, Manus, and Deep Research, deep agents combine planning tools, sub-agent orchestration, file system access, and sophisticated prompts to tackle tasks that shallow agents simply cannot handle. But deep agents are token-hungry. They spawn sub-agents, maintain context across sessions, and generate extensive reasoning chains, which makes inference speed and efficiency critical bottlenecks. This talk explores how SambaNova's blazing-fast and efficient compute platform transforms what's possible with deep agents. We'll cover practical patterns for building production-grade deep agents and demonstrate how ultra-fast inference turns theoretical capabilities into real-world applications.

Bio: Kwasi Ankomah is a Lead AI Architect at SambaNova Systems, where he leads solution efforts on generative AI, large language models, and agentic AI applications. With a background spanning the UK’s Financial Conduct Authority and the consulting and financial sectors, he brings deep expertise in applying AI to complex, regulated domains. He holds an MS in Data Science. Kwasi is passionate about AI leadership, diversity in tech, and responsible AI development. He’s a recognized voice in the AI infrastructure space, having appeared on podcasts like “The Neuron” and spoken at events including the AI Summit London, where he discussed why inference speed is the hidden bottleneck in scaling AI agents.

Talk 2: Next-Gen Emotional Voice AI in Real Time with Hume AI and SambaNova

Speaker: Masahiko Nakano

Abstract: Voice is becoming a key interface for next-generation AI, enabling more natural and emotionally expressive interactions. Hume AI, a New York–based startup, is leading this shift with two advanced speech models: Octave, an emotionally rich text-to-speech system, and EVI, a high-fidelity speech-to-speech model that transforms vocal style while preserving intent. These models support multilingual scenarios, including Japanese, and open new possibilities for enterprise applications. At the same time, voice AI faces a common challenge: the need for real-time, low-latency performance to support interactive, multi-turn voice agents. This talk will show how SambaNova’s accelerated platform helps enable these requirements and unlocks the full potential of Hume AI’s models, with a live demo of expressive, real-time voice generation.

Bio: Dr. Masahiko Nakano is a Principal Solutions Engineer at SambaNova, supporting Japanese enterprises in adopting advanced AI systems. He previously worked on digital transformation in the Japanese chemical industry and has a background in quantum computational chemistry.

Talk 3: TBD

Speaker: Stefano Massaroli

Abstract: TBD

Bio: Stefano Massaroli is the Co-founder and President of Radical Numerics Inc. and a Research Scientist at RIKEN’s Deep Learning Theory Team in Tokyo. Previously, he was a Founding Scientist at Liquid AI, where he led the launch and growth of Liquid AI Japan, the company’s first subsidiary, from inception. He also completed a postdoctoral fellowship at Mila, advised by Yoshua Bengio. Stefano co-invented hybrid convolution language models and helped pioneer neural differential equations. He earned his Master’s and PhD from the University of Tokyo.

Supporters

Tokyo AI (TAI) information

Tokyo AI (TAI) is the biggest AI community in Japan, with 2,400+ members mainly based in Tokyo (engineers, researchers, investors, product managers, and corporate innovation managers).

DEEPCORE information

DEEPCORE is a VC firm supporting AI Salon Tokyo. They operate a fund for seed and early-stage startups and KERNEL, a community supporting early entrepreneurs.
