


TAI AMA #12 - AI x Voice
Summary
Voice AI is rapidly advancing, transforming how we interact with technology through speech. Yet, building systems that can truly understand, respond, and converse like humans remains a complex challenge. This event brings together researchers, engineers, and industry leaders to explore the current state of Voice AI, the roadblocks that remain, and the innovations shaping its future.
Through three focused talks, we will look at the technical hurdles in achieving natural, reliable voice assistants, the possibilities and limitations of open-ended conversational chatbots, and the latest breakthroughs in generating lifelike, emotionally expressive synthetic voices.
Schedule
18:00 Doors open
18:30 - 18:40 Community and sponsor introductions
18:45 - 19:10 Talk 1: What's yet to be solved in Voice AI
19:15 - 19:40 Talk 2: Towards Open-Ended Conversational Chatbots
19:45 - 20:10 Talk 3: AI Voice That Perfects Human Likeness
20:10 - 21:00 Networking
21:00 Event ends
Talks
Talk 1: What's yet to be solved in Voice AI
Speaker: Haris Gulzar (AI Researcher, NTT)
Abstract: While voice recognition and voice synthesis have achieved impressive performance, many challenges remain before a voice AI system can serve as a reliable, natural assistant. Balancing latency with reasoning ability, maintaining long-term memory, and handling noisy scenarios are a few of the challenges that we will shed some light on in this presentation.
Bio: Haris has been working in the Voice AI domain in Tokyo for over 5 years. He and his team at NTT are pushing the boundaries of speech science. As an AI researcher at NTT, Haris has gained experience in both voice research and product prototyping. Recently, he has been tackling the challenge of building AI agents specifically for voice applications.
Talk 2: Towards Open-Ended Conversational Chatbots
Speaker: Francisco Soares (Founder, Furious Green)
Abstract: Most voice agents today are designed for narrow tasks: device control, setting reminders, or customer support automation. But what about small talk, the seemingly trivial conversations that make us human? In this talk, I will explore the current state of open-ended conversational chatbots, with a focus on the unique challenges of building systems that can sustain natural dialogue in Japanese.
Bio: Francisco Soares is the founder of Furious Green, an AI and Technology training company based in Yokohama. With over 15 years of experience as a software engineer and a background in NLP, he has worked at companies including Google and CyberAgent. In addition to leading training programs through Furious Green, he also personally advises startups and companies on building AI-driven products and strategies.
Talk 3: AI Voice That Perfects Human Likeness
Speaker: Takashi Hiraiwa (GTM, ElevenLabs)
Abstract: ElevenLabs is an AI voice service that provides developers with tools to generate realistic, human-like speech from text. Takashi will explain why developers should use it to easily integrate high-quality, emotionally nuanced, and multilingual voice capabilities into their applications, enhancing user experience and accessibility.
Bio: Takashi Hiraiwa joined ElevenLabs Japan as the fourth employee. In his previous role at DataRobot, he was responsible for launching the BDR (Business Development Representative) team and supporting the marketing team. With over two years of experience in the AI field, he is now focused on pioneering the market for ElevenLabs' hyper-realistic AI services.
Organizers
Ilya Kulyatin: Fintech and AI entrepreneur with work and academic experience in the US, Netherlands, Singapore, UK, and Japan, with an MSc in Machine Learning from UCL.
Haris Gulzar: AI researcher at NTT who has been working in the Voice AI domain in Tokyo for over 5 years, most recently on building AI agents specifically for voice applications.
Our Community
Tokyo AI (TAI)
TAI is the biggest AI community in Japan, with 2,900+ members mainly based in Tokyo (engineers, researchers, investors, product managers, and corporate innovation managers).
Web: https://www.tokyoai.jp/
Event Supporters
DEEPCORE is a VC firm supporting AI Salon Tokyo. It operates a fund for seed and early-stage startups, as well as KERNEL, a community supporting early-stage entrepreneurs.