Presented by
Tokyo AI (TAI)
Hosted By

High-Performance AI Inference: Systems, Caching, and Distributed Execution

Register to See Address
Tokyo
Registration
Approval Required
Your registration is subject to host approval.
Welcome! To join the event, please register below.
About Event

This session dives into the low-level systems challenges behind running large language models efficiently in production. Rather than focusing on model architecture or prompts, the talks explore how inference performance is shaped by infrastructure decisions: cache design, memory movement, distributed execution, and latency optimization at scale.


Agenda

18:00 Doors open

18:30 - 20:00 TBD

20:00 - 21:00 Networking, food, and drinks

21:00 Doors close

Organizers

Ilya Kulyatin: Fintech and AI entrepreneur with professional and academic experience in the US, the Netherlands, Singapore, the UK, and Japan. He holds an MSc in Machine Learning from UCL.

Supporters

Tokyo AI (TAI) is the biggest AI community in Japan, with 4,000+ members mainly based in Tokyo (engineers, researchers, investors, product managers, and corporate innovation managers).

Value Create is a management advisory and corporate value design firm offering services such as business consulting, education, corporate communications, and investment support to help companies and individuals unlock their full potential and drive sustainable growth.

Privacy Policy

We will process your email address for the purposes of event-related communications and ongoing newsletter communications. You may unsubscribe from the newsletter at any time. Further details on how we process personal data are available in our Privacy Policy.

Location
Please register to see the exact location of this event.
Tokyo