The Toronto LLM Meetup Group - Rethinking Tokenization
Welcome to the 4th session of the 4th edition of the Toronto LLM Meetup Group. We really appreciate the growing interest in our events and will continue to ensure equitable access for all participants. This time, we are rethinking tokenization and exploring emerging research in this direction.
Even in an era of rapid-fire progress in language models, tokenization has long resisted disruption. We believe we are finally seeing some progress on this front, and are dedicating this session to covering promising new techniques. We will also feature some fascinating demos that will make you rethink the role and importance of tokenization and vocabulary design in language models.
Special thanks to Georgian for hosting and Hudson Labs for their support and resources.
📖 Papers & Resources
You do not need to read all of these papers before the session, but browsing through them may help you participate more actively in the discussion.
https://arxiv.org/pdf/2507.07955 - Dynamic Chunking
https://arxiv.org/abs/2412.09871 - Byte Latent Transformer
https://arxiv.org/abs/2507.12720 - FLEXITOKENS
⏱️ Timetable
17:30 - 18:15: Arrivals & Networking
18:15 - 18:40: Speaker Session #1
18:40 - 19:00: Discussion/Q&A
19:00 - 19:10: Short Break
19:10 - 19:35: Speaker Session #2
19:35 - 20:00: Discussion/Q&A
20:00 - 20:45: Networking & Food
📣 Speaker Bios
Suhas Pai is an NLP researcher and the co-founder/CTO of Hudson Labs, a Toronto-based Y Combinator-backed startup. He authored Designing Large Language Model Applications (O’Reilly Media) and led/contributed to the development of several open-source LLMs. Suhas has chaired the Toronto Machine Learning Summit since 2021 and regularly speaks at AI conferences and seminars on NLP research.
More coming soon...
👋 About Us
The Toronto LLM Meetup Group was founded by Suhas Pai, CTO and co-founder of Hudson Labs, and Lily Ren, Senior PM, B2B AI at Telus. The aim of this group is to facilitate deep dives into LLM research topics, harnessing the skills of Toronto's vibrant AI ecosystem and promoting intellectual rigor. To this end, for each session we assemble a team of top Toronto AI scientists and practitioners to conduct an in-depth study of an emerging topic that is expected to go mainstream within about six months. The team presents its findings, along with jumping-off points for future research, at a meetup session hosted by one of Toronto's top AI companies. Eventually, we aim to spin off community open-source research projects this way.
This session is co-organized with Georgian. Georgian is a growth-stage investor that builds software to help its portfolio companies scale faster. Georgian also organizes ML events through Transferred Learnings, its technical community of AI builders and founders. The community's content focuses on Applied AI, and its initiatives include workshops, bootcamps, virtual educational sessions and in-person events.
📧 Mailing List
Thank you so much for your interest. We’re thrilled to see so many of you sign up for this event! We’ll be collecting emails to invite you to future editions of the meetup, but if you’d prefer not to be on the mailing list, just let us know.
