Machine Learning Singapore
Meeting Calendar for the Machine Learning Singapore meet-up group (the largest AI group in Singapore)
157 Going

MLSG : LLMs in production : Inference, RAG, and video models

About Event
  • LOCATION : Rakuten Offices, near Raffles Place

  • ENTRY : On the day of the event, we'll add the building's access link here. Each person needs to fill it in to get an individual QR code for going up to the event space.

  • FOOD : None (i.e. please eat before/after the event)

Talks:

"Efficient LLM Fine-tuning for Semantic Search" - Dongzhe Wang

In this talk, Dongzhe will explore how large language models (LLMs) can be fine-tuned efficiently to power semantic search systems, using parameter-efficient fine-tuning techniques. Attendees will learn how LLMs are being applied to improve search in practice, along with the challenges of balancing quality and latency in real-world applications. Dongzhe is a Principal Research Scientist at Rakuten Asia. He obtained a Ph.D. from Nanyang Technological University before working at Shopee and Zhuiyi.

"Efficient Inference and Serving of LLMs and Large Video-Generative Models" - Jonathan Zhao

Jonathan will explore techniques for efficiently serving LLMs and large video-generative models. The session will cover methods for optimizing inference performance alongside system-level strategies for scalable deployment, highlighting the key practical differences between serving the two model types. Attendees will gain insights into approaches for improving inference and serving, including how to balance quality against latency and other real-world challenges. Jonathan is a Senior Software Engineer at Rakuten, having previously done AI product development at a startup.

"IPhO Gold using Agentic Gemini" - Martin Andrews

A recent paper showed that Gemini 2.5 Pro, when driven in an agentic loop, could achieve Gold medal standard on the International Physics Olympiad (IPhO) theory questions. This comes hot on the heels of Google's internal version getting to Gold on the International Mathematical Olympiad (IMO). Martin will briefly talk about the IPhO, explain how the agentic system works, and show some of the actual questions (so you can see what's involved).


Talks will start at 7:00pm and end at around 8:45pm, at which point people normally come up to the front for a bit of a chat with each other and the speakers.


HELP WANTED
MLSG needs a few volunteers to help with logistics (like checking people into the event). If you're willing to help, and want to give back to the community, please email Martin at reddragon.ai

Location
138 Market St, #20-01 CapitaGreen, Singapore 048946