Presented by

Machine Learning Singapore

Meeting Calendar for the Machine Learning Singapore meet-up group (the largest AI group in Singapore)

Hosted By

223 Went

MLSG : LLMs in production : Inference, RAG, and video models

Name: MLSG : LLMs in production : Inference, RAG, and video models
Start: 2025-09-25T18:45:00.000+08:00
End: 2025-09-25T21:15:00.000+08:00
Location: 138 Market St, #20-01 CapitaGreen, Singapore 048946

Past Event

About Event

LOCATION : Rakuten Offices, near Raffles Place
ENTRY : Before arriving, please fill in this linked form to get a Building Access QR code via SMS : LINK TO BUILDING PASS (NB: Access Level=20, Purpose=Meeting)
FOOD : None (i.e. please eat before/after the event)

Talks:

"Efficient LLM Fine-tuning for Semantic Search" - Dongzhe Wang

In this talk, Dongzhe will explore how large language models (LLMs) can be efficiently fine-tuned to power semantic search systems, using parameter-efficient fine-tuning techniques. Attendees will learn about how LLMs are being applied to improve search in practice, along with the challenges of balancing quality and latency in real-world applications. Dongzhe is a Principal Research Scientist at Rakuten Asia, and obtained a Ph.D. from Nanyang Technological University before working at Shopee and Zhuiyi.

"Efficient Inference and Serving of LLMs and Large Video-Generative Models" - Jonathan Zhao

Jonathan will explore techniques for efficiently serving LLMs and large video-generative models. The session will cover methods to optimize inference performance alongside system-level strategies for scalable deployment, highlighting key differences in serving the two model types in practice. Attendees will gain insights into approaches for improving inference and serving, including how to balance quality against latency and handle other real-world challenges. Jonathan is a Senior Software Engineer at Rakuten, having previously done AI product development at a startup.

"IPhO Gold using Agentic Gemini" - Martin Andrews

A recent paper showed that Gemini 2.5 Pro, when driven in an agentic loop, could achieve gold-medal standard on the International Physics Olympiad (IPhO) theory questions. This comes hot on the heels of Google's internal version getting to Gold on the IMO (Mathematics). Martin will briefly talk about the IPhO, how the agentic system works, and show some of the actual questions (so you can see what's involved).

Talks will start at 7:00pm and end at around 8:45pm, at which point people normally come up to the front for a bit of a chat with each other and with the speakers.

HELP WANTED
MLSG needs a few volunteers to help with logistics (like checking people into the event). If you're willing to help, and want to give back to the community, please email Martin at reddragon.ai
