Presented by
Perplexity
Private Event

Perplexity Tech Talk: Under the Hood of LLM Inference

Registration
Approval Required
Your registration is subject to approval by the host.
About Event

Perplexity is a search and answer engine that leverages LLMs to provide high-quality, citation-backed answers.
The AI Inference team within the company is responsible for serving the models behind the product, ranging from single-GPU embedding models to multi-node sparse Mixture-of-Experts language models.

This talk gives an inside look at the in-house runtime behind inference at Perplexity, with a particular focus on efficiently serving some of the largest available open-source models.

About the speaker:
Nandor Licker is an AI Inference Engineer at Perplexity, focusing on LLM runtime implementation and GPU performance optimization.

The talk will take place in room FW26 in the Department at 1.05pm, with an area outside for food and discussion afterwards. Lunch will be served on site. Tickets are free, but registration is required.

Location
University of Cambridge
The Old Schools, Trinity Ln, Cambridge CB2 1TN, UK