Presented by

Verda (formerly DataCrunch)

The European provider of sovereign, scalable GPU infrastructure – pushing the boundaries of efficient AI.

PyTorch x GPU MODE Hackathon: No gradient descent. Only ascent.

Paris, France

Past Event

About Event

Join us to wrap up the PyTorch Conference with a GPU MODE x PyTorch IRL hackathon.

This is an intimate yet advanced day-long hackathon alongside researchers and engineers working at the bleeding edge of ML systems.

Here's what you can expect:

Two tracks on distributed training and inference optimization
Access to a B300 cluster from Verda and H200 instances from Sesterce
Cloud credits as prizes, including 48-hour access to a GB300 NVL72 rack
Talks from PyTorch (Helion), vLLM, Prime Intellect, ISTA, Sesterce, and SemiAnalysis
Food and refreshments

More details on the tracks, teams, and prizes below ⬇️

Schedule

9:30 - Doors open
10:00 - Opening words + tasks revealed
10:10 - Tech talks
- 10:10 - Will Feng, PyTorch, Helion: A High-level DSL for Kernel Authoring
- 10:25 - Tyler Michael Smith, vLLM: Large-scale inference 101
- 10:40 - Matej Sirovatka, Prime Intellect: Reinforcement learning at scale
- 10:55 - Erik Schultheis, ISTA: Low-precision formats
11:10 - Team formation
11:30 - Tasks start
13:30
- Lunch
- Fireside chat: Power is the new compute
  - Youssef El Manssouri, Sesterce
  - Jeremie Eliahou Ontiveros, SemiAnalysis
19:30 - Tasks end & dinner
20:00 - Closing ceremony
23:00 - Doors close

Tracks

1st track: LLM training on B300 cluster by Verda

Task: Pre-train an LLM from scratch in a limited time on a B300 cluster. You will participate as a team, trying to make optimal use of 360 PFLOP/s (BF16). Will this require asynchronous optimizer steps, large-batch training, models much larger than Chinchilla-optimal, or something else entirely?
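To get a feel for what "Chinchilla-optimal" means at this scale, here is a rough back-of-the-envelope sketch. It assumes the common rules of thumb that training costs about 6 FLOPs per parameter per token and that a compute-optimal run uses about 20 tokens per parameter. The one-hour time slot and the 40% utilization are hypothetical placeholders, not figures from the organizers, and the function name is made up for illustration.

```python
import math


def chinchilla_estimate(peak_flops, hours, mfu=0.4, tokens_per_param=20.0):
    """Estimate a compute-optimal model size for a fixed time budget.

    Assumptions (rules of thumb, not event rules):
      - training cost C ~= 6 * N * D FLOPs (N params, D tokens)
      - compute-optimal data D ~= tokens_per_param * N
      - mfu is the fraction of peak FLOP/s actually achieved (hypothetical)
    """
    # Total useful FLOPs available in the time slot.
    budget = peak_flops * mfu * hours * 3600
    # Solve C = 6 * N * (tokens_per_param * N) for N.
    n_params = math.sqrt(budget / (6 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return budget, n_params, n_tokens


if __name__ == "__main__":
    # 360 PFLOP/s (BF16) from the task description; 1 hour is a placeholder.
    budget, n_params, n_tokens = chinchilla_estimate(360e15, hours=1.0)
    print(f"budget: {budget:.3e} FLOPs")
    print(f"params: {n_params / 1e9:.2f} B")
    print(f"tokens: {n_tokens / 1e9:.1f} B")
```

Under these placeholder numbers the estimate lands at roughly 2B parameters. Changing `hours` or `mfu` shifts the result with the square root of the budget, which is one way to reason about whether a larger, undertrained model is worth trying.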

Compute: Your team will get a B300 node for development and a set time on a B300 cluster.

Teams:

Come with a pre-formed team or get matched to a team on-site
The expected team sizes are 2-4

Prizes:

🥇 1st place: 48 hours on GB300 NVL72 + 2,500 EUR in cloud credits
🥈 2nd place: 2,000 EUR in cloud credits
🥉 3rd place: 1,500 EUR in cloud credits

2nd track: LLM inference on H200 GPUs by Sesterce

Task: Compete as a team on a leaderboard for the fastest inference for a fixed model.

Compute: Your team will get access to 4x H200s for the duration of the hackathon.

Teams: Similarly, you can come with a pre-formed team or get matched on-site.

Prizes:

🥇 1st place: 5,000 EUR in cloud credits
🥈 2nd place: 2,000 EUR in cloud credits
🥉 3rd place: 1,500 EUR in cloud credits

Lightning talks

Speakers, topics, and times to be announced.
