

NVIDIA Nemotron™ 3 Super – Workshop
This hands-on workshop dives deep into NVIDIA’s newly released Nemotron 3 Super, a powerful 120B open hybrid Mamba-Transformer Mixture-of-Experts (MoE) model built for complex reasoning, long-context analysis, and autonomous problem-solving. You will explore key architectural innovations like Latent MoE and Multi-Token Prediction, and learn how to build, optimize, and deploy real-world AI systems using NVIDIA’s open weights, datasets, and deployment recipes.
WHO SHOULD ATTEND
AI/ML Engineers & Researchers: Professionals building advanced LLMs and specialized reasoning models.
Generative AI Developers: Developers creating complex, multi-agent AI applications (such as software development or cybersecurity triaging agents).
Data Scientists: Practitioners interested in state-of-the-art training techniques like Native NVFP4 pretraining and multi-environment reinforcement learning (RL).
DevOps & MLOps Engineers: Professionals looking to deploy high-throughput, low-latency AI models using vLLM, SGLang, or NVIDIA TensorRT-LLM.
AGENDA
10:00 AM – 10:30 AM: NVIDIA Nemotron strategy – open datasets, weights, frameworks, and recipes
10:30 AM – 11:00 AM: Deep dive into Nemotron 3 Super architecture design
11:00 AM – 12:15 PM: Efficient serving and deployment using vLLM, SGLang, and TensorRT-LLM Cookbooks
12:15 PM – 1:30 PM: Building agentic workflows using Nemotron models
1:30 PM – 2:15 PM: Lunch
2:15 PM – 2:30 PM: Talk by Partner
2:30 PM – 2:45 PM: Paper/project presentations 1 & 2
2:45 PM – 3:30 PM: Talk by Partner
3:30 PM: Networking & Wrap-Up
KEY LEARNINGS
By the end of the workshop, attendees will be able to:
Understand Advanced Architectures: Grasp the mechanics behind the hybrid Mamba-Transformer backbone, Latent MoE (which activates 4x as many experts at the same compute cost), and Multi-Token Prediction (MTP) for built-in speculative decoding.
Navigate the Training Pipeline: Understand how the model achieves stability and accuracy through Native NVFP4 pretraining and trajectory-based reinforcement learning across diverse environments (using NeMo Gym and NeMo RL).
Implement the "Super + Nano" Pattern: Learn how to architect multi-agent workflows that smartly route simple tasks to Nemotron 3 Nano and complex planning/reasoning tasks to Nemotron 3 Super.
Deploy and Fine-Tune: Utilize NVIDIA’s open resources and deployment cookbooks (vLLM, SGLang, TensorRT-LLM) to customize and run the model on your own infrastructure.
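The "Super + Nano" pattern above can be sketched as a simple router that dispatches each task to the cheaper or the stronger model. The model IDs and the keyword-based complexity heuristic below are illustrative assumptions for the workshop, not an official NVIDIA API:

```python
# Minimal sketch of the "Super + Nano" routing pattern.
# The model IDs and the complexity heuristic are illustrative
# assumptions, not official NVIDIA identifiers or APIs.

NANO = "nvidia/nemotron-3-nano"    # hypothetical ID: fast model for simple tasks
SUPER = "nvidia/nemotron-3-super"  # hypothetical ID: reasoning model for complex tasks

# Keywords that hint a task needs multi-step planning or deep reasoning.
COMPLEX_HINTS = ("plan", "debug", "prove", "analyze", "refactor", "multi-step")

def route(task: str) -> str:
    """Return the model ID this toy router would dispatch the task to."""
    text = task.lower()
    # Long prompts or planning/reasoning keywords go to Super;
    # everything else goes to the latency-optimized Nano.
    if any(hint in text for hint in COMPLEX_HINTS) or len(text.split()) > 40:
        return SUPER
    return NANO

if __name__ == "__main__":
    print(route("Summarize this paragraph in one line."))           # -> nano
    print(route("Plan a multi-step refactor of our auth module."))  # -> super
```

In a real deployment, each branch would call an inference endpoint (for example, a vLLM or TensorRT-LLM server) instead of returning a model ID, and the heuristic would typically be replaced by a small classifier or the agent framework's own routing logic.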
SPEAKER
Megh Makwana - Manager, Applied Gen AI Solution Engineering, NVIDIA
Megh’s work centers on building foundational AI models, scaling GPU workloads efficiently, and supporting CSPs in developing AI platforms with NVIDIA AI Enterprise.
LinkedIn: https://www.linkedin.com/in/megh-makwana-4a378a147
PREREQUISITES
Languages/tools - Python
Sign up at build.nvidia.com and create your account
Bring your fully charged laptop and its charger to this workshop. Preferably bring a personal laptop, as some company laptops restrict access to certain websites or AI tools.
FEE
This meetup is FREE to attend, but seats are limited and invite-only. Prior registration is required to receive an invitation, as described below.
REGISTRATION
To register, please do BOTH of the following:
Fill in your details in this Luma form by clicking "Request To Join"
Download the Deep Tech Stars app here: https://www.deeptechstars.com/aboutUs/app (optional referral code for signing up: DTSNVID10)
You must follow the above procedure and receive an official invite. Please note that we will not be able to accommodate walk-ins at the event.
CONTEST
We have a contest at this event! The top 5 projects, top 5 blog posts, and top 5 papers on NVIDIA Nemotron 3 will each receive some cool NVIDIA swag, so be sure to participate and share what you learn at the event as a project, blog post, or paper. Submit by emailing the link to us at [email protected].
Early bird prizes: The first 5 submissions win early bird prizes! If you have already done some work on NVIDIA Nemotron 3, you can submit your project, blog post, or paper directly in the application form or by emailing it to us at [email protected].
USEFUL LINKS
NVIDIA Nemotron – NVIDIA Nemotron Models – Learn More
NVIDIA Nemotron 3 Super Model Blog – Introducing Nemotron 3 Super: An Open Hybrid Mamba-Transformer MoE for Agentic Reasoning – Learn More
NVIDIA Nemotron 3 – GitHub Repo – NVIDIA-Nemotron-3-Super-120B-A12B-FP8 – Learn More
NVIDIA Nemotron 3 – Tech Blog – New NVIDIA Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI – Learn More
NVIDIA Nemotron 3 – Nemotron 3 Super Tutorial: Multi-Token Prediction, Latent MoE, Perplexity and OpenCode Integration – Learn More
Please reach out to Nihal at 9663374431 or Talib at 7977757472 if you need any clarifications or face any issues with registration. We look forward to seeing many of you there!
Our thanks to NVIDIA and FAIR (Folks in AI Research) for collaborating with us for this session, and special thanks to our Venue Partners Cactus Communications for hosting us.