

HydPy is hosting an NVIDIA Nemotron 3 Super Workshop - India
Mark your calendars!
This interactive workshop dives deep into Nemotron 3 Super, NVIDIA’s newly released 120B (12B active-parameter) open hybrid Mamba-Transformer Mixture-of-Experts (MoE) model. Designed to solve complex, dense technical problems autonomously, Nemotron 3 Super handles long-context analysis, precise reasoning, and coding while remaining computationally efficient. Throughout this course, participants will explore the model's architectural innovations, such as Latent MoE, Multi-Token Prediction (MTP), and its hybrid backbone, and learn how to customize, optimize, and deploy it using NVIDIA's fully open weights, datasets, and recipes.
Full agenda details (TBA)
Agenda
8:30 AM – 9:30 AM: Registration
9:30 AM – 9:45 AM: HydPy Introduction
9:45 AM – 10:15 AM: NVIDIA Nemotron Strategy: Open Datasets, Weights, Frameworks and Recipes
Speaker: Megh Makwana, Manager, Applied Gen AI Solution Engineering, NVIDIA
Megh’s work converges on building foundational AI models, scaling GPU workloads efficiently, and supporting CSPs in developing AI platforms with NVIDIA AI Enterprise.
10:15 AM – 10:45 AM: Deep-Dive into Nemotron 3 Super Architecture Design
10:45 AM – 11:30 AM: Efficient Serving & Deployment of Nemotron Models using vLLM, SGLang and TensorRT-LLM Cookbooks
11:30 AM – 11:45 AM: Break
11:45 AM – 1:00 PM: Building Agentic Workflows using Nemotron Models
1:00 PM – 1:45 PM: Lunch
1:45 PM – 2:15 PM: Use Case/Demo by Customer/Partner
2:15 PM – 3:00 PM: Talk from HydPy
3:00 PM – 4:00 PM: Paper Presentation on NVIDIA Nemotron 3 Super
4:00 PM – 4:15 PM: Snack Break
4:15 PM – 4:30 PM: Project & Blog Contest Announcement by HydPy
4:30 PM Onwards: Engagement Activities with HydPy, Winners Announcement, Open Discussion, Networking
Prerequisites:
Languages/tools: Python
Frameworks: PyTorch, TensorRT-LLM, Triton Inference Server, SGLang, vLLM
Sign up at build.nvidia.com and create your account.
Bring your laptop to this workshop.
Laptop with internet access (ideal minimum: 5 Mbps download / 1–2 Mbps upload) to ensure consistent access to the lab.
Want to Showcase Your NVIDIA Nemotron 3 Super Work?
Are you interested in presenting your work, project, or paper related to NVIDIA Nemotron 3 Super?
Submit your details by 1st May. Selected participants will get an opportunity to present, and the best presentation will be rewarded on the event day.
Apply here: https://forms.gle/7AGSHWr4XiC66sxo7
Note: By filling out this form, you consent to sharing your details with the host company, IIIT-H, and NVIDIA for the purpose of event coordination and communication. You will receive an invite from Luma 1–2 days before the event to facilitate the check-in process.
Venue:
International Institute of Information Technology, Hyderabad
Professor C. R. Rao Road, Gachibowli, Hyderabad – 500032
Want to know more about NVIDIA Nemotron 3 Super?
Explore the following resources to understand the architecture, capabilities, and real-world applications of NVIDIA Nemotron models:
NVIDIA Nemotron: NVIDIA Nemotron Models
NVIDIA Nemotron 3 Super Model Blog: Introducing Nemotron 3 Super: An Open Hybrid Mamba-Transformer MoE for Agentic Reasoning
NVIDIA Nemotron 3 GitHub Repo: NVIDIA-Nemotron-3-Super-120B-A12B-FP8
NVIDIA Nemotron 3 Tech Blog: New NVIDIA Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI
NVIDIA Nemotron 3 Tutorial: Nemotron 3 Super Tutorial: Multi-Token Prediction, Latent MoE, Perplexity and OpenCode Integration
Hugging Face: Get started with Nemotron 3 Nano on Hugging Face
Target Audience
This workshop is crafted for:
AI/ML Engineers & Researchers: Professionals building advanced LLMs and specialized reasoning models.
Generative AI Developers: Developers creating complex, multi-agent AI applications (such as software development or cybersecurity triaging agents).
Data Scientists: Practitioners interested in state-of-the-art training techniques like Native NVFP4 pretraining and multi-environment reinforcement learning (RL).
DevOps & MLOps Engineers: Professionals looking to deploy high-throughput, low-latency AI models using vLLM, SGLang, or NVIDIA TensorRT-LLM.
Participants should have basic knowledge of Python, containerized environments, and experience working in Jupyter/Colab or similar notebook workflows.
Why Should You Attend?
Accelerate and Optimize LLM Inference: Learn how to maximize compute efficiency without sacrificing accuracy. You will explore how Nemotron 3 Super achieves a 5x throughput increase over previous generations. We will dive into architectural breakthroughs like Latent MoE (which calls 4x as many experts for the same inference cost), Multi-Token Prediction (MTP) for built-in speculative decoding and 3x faster generation, and memory-saving Native NVFP4 pretraining.
Industry Relevant Use-Cases: Move beyond simple chatbots and into the realm of autonomous, multi-step AI agents. You will discover how to apply the model's massive 1M-token context window to real-world scenarios like dense software development and cybersecurity triaging. We will also cover the highly efficient "Super + Nano" deployment pattern, teaching you how to intelligently route simple tasks to smaller models and complex planning to the 120B Super model to optimize cloud costs.
End-to-End Practical Skills: Walk away with the hands-on experience needed to build and deploy immediately. Because Nemotron 3 Super is fully open, you will learn how to leverage NVIDIA’s open weights, datasets, and recipes. You will get practical experience using deployment cookbooks (vLLM, SGLang, TensorRT-LLM) for low-latency production, and explore fine-tuning techniques (LoRA/SFT, GRPO/DAPO) using the NeMo ecosystem to customize the model for your specific proprietary data.
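As a small taste of the deployment cookbooks covered in the workshop, here is a minimal vLLM serving sketch. This is illustrative only: it assumes vLLM is installed, that the Hugging Face model ID matches the repo name listed in the resources below, and that your GPU count and context-length settings will differ.

```shell
# Serve the FP8 checkpoint behind an OpenAI-compatible API.
# Tensor-parallel size and max context length depend on your hardware.
vllm serve nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8 \
    --tensor-parallel-size 8 \
    --max-model-len 131072

# Query the running server with a standard chat-completions request:
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8",
         "messages": [{"role": "user", "content": "Hello"}]}'
```

SGLang and TensorRT-LLM expose analogous serving entry points; the workshop walks through the official cookbooks for all three.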
Learning Objectives:
By the end of the workshop, attendees will be able to:
Understand Advanced Architectures: Grasp the mechanics behind the Hybrid Mamba-Transformer backbone, Latent MoE (which calls 4x as many experts for the same compute cost), and Multi-Token Prediction (MTP) for built-in speculative decoding.
Navigate the Training Pipeline: Understand how the model achieves stability and accuracy through Native NVFP4 pretraining and trajectory-based reinforcement learning across diverse environments (using NeMo Gym and NeMo RL).
Implement the "Super + Nano" Pattern: Learn how to architect multi-agent workflows that smartly route simple tasks to Nemotron 3 Nano and complex planning/reasoning tasks to Nemotron 3 Super.
Deploy and Fine-Tune: Utilize NVIDIA’s open resources and deployment cookbooks (vLLM, SGLang, TensorRT-LLM) to customize and run the model on your own infrastructure.
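The "Super + Nano" routing pattern above can be sketched in a few lines of Python. The keyword heuristic and model names here are illustrative assumptions standing in for a real task classifier, not an official NVIDIA API:

```python
# Hypothetical sketch of the "Super + Nano" routing pattern: send cheap,
# single-step requests to the small model and multi-step planning or
# reasoning requests to the large one, to keep cloud costs down.
PLANNING_HINTS = ("plan", "debug", "refactor", "analyze", "multi-step")

def pick_model(prompt: str) -> str:
    """Route complex planning/reasoning prompts to Super, the rest to Nano."""
    text = prompt.lower()
    # Long prompts or planning-style verbs suggest the task needs Super.
    if any(hint in text for hint in PLANNING_HINTS) or len(text.split()) > 200:
        return "nemotron-3-super"   # 120B MoE: planning and deep reasoning
    return "nemotron-3-nano"        # small model: simple, low-latency tasks

print(pick_model("Summarize this paragraph."))           # nemotron-3-nano
print(pick_model("Plan a refactor of our auth module"))  # nemotron-3-super
```

In production the routing decision would typically be made by a lightweight classifier or by the Nano model itself, but the cost structure is the same: only the hard tasks pay the price of the 120B model.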
About OSDG, IIIT Hyderabad
OSDG (Open Source Development Group) is the leading technical club of IIIT Hyderabad, focused on community building and open-source initiatives within one of India’s premier technical and research institutes. OSDG fosters a strong culture of collaboration, learning, and open contribution through hands-on projects, workshops, and community-driven programs. We are grateful to OSDG for their invaluable organizational and technical support in making this meetup possible. Learn more @ osdg.in
Interested in Speaking at a Future HydPy Meetup?
Submit your proposal by creating a GitHub issue here: http://bit.ly/hydpy-cfp
Join the Conversation
Website: hydpy.org
Twitter: @hydPython
Meetup: bit.ly/hydpy-meetup
Telegram: https://t.me/HydPy
LinkedIn: https://www.linkedin.com/company/hydpy/