

Building Real-Time Video Agents with VAST DataEngine
Most video AI demos stop at simple playback or offline analysis. Real-time video intelligence at scale requires ingesting streams, processing content, and retrieving meaningful insights instantly.
WHAT YOU'LL BUILD
A working real-time video agent powered by VAST DataEngine.
You'll implement a full pipeline, from ingesting video streams to generating summaries, detecting events, and retrieving relevant moments using embeddings.
By the end, you'll have a system you can run, tweak, and take back to your team: one that processes video in real time, flags key events, and integrates with downstream tools like Slack.
Your pipeline will:
Ingest video via event-driven triggers (S3 buckets)
Generate LLM-powered video summaries
Detect events from video streams
Create video embeddings for semantic search
Retrieve relevant video segments using vector search
Send automated notifications for key events
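To give a feel for the event-driven flow listed above, here is a minimal sketch in plain Python. It is not the VAST DataEngine API or the workshop's starter repo: `VideoResult`, `parse_s3_event`, `handle_event`, and the `summarize`, `detect`, and `notify` callables are hypothetical names, and the event is assumed to follow the standard S3 notification layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`).

```python
from dataclasses import dataclass, field
from typing import Callable
from urllib.parse import unquote_plus


@dataclass
class VideoResult:
    """Outcome of processing one uploaded video (hypothetical container)."""
    bucket: str
    key: str
    summary: str = ""
    events: list[str] = field(default_factory=list)
    notified: bool = False


def parse_s3_event(event: dict) -> list[tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3-style event notification."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications (spaces as '+').
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def handle_event(
    event: dict,
    summarize: Callable[[str, str], str],
    detect: Callable[[str, str], list[str]],
    notify: Callable[[str], None],
) -> list[VideoResult]:
    """Run summarise -> detect -> notify for every object in the event.

    The three callables are stand-ins (assumptions) for an LLM summary call,
    an event detector, and a Slack-style notifier.
    """
    results = []
    for bucket, key in parse_s3_event(event):
        result = VideoResult(bucket, key)
        result.summary = summarize(bucket, key)
        result.events = detect(bucket, key)
        if result.events:  # only notify when something was detected
            notify(f"{key}: {', '.join(result.events)}")
            result.notified = True
        results.append(result)
    return results


if __name__ == "__main__":
    sample = {"Records": [{"s3": {"bucket": {"name": "videos"},
                                  "object": {"key": "clips/front+door.mp4"}}}]}
    out = handle_event(
        sample,
        summarize=lambda b, k: f"stub summary for {k}",
        detect=lambda b, k: ["person_detected"],
        notify=print,
    )
    print(out[0])
```

Passing the stages in as callables keeps the handler testable with stubs; in a real deployment each one would be swapped for the actual model or messaging call.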
KEY TOPICS
Event-driven architectures for video processing
Building with VAST DataEngine for AI pipelines
LLM-based video summarisation
Video embeddings and vector search
Designing scalable, real-time video pipelines
Translating prototypes into production systems
AGENDA
4:00 PM — Doors Open: Welcome & Check-In
Check in with security, take the elevator to the 7th floor, and grab a coffee, water, or soda.
4:30 PM — Framing & Vision: What We’re Building and Why
4:45 PM — Live Demo: End-to-End Video Agent in Action
5:00 PM — Guided Build Part 1: Core DataEngine Foundations
(Connect to VAST lab, trigger functions, LLM integration)
6:00 PM — Break
6:10 PM — Guided Build Part 2: Production Features
(Video embeddings, vector queries, user-facing applications)
6:55 PM — Production Wrap-Up: Scaling to Real-World Systems
7:10 PM — Q&A & Next Steps
7:25 PM — Networking with Peers and the VAST Team
8:00 PM — Event Close
LEARNING OUTCOMES
By the end, you'll be able to:
Explain how VLM-powered video agents work in real-time production environments
Use VAST DataEngine to build scalable pipelines for video ingestion and processing
Implement an end-to-end workflow: ingest → process → summarise → embed → retrieve
Apply vector search to surface relevant insights from large-scale video data
Design event-driven architectures for automating video intelligence systems
Understand how to take a prototype and extend it into a production-ready setup
Confidently adapt and reuse the starter repo for real-world use cases
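As a rough illustration of the embed → retrieve step named in the outcomes above, the sketch below ranks video segments by cosine similarity to a query vector. It is a toy, in-memory stand-in and not how VAST DataEngine performs vector search: the function names, the segment IDs, and the two-dimensional vectors are made up for the example, and real embeddings would come from a model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Return the k segment IDs most similar to the query, best first."""
    scored = [(segment_id, cosine_similarity(query, vec)) for segment_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical segment embeddings; real ones would have hundreds of dimensions.
    index = {
        "cam1_00:10": [1.0, 0.0],
        "cam1_00:20": [0.0, 1.0],
        "cam2_00:05": [1.0, 1.0],
    }
    for segment_id, score in top_k([1.0, 0.1], index, k=2):
        print(segment_id, round(score, 3))
```

A brute-force scan like this is fine for a handful of segments; at scale the same ranking is typically delegated to a vector index or database.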
WHO SHOULD ATTEND
Intermediate to senior developers, ML/AI engineers, agent builders, and data engineers.
Industries: AI, Media & Entertainment, Financial Services
PREREQUISITES
Required:
Laptop
Comfortable coding in Python
Familiarity with APIs and basic ML workflows
Helpful (not required): Experience with LLMs, embeddings, or event-driven systems
Setup: You'll connect to the VAST lab environment (no local setup required). Instructions will be sent 3-5 days before the workshop.
Seats are limited: register now to secure your spot!