Hands-on Workshop: Give your AI Agents Eyes and Ears
LLMs gave us reasoning. RAG gave us retrieval. Tool calling gave us action. What’s missing in the modern agent stack is perception: the ability to see, hear, and remember the world as it happens.
This workshop is a practical walkthrough of building a perception layer for agents using VideoDB. You’ll learn how to convert continuous media (screen, mic, camera, RTSP, files) into a structured context your agent can use:
Indexes (searchable understanding)
Events (real-time triggers)
Memory (episodic recall with playable evidence)
We’ll implement the core loop:
Continuous Media → Perception Layer (VideoDB) → Agent (reasoning + action) → Output grounded in evidence
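To make the loop concrete, here is a minimal sketch in plain Python. All names (`Moment`, `PerceptionLayer`, `agent_answer`) are hypothetical stand-ins, not VideoDB's API: the perception layer ingests timestamped slices of media, the index makes them searchable, and the agent answers by citing a specific, playable moment rather than an ungrounded claim.

```python
from dataclasses import dataclass, field

# Hypothetical minimal types for the perception loop; a real system
# (e.g. VideoDB) provides richer multimodal indexing and search.

@dataclass
class Moment:
    """A timestamped slice of media the agent can cite as evidence."""
    start: float     # seconds into the stream
    end: float
    transcript: str  # what was seen/heard, as text

@dataclass
class PerceptionLayer:
    """Turns continuous media into searchable, timestamped context."""
    index: list[Moment] = field(default_factory=list)

    def ingest(self, moment: Moment) -> None:
        # "Indexes": accumulate searchable understanding as media arrives.
        self.index.append(moment)

    def search(self, query: str) -> list[Moment]:
        # Naive keyword match stands in for semantic/multimodal search.
        q = query.lower()
        return [m for m in self.index if q in m.transcript.lower()]

def agent_answer(perception: PerceptionLayer, question: str) -> str:
    """Agent step: retrieve moments and ground the answer in evidence."""
    hits = perception.search(question)
    if not hits:
        return "No matching moment found."
    m = hits[0]
    return f'Found it at {m.start:.0f}s-{m.end:.0f}s: "{m.transcript}"'

# Usage: feed the layer continuously, then ask a grounded question.
p = PerceptionLayer()
p.ingest(Moment(12.0, 18.0, "the deploy failed with a timeout"))
p.ingest(Moment(95.0, 101.0, "restarting the worker fixed it"))
print(agent_answer(p, "deploy"))
```

The key design point is that every answer carries timestamps, so “show me the moment” is always one seek away.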
Who Should Attend:
Engineers building agents that need continuous and temporal awareness (not one-shot screenshots).
Research teams building in physical AI, desktop robots, and wearables.
Product teams building meeting bots, desktop copilots, monitoring/ops, and QA/compliance.
Founders building multimodal apps where “show me the moment” matters.
What You’ll Discover:
What “perception” actually means for agents: continuous, temporal, multi-source, searchable, actionable.
How to support three input modes with one mental model: files, live streams, desktop capture.
How to build searchable memory so your agent can retrieve results with playable evidence, not vibes.
How to move from batch video AI to real-time event streams your agent can react to immediately.
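The last point above can be sketched as a subscription pattern. The names here (`Event`, `EventBus`) are illustrative assumptions, not a real API: instead of polling finished files, the perception layer emits typed events as media arrives, and the agent registers handlers that fire the moment a trigger occurs.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Event:
    kind: str         # e.g. "person_entered", "keyword_spoken" (illustrative)
    timestamp: float  # seconds into the live stream
    detail: str

class EventBus:
    """Routes real-time perception events to agent handlers."""
    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[Event], None]]] = {}

    def on(self, kind: str, handler: Callable[[Event], None]) -> None:
        # Agent subscribes to the triggers it cares about.
        self._handlers.setdefault(kind, []).append(handler)

    def emit(self, event: Event) -> None:
        # Perception layer pushes events as the stream is processed.
        for handler in self._handlers.get(event.kind, []):
            handler(event)

# Usage: the agent reacts immediately, not after a batch job completes.
bus = EventBus()
alerts: list[str] = []
bus.on("keyword_spoken", lambda e: alerts.append(f"{e.timestamp:.0f}s: {e.detail}"))
bus.emit(Event("keyword_spoken", 42.0, "action item: ship the demo"))
print(alerts)
```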
Plus:
A starter template you can reuse: “Index + Events + Memory” as the default perception stack.
Networking with builders working on agents + multimodal infra.