

Listening to the World: Multimodal Search with Qdrant
What's Happening
You'll build a multimodal search system that answers questions like "What did the CEO say about margins in last quarter's earnings call?" and returns relevant audio clips, transcripts, and related news articles from a single query.
What You'll Build
Earnings call recordings from Benzinga's Earnings Gold are already being chunked, transcribed, and queued for you. You'll experiment with:
Multimodal embeddings: A single Gemini 2 Embeddings call site that places audio chunks and article text in the same vector space
Upsert + live feedback: Write points to your own Qdrant Cloud collection
Query layer: Cross-modal and multimodal retrieval with filtering and hybrid search
Search engineering: Qdrant APIs for recommendations, diversity sampling, time-decay boosting, and relevance retrieval loops
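To make the ideas in the list above concrete, here is a minimal, library-free sketch of cross-modal retrieval over a shared vector space, with a modality filter and time-decay boosting. It is an illustration under stated assumptions, not the workshop code and not Qdrant's API: the 3-dimensional vectors are toy stand-ins for real embeddings, and the names Point, search, modality, and half_life_days are hypothetical. In the workshop, a Qdrant collection would do this work at scale.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Point:
    id: int
    vector: List[float]  # toy embedding; real vectors would come from the embedding model
    modality: str        # "audio" or "article" (hypothetical payload field)
    age_days: float      # time since the call or article was published


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query: List[float],
    points: List[Point],
    modality: Optional[str] = None,
    half_life_days: Optional[float] = None,
    limit: int = 3,
) -> List[Tuple[int, float]]:
    """Rank points by similarity, optionally filtered and decayed by age."""
    hits = []
    for p in points:
        # Payload-style filter: keep only the requested modality.
        if modality is not None and p.modality != modality:
            continue
        score = cosine(query, p.vector)
        # Exponential time decay: the score halves every half_life_days.
        if half_life_days is not None:
            score *= 0.5 ** (p.age_days / half_life_days)
        hits.append((p.id, score))
    hits.sort(key=lambda h: h[1], reverse=True)
    return hits[:limit]


# Audio chunks and articles live in the same space, so one query ranks both.
points = [
    Point(1, [0.9, 0.1, 0.0], "audio", 90.0),
    Point(2, [0.8, 0.2, 0.1], "article", 2.0),
    Point(3, [0.0, 0.1, 0.9], "article", 1.0),
]
query = [1.0, 0.0, 0.0]

all_hits = search(query, points)                          # both modalities
audio_hits = search(query, points, modality="audio")      # audio clips only
fresh_hits = search(query, points, half_life_days=30.0)   # recency-boosted
```

Without decay, the older audio chunk (id 1) ranks first because it is the closest match. With a 30-day half-life, the recent article (id 2) overtakes it, which is the trade-off time-decay boosting is meant to expose.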
PLEASE BRING: Laptop, charger, headphones. We'll work with audio data, so headphones are a must!
Who Should Come
Search engineers. ML engineers. Backend developers building retrieval systems. Anyone who has worked with embeddings and wants to see what multimodal search looks like in production, and to learn from the AskNews team how to turn multimodal retrieval projects into a viable business.
What You Get
A working repo running against your own Qdrant Cloud collection
Hands-on experience with production-grade retrieval APIs
What You Need
A laptop with a browser. Headphones.
Built With
Qdrant · Gemini 2 Embeddings · AskNews · Benzinga Earnings Call Data · HackerSquad