
Theory Meets Practice #2: Giving Eyes to your AI: Engineering Multimodal Agents with Haystack

Hosted by BLISS Berlin & dida
Registration
Approval Required
Your registration is subject to host approval.
Welcome! To join the event, please register below.
About Event

We are excited to host a BLISS x dida workshop featuring Bilge Yücel from deepset, who will guide us through an interactive session on Multimodal Agents.

Title: Giving Eyes to your AI: Engineering Multimodal Agents with Haystack
📅 Date: 21.01.2025
🕕 Time: 18:00
📍 Location: TU Berlin Marchstrasse 23 [Room visible on registration]

The session will last around 2 hours, followed by a networking session with dida and fellow AI enthusiasts (and free pizza! 🍕). Bring your laptop!


Abstract: Large Language Models are no longer limited to text—modern AI systems can see, reason across modalities, and act on rich, multimodal inputs. With these new capabilities comes increased complexity: how do we design multimodal agents that reliably understand visual information, integrate it with language, and remain controllable in real-world applications?

In this workshop, we explore how to give your AI eyes by engineering powerful multimodal agents using Haystack. The session takes a hands-on, practical approach, guiding participants through the integration of visual inputs, image–text understanding, and the orchestration of modular agent pipelines for production-ready systems.

Through concrete examples, we dive into the architecture of multimodal agents and demonstrate how visual and textual information can be jointly retrieved, grounded, and used to drive informed decisions and actions.

Our speaker, Bilge, is a Senior Developer Relations Engineer at deepset, helping developers build agentic AI apps with Haystack. Passionate about AI, she makes complex concepts approachable through hands-on tutorials, both online and at real-life events.

In this session, we will cover:

  • Multimodal Agent Architectures: Building flexible agent pipelines with Haystack that combine text, vision, and retrieval components (see the short code sketch after this list).

  • Visual Understanding & Grounding: Processing images and visual context and linking them to language and external knowledge.

  • Tool Use & Orchestration: Steering multimodal agents, invoking tools, and managing decision logic in complex workflows.

  • Practical Design Patterns: Best practices, common pitfalls, and strategies for building robust, extensible multimodal systems.
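
To give a flavour of the building blocks used in the session, below is a minimal sketch of a multimodal call in Haystack. It assumes a recent Haystack 2.x release with multimodal ChatMessage support (the ImageContent dataclass), an OpenAI API key in the environment, and a hypothetical image file; the exact classes and code used in the workshop may differ.

```python
# Minimal sketch, not the workshop's exact code. Assumes a recent Haystack 2.x
# release with multimodal ChatMessage support and OPENAI_API_KEY set.
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage, ImageContent

# Wrap a local image (hypothetical file name) and a question in one user message.
image = ImageContent.from_file_path("invoice.png")
message = ChatMessage.from_user(
    content_parts=["What is the total amount on this invoice?", image]
)

# A vision-capable chat model answers using both the text and the image.
llm = OpenAIChatGenerator(model="gpt-4o-mini")
result = llm.run(messages=[message])
print(result["replies"][0].text)
```

In the workshop, building blocks like this are composed into full agent pipelines with retrieval and tool use, all running in the provided Google Colab environment.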

Logistics:
Please bring your laptop to code along. The full environment will be provided via Google Colab.

Who is this event for?
This workshop is open to anyone curious about building multimodal AI systems—including students, PhD candidates, and industry practitioners. A basic understanding of Python and Large Language Models is sufficient to participate.


We are BLISS e.V., Berlin’s AI community connecting like-minded individuals passionate about machine learning and data science. Our BLISS Workshops connect students and young professionals with industry partners, offering an inside look into how machine learning is applied in real-world settings, from research and development to deployment.

We are dida, scientific engineers who believe that reliable AI should not be a "black box". That is why we prioritize transparency, mathematics, and code over hype. By bridging the gap between theoretical research and production, we develop custom white-box AI solutions that are fully explainable, rigorously engineered, and free of "magic".

dida Website: https://dida.do

dida YouTube: https://www.youtube.com/@dida-do

BLISS Website: https://bliss.berlin

BLISS YouTube: https://www.youtube.com/@bliss.ev.berlin
