53 Going

Open Lakehouse and AI

Hosted by Open Source Analytics Community & 4 others
About Event

The OSA Community is proud to host the Open Lakehouse and AI event in Austin!

As real-time databases integrate more closely with data lakes to reduce storage costs and unlock data for AI and advanced analytics, data infrastructure is evolving fast. Join us to hear from leading experts as they share practical solutions and lessons learned in building open, scalable, and high-performance data platforms.

Food and beverages will be provided!


Speakers

  • Robert Hodges, CEO @ Altinity

  • Steve Anness, Senior Customer Success Architect @ Grafana

  • Kaisen Kang, Head of Query & Agent Team @ CelerData

  • Andrew Madson, Head of Developer Relations @ Fivetran

Agenda

  • 6:00 pm - Check-in and networking

  • 6:15 - 8:00 pm - Talks

  • 8:00 - 9:00 pm - Networking


Description of the Talks

Building a Foundation for AI with ClickHouse® and Apache Iceberg Storage

Speaker: Robert Hodges, CEO @ Altinity

Abstract: AI applications need data. Lots of it. Altinity's Project Antalya is adapting open source ClickHouse® to introduce separation of compute and storage on shared Iceberg table data. The result: fast, cheap, flexible queries that extend the life of real-time analytic applications and lay the foundation for handling new AI use cases on the same datasets. We cover architecture, performance results, roadmap, and how to get started yourself.

Visualizing Your Data Lake with Grafana

Speaker: Steve Anness, Senior Customer Success Architect @ Grafana

Abstract: In this brief talk, we’ll walk through how to get started with Grafana’s open source platform to explore and understand your data lake. We’ll cover how to connect to your data, no matter where it lives; craft queries that turn raw information into clear, compelling visualizations; and set up alerts and annotations so you’re always in the know when something important changes in your data lake.

What AI Data Agents Need from an Analytics Engine

Speaker: Kaisen Kang, Head of Query & Agent Team @ CelerData

Abstract: AI data agents rely on iterative, agent-generated SQL to answer questions, explore data, and refine results across multiple turns. In production, this places strict demands on the analytics engine: low-latency execution to maintain conversational flow, high concurrency to support many users and agents, efficient joins and aggregations for real analytical workloads, and strong controls to prevent runaway cost or unsafe queries.

This talk outlines 10 core engine capabilities required to support AI data agents in practice, using StarRocks as an example. We’ll examine how modern analytical engines handle agent-driven query patterns, frequent re-computation, real-time and semi-structured data, and governance at scale—and what to look for when evaluating an engine for AI-powered analytics.

Iceberg for Agents: Elevating Lakehouse Data Into AI-Ready Context

Speaker: Andrew Madson, Head of Developer Relations @ Fivetran

Abstract: AI agents fail in production because even though they're stuffed with data, they're starved for context. The LLMs aren’t the problem. The bottleneck is the data stack: fragmented silos, inconsistent definitions, and logic hidden in tribal knowledge. Agents need structured, reliable, and interpretable context, not just data access.

In this session, we'll show how Apache Iceberg becomes the backbone of AI-ready pipelines. You’ll learn how to elevate your Iceberg implementation from a storage format to a live context layer that powers structured retrieval-augmented generation (RAG), schema-aware agents, and autonomous reasoning grounded in truth.

What we’ll cover:

  • Iceberg Foundations for AI - from ACID to Time Travel

  • From Rows to Relationships - The role of the semantic layer

  • Structured RAG in Practice - Fully open source

The session includes a live demo of a fully open-source Structured RAG stack built on Apache Iceberg, featuring semantic query translation, hybrid retrieval, and governed agent reasoning. Expect architecture diagrams, real code, and practical guidance.

Location
Q-Branch
200 E 6th St #310, Austin, TX 78701, USA