OLake Community Events
We organise community events and webinars on data engineering topics such as CDC, Apache Iceberg, and ETL from databases to data lakehouses.

Lakehouse at Scale: Doris x OLake

Register to See Address
Bengaluru, India
Registration
Approval Required
Your registration is subject to host approval.
About Event

Lakehouse at Scale

Apache Iceberg adoption is accelerating, and with it come two operational realities data teams are running into head-on: table maintenance at scale and the demand for real-time, accurate retrieval powering AI systems.

This meetup brings together practitioners and contributors working at both ends of that problem. Expect technical depth, real production context, and open discussion with engineers who are actively building and operating lakehouse infrastructure.

Agenda

11:00 - 11:30 | Registration and Welcome

11:30 - 12:10 | How Apache Doris Powers AI Agents with Hybrid Search and Real-Time Analytics
Matt Yi, Apache Doris PMC Member, Tech VP at VeloDB

  • Why single-method retrieval (vector-only or keyword-only) breaks down in production AI systems

  • Hybrid search architecture: combining vector search, full-text search, and SQL for accurate, intent-aware retrieval

  • How Apache Doris's native real-time OLAP capability extends into real-time RAG pipelines

  • Cost and accuracy tradeoffs across retrieval strategies and what that means for context engineering at scale

12:10 - 12:50 | OLake Fusion: Solving Apache Iceberg Table Maintenance Problems at High Scale
Ankit Sharma, Tech Lead, and Badal Prasad Singh, Software Engineer, OLake

  • Why continuous CDC ingestion at scale creates small file accumulation and query performance degradation in Iceberg tables

  • Compaction strategies (lite, medium, full) and how to choose the right mode based on workload and file size targets

  • Cron-based scheduling, table enable/disable controls via Helm and Docker Compose

  • Multi-catalog support and lessons from building maintenance systems that do not interrupt live ingestion

12:50 - 1:00 | Break

1:00 - 1:30 | Apache Doris User Sharing
Nilanjan Sarkar

  • Production experience taking Apache Doris from evaluation to live deployment

  • Practical challenges and decisions made along the way

1:30 | Lunch and Networking

Location
Please register to see the exact location of this event.
Bengaluru, India