

NYC Lakehouse Meetup
We're bringing together Apache Iceberg, Lance, and Apache DataFusion communities in NYC to chat about all things open lakehouse and data infrastructure at Cloudflare's NYC office!
Hosted by Cloudera, LanceDB, and Cloudflare
(See bottom for important registration steps)
Talk 1: Apache Iceberg - Spec Evolution (v1 to v4) and How Cloudera's Data Platform Supports It.
Speaker: Dipankar Mazumdar, Director - Developers (Cloudera)
This session will go over Apache Iceberg’s evolution and the problems each version of the specification set out to solve. After a brief look at v1 and v2, we will dive into v3 and the upcoming work (v4 and beyond), covering lineage, deletion vectors, metadata redesign, file format APIs, and why these changes matter for developers building lakehouse pipelines at scale. We will also see how Cloudera's data platform has supported Iceberg's core capabilities since it first added Iceberg support.
Talk 2: Multimodal AI Lakehouse with Lance & LanceDB
Speaker: Chang She, Co-Founder & CEO (LanceDB)
The next wave of AI applications demands seamless, scalable access to text, images, embeddings, and other complex modalities—but current lakehouse solutions still force teams into closed systems for vector search, full-text search, or feature engineering, reintroducing data silos.
In this talk, we introduce Lance, a next-generation columnar data format optimized for AI, and LanceDB, the multimodal lakehouse built on top of it. Together, they provide low-latency access, unified vector, full-text, and SQL search, and flexible schema evolution across the entire multimodal AI lifecycle, from application serving to feature engineering and large-scale training. This empowers innovators like Midjourney, WorldLabs, and Runway to build open, performant, production-grade multimodal systems at scale.
Talk 3: Cloudflare's Data Platform with Apache Iceberg & DataFusion
Speaker: Jonathan Chen, Software Engineer (Cloudflare)
In this talk, we introduce Cloudflare’s new data platform, composed of R2 Data Catalog, R2 SQL, and Pipelines. Built on Apache Iceberg and Apache DataFusion, the platform enables users to ingest, manage, and query large-scale data directly on object storage. We’ll walk through the system architecture, explain how each component fits together, and show how Cloudflare makes it possible to run SQL analytics over continuously ingested data without managing separate compute or storage systems.
IMPORTANT INFORMATION REGARDING REGISTRATION:
Please ensure that the name on your registration matches your full name as it appears on your government ID. This is required for building security purposes.
**If your Luma registration name does not match your government ID name, you will not be allowed to enter the building. You can update your Luma name under Luma > Profile > Edit Profile.**
You must show your government ID to building security when you arrive. They will verify that the name on your Luma registration matches your government ID.
You must register for this event and be accepted in order to attend.
Walk-ins are not allowed: if you are on the waitlist, or have not registered ahead of time, you will not be permitted into the building. This is strictly enforced by building security.
You will receive an email a few days before the event asking you to register with Cloudflare's Verkada system. This will grant you access to the Cloudflare floor. Please complete this registration before arriving.
If you register and can no longer attend, please indicate "Not going" on Luma to free up your space so another person can attend.
The Cloudflare office is on the 88th floor. To access the 88th floor, you'll need to take the elevator up to the 64th floor and cross over to the "I" bank elevators to get to our NYC hub.
Thanks for bearing with us through these steps. Trust us, the view is worth it! Looking forward to seeing you all on the day of the event!