

Apache Iceberg™ Meetup North Carolina
Join us on February 12th (Thursday) from 6:00-9:30 PM
Connect with fellow enthusiasts, share insights, and dive into the latest developments in the Apache Iceberg™ ecosystem! Whether you're a seasoned pro or new to Apache Iceberg, this meetup is the perfect place to exchange ideas and spark innovation.
Agenda
6:00p - 6:30p: Doors Open & Networking 💃
6:30p - 8:30p: Welcome Remarks & Presentations!
8:30p - 9:30p: More Networking 🕺
The event will focus on use cases around and innovations in Apache Iceberg.
We will discuss topics around Open-Source Data Analytics, Open Table Formats (OTF), software concepts like Transactional Data Lakes or Lakehouse, advancements in AI/ML including generative AI, and many more topics of mutual interest that leverage Apache Iceberg.
During the sessions, we will share tips for getting involved in the community. You will also learn how the community is collaborating to grow the technology, and about software and solutions that ease problem solving and improve user experiences.
Topics
Introducing Floecat: A Catalog of Catalogs for the Modern Lakehouse
Modern data teams rarely operate a single lakehouse catalog. Iceberg REST, Hive, Glue, Polaris, Unity, and custom catalogs often coexist across clouds, leaving users with fragmented governance, uneven metadata quality, and limited visibility across systems. While engines can increasingly federate data, they still lack a consistent way to federate metadata.
Floecat is a new open source catalog of catalogs that sits in front of existing Iceberg and Delta catalogs and presents them through a single, vendor-neutral API. It does not replace upstream systems or require migrating data. Instead, Floecat aggregates, normalizes, and enriches metadata so query engines, tools, and AI agents can discover tables, schemas, snapshots, and statistics across heterogeneous environments.
In this talk, I'll introduce Floecat’s architecture, including connectors to upstream catalogs, automated generation and augmentation of planner-grade statistics, and how engines like Trino and DuckDB use Floecat’s APIs to perform reproducible query planning across multiple lakes without lock-in or re-platforming.
Mark Cusack is the CTO at Yellowbrick Data. Prior to this, he was VP Data and Analytics at Teradata, leading product management for the database. Before that, VP Analytical Ecosystem at Teradata. And before that, Chief Architect for IoT analytics @teradata. Going even further back, Chief Architect and founding developer @RainStor. Was once a PhD physicist, and a researcher in parallel simulation. And if we’re really stretching it, he used to have a Saturday job at RadioShack when he was at high school.
📲 Follow Mark on LinkedIn
Iceberg Catalogs Landscape: From Hive to REST and Beyond
Iceberg catalogs serve as the control plane for your table metadata, but choosing the right catalog implementation can significantly impact functionality, performance, and operational complexity. Let's demystify the Catalog ecosystem and provide a guide to its rapid evolution.
We'll examine the architectural role of catalogs when using Iceberg, then survey the leading implementations, including Hive Metastore, AWS Glue, Polaris, and Unity Catalog, and touch on the REST Catalog specification. Key topics will include feature comparisons, integration patterns with query engines and processing frameworks, and trade-offs between hosted services and self-managed deployments.
In his spare time, Matt Topol likes to bash his head against a keyboard, develop/run delightfully demented games of fantasy for his victims--er--friends, and share his knowledge with anyone interested who'll listen to his rants.
📲 Follow Matt on LinkedIn
The Triple Pruning Strategy for High-Performance Iceberg Streaming CDC Reads
Apache Iceberg provides the primitives for reading CDC records, but building an efficient production-grade streaming source requires solving complex distributed systems challenges. This session provides a deep architectural analysis of Apache Beam’s new Iceberg streaming CDC source, dissecting how snapshot, partition, and file pruning work in concert to eliminate the biggest bottleneck: unnecessary data shuffling. Attendees will learn how to take full advantage of Iceberg’s incremental changelog scan to reduce latency and compute costs.
Ahmed Abualsaud is an Apache Beam Committer and software engineer at Google.
📲 Follow Ahmed on LinkedIn
About Columnar
Columnar is a data infrastructure startup focused on revolutionizing data connectivity with high-performance, Apache Arrow-powered ADBC (Arrow Database Connectivity) drivers, aiming to bring speed, simplicity, and security to the data stack for AI and analytics workloads. They provide tools like their dbc command-line interface to easily install and manage these drivers, making modern, efficient data access easier across platforms like Snowflake, BigQuery, and Trino.
About Coginiti
Coginiti is a secure data operations platform that helps teams clean, transform, and model data for AI, BI, and operational apps. It unifies powerful SQL, a collaborative, versioned workspace, an analytics catalog, and CoginitiScript for reusable development, plus an AI assistant and semantic layer for trusted, explainable insights. Coginiti is trusted by some of the world’s most secure organizations across defense, healthcare, and financial services.
About Analytics8
Analytics8 is a data and analytics consulting firm that helps organizations turn data into trusted insights and measurable business impact. With expertise spanning strategy, data engineering, analytics, and AI, Analytics8 partners with teams to modernize data platforms, improve decision-making, and build scalable, future-ready analytics solutions.