

OLake 10th Community Call
What we’ll cover
New Sources Added
We’ve expanded OLake’s source ecosystem with new, production-ready integrations:
S3 Source Integration
Read data directly from S3-compatible storage in CSV, JSON, and Parquet formats.
Works with AWS S3, MinIO, and LocalStack, supports IAM-based authentication, and enables flexible file discovery using glob patterns.
MSSQL Source
Native support for Microsoft SQL Server as a source, allowing teams to ingest data from existing MSSQL deployments and write it into Apache Iceberg.
DB2 Source
Enterprise-grade DB2 source support, enabling seamless ingestion from IBM DB2 into Iceberg-backed lakehouse architectures.
For all new sources, documentation has been added so teams can easily plug these into their existing architecture and push data into Apache Iceberg.
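To make the glob-based file discovery mentioned for the S3 source more concrete, here is a minimal sketch in Python. It is not OLake's implementation: the `discover_files` helper and the object keys are hypothetical, and it uses the standard-library `fnmatch` module, whose `*` also matches `/`. OLake's own glob semantics may differ, so check the source documentation before relying on a pattern.

```python
from fnmatch import fnmatchcase


def discover_files(keys, pattern):
    """Return the object keys that match a glob pattern, in sorted order.

    Note: fnmatch's `*` matches any characters, including `/`.
    This is an assumption of the sketch, not documented OLake behavior.
    """
    return sorted(k for k in keys if fnmatchcase(k, pattern))


# Hypothetical object keys, as a bucket listing might return them.
keys = [
    "events/2024/01/part-0001.parquet",
    "events/2024/01/part-0002.parquet",
    "events/2024/02/part-0001.parquet",
    "events/2024/02/_SUCCESS",
    "exports/users.csv",
]

# Only the January Parquet files.
parquet_jan = discover_files(keys, "events/2024/01/*.parquet")

# Every Parquet file under events/, at any depth (because `*` crosses `/`).
all_parquet = discover_files(keys, "events/*.parquet")
```

The point of the pattern is that marker files such as `_SUCCESS` and unrelated prefixes are filtered out before any file is read.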
MOR → COW Architecture Improvements
OLake ingests CDC data using Merge-on-Read (MOR) with equality deletes. However, many query engines (such as Databricks and Snowflake) do not fully support equality deletes, which can lead to incorrect query results.
To address this, we’ve introduced a MOR to COW compaction script that:
Periodically converts MOR tables into Copy-on-Write (COW)
Produces clean, query-ready Iceberg tables
Uses WAP (Write-Audit-Publish) for atomic checkpointing
Supports idempotent re-runs and automatic failure recovery
Ensures correctness without sacrificing ingestion performance
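The idea behind the conversion can be illustrated with a small in-memory model. This is a conceptual sketch only, not the compaction script itself: `DataRow`, `EqualityDelete`, and `compact_to_cow` are hypothetical names, and the WAP checkpointing and failure recovery are left out. It follows the Iceberg rule that an equality delete applies to rows written with a strictly lower data sequence number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRow:
    seq: int    # data sequence number of the commit that wrote the row
    key: int    # primary key, used as the equality-delete column
    value: str


@dataclass(frozen=True)
class EqualityDelete:
    seq: int    # data sequence number of the commit that wrote the delete
    key: int


def compact_to_cow(rows, deletes):
    """Apply equality deletes and return only the surviving rows.

    A row is dropped when some delete with the same key has a strictly
    higher sequence number than the row.
    """
    latest_delete = {}
    for d in deletes:
        if d.key not in latest_delete or d.seq > latest_delete[d.key]:
            latest_delete[d.key] = d.seq
    return [
        r for r in rows
        if r.key not in latest_delete or r.seq >= latest_delete[r.key]
    ]


# Commit 1 inserts keys 1 and 2; commit 2 updates key 1 (delete + new row);
# commit 3 deletes key 2.
rows = [DataRow(1, 1, "a"), DataRow(1, 2, "x"), DataRow(2, 1, "b")]
deletes = [EqualityDelete(2, 1), EqualityDelete(3, 2)]

cow_rows = compact_to_cow(rows, deletes)

# After compaction no delete files remain, so re-running changes nothing.
rerun = compact_to_cow(cow_rows, [])
```

An engine that ignores the deletes would see all three rows in `rows`; after compaction it sees only the current state in `cow_rows`, which is why the COW copy is safe to query.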
Kubernetes & Job Execution Enhancements
We’ve introduced major improvements to job execution and scheduling:
Transition from Job Mapping to Job Profiles
Zero-based mapping support
Full Kubernetes scheduling control using:
NodeSelector
Tolerations
Affinity
Backward compatibility with existing job mappings
These changes provide better scalability, flexibility, and control in Kubernetes-based deployments.
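As a rough illustration of what the three scheduling controls look like, the sketch below builds a Kubernetes pod spec fragment in Python. The field names (`nodeSelector`, `tolerations`, `affinity`) are standard Kubernetes pod spec fields; the `job_profile` layout, the `apply_scheduling` helper, and all label and taint values are assumptions made for this example and do not reflect OLake's actual Job Profile format.

```python
import json


def apply_scheduling(pod_spec, profile):
    """Return a copy of pod_spec with the profile's scheduling fields set."""
    merged = dict(pod_spec)
    for field in ("nodeSelector", "tolerations", "affinity"):
        if field in profile:
            merged[field] = profile[field]
    return merged


# Hypothetical profile; labels, taint keys, and the zone are placeholders.
job_profile = {
    # Run only on nodes carrying this label.
    "nodeSelector": {"workload": "olake-sync"},
    # Allow scheduling onto nodes tainted dedicated=olake:NoSchedule.
    "tolerations": [
        {
            "key": "dedicated",
            "operator": "Equal",
            "value": "olake",
            "effect": "NoSchedule",
        }
    ],
    # Require a node in a specific zone.
    "affinity": {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [
                    {
                        "matchExpressions": [
                            {
                                "key": "topology.kubernetes.io/zone",
                                "operator": "In",
                                "values": ["us-east-1a"],
                            }
                        ]
                    }
                ]
            }
        }
    },
}

pod_spec = apply_scheduling({"restartPolicy": "Never"}, job_profile)
print(json.dumps(pod_spec, indent=2))
```

A profile that sets none of the three fields leaves the pod spec unchanged, which is one way existing jobs could keep their current behavior.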
Community Highlights
This call also focuses on the people behind OLake:
Contributor spotlights and shoutouts
Updates from the Social Winter of Code (SWOC) program
Recognition of new contributors and their impact
Highlights from recent community blogs and company case studies
Future Events
We’ll close the session by sharing upcoming:
Community calls
Hackathons and workshops
Opportunities to contribute and get involved