

Next-Gen Data Engineering: Embracing AI in Open Source
Join us October 23rd at the Silicon Valley AI Hub in Snowflake’s Menlo Park campus for an evening exploring how AI is reshaping the roadmaps of today’s leading open source data engineering technologies.
AI isn’t just creating new use cases. It’s changing the very foundation of data engineering. Once seen as a supporting layer, technologies like Apache Iceberg™, Apache Spark™, Apache Parquet, Apache Flink®, and Apache Polaris (incubating) are now evolving to play a central role in powering the next wave of AI innovation.
This discussion-driven event will feature engineers and open source project leaders who will share how AI is influencing their projects and what new features are on the horizon.
If you’re a data engineer or practitioner, you’ll get a firsthand look at the innovations coming your way, engage directly with project contributors, and learn how you can shape the future of these technologies. Plus, there will be plenty of time to network and connect with peers across the community.
Don’t miss this chance to see how open source data engineering is adapting to the age of AI—and how you can be part of it!
Agenda
5:30 pm - 6:00 pm: Doors Open & Networking
6:00 pm - 8:00 pm: Welcome Remarks & Presentations
8:00 pm - 9:00 pm: More Networking
Talks
More details to come, but here's what you can expect:
Scaling Apache Spark at OpenAI by Chao Sun, OpenAI
Column Storage for the AI Era by Julien Le Dem, Datadog
Open Lakehouse Meets AI: MCP, Unstructured Data, and Model Hosting in Polaris by Yufei Gu, Snowflake
Accelerated LLM Inference with Apache Spark at Scale by Rishi Chandra and Lee Yang, NVIDIA
Data-Centric AI with Flink SQL by Hao Li, Confluent
Apache Iceberg™ for the AI-Ready Lakehouse by Huaxin Gao, Snowflake