Presented by
Apache Spark
Hosted By

Spark Connect: NVIDIA Accelerator for Spark SQL and MLlib

About Event

Please join us 🤝 to learn more about Apache Spark™, Spark Connect, and Spark ML at NVIDIA.

📅 Date: October 29, 2025
⏰ Time: 9:30 AM - 10:30 AM PST (45-minute talk, then Q&A)
📍 Location: online (live streaming to LinkedIn, X & YouTube)

Agenda:

  • Welcome and Introductions

  • Talk: GPU Accelerated Apache Spark™ Connect: NVIDIA Accelerator for Spark SQL and MLlib

  • Q&A

Talk: GPU Accelerated Apache Spark™ Connect: NVIDIA Accelerator for Spark SQL and MLlib

Abstract:
Spark Connect, first included in Apache Spark™ 3.4 and recently extended to MLlib in Spark 4.0+, introduced a new way to run Spark applications, in which clients communicate with the cluster over a gRPC protocol. This has many benefits, including easier adoption for non-JVM clients, independence of application versions from the cluster's Spark version, and increased stability and security of the associated Spark clusters.

In this talk, we will demonstrate how the recent Spark Connect extension for ML, together with Spark SQL’s existing plugin interface, can be used with NVIDIA's open source GPU-accelerated plugins for ML and SQL. The combination enables end-to-end GPU acceleration of Spark applications over Spark Connect with no code changes, delivering up to 9x the performance at an 80% cost reduction.

We will introduce a working pattern for Spark Connect with accelerated ETL and ML for use in lakehouses. We will discuss how such an architecture can be used in practice and provide a few industry use cases.
