Featured in San Francisco

Presented by

South Bay Systems

Systems meetup in the South Bay Area

Hosted By

195 Went

Tech

South Bay Systems: Apache Pinot on Object Storage / Variants in Apache Doris

Name: South Bay Systems: Apache Pinot on Object Storage / Variants in Apache Doris
Start: 2025-10-27T18:00:00.000-07:00
End: 2025-10-27T20:00:00.000-07:00
Location: Adobe Founders Tower

South Bay Systems

Adobe Founders Tower

San Jose, California

Past Event

About Event

Welcome to another edition of South Bay Systems! This time, we'll have a double feature! First we'll have Songqiao Su and Raghav Yadav talking about optimizing Apache Pinot for real-time analytics, then we'll have Owen Xiao talking about variants and semi-structured data in Apache Doris.

Agenda

6:00 PM: Doors open, food and socializing
6:30 PM — 7:00 PM: Apache Pinot Talk
7:00 PM — 7:30 PM: Apache Doris Talk
7:30 PM onward: Community socializing!

Food and beverages will be provided, courtesy of our hosts, Adobe.

Low-Latency Serving on Cloud Object Stores with Apache Pinot

In this talk, we present the evolution of Apache Pinot’s architecture: from tightly coupled storage and compute, to decoupled cloud storage, and now toward native support for Parquet as a first-class segment format. We will discuss key technical innovations such as the implementation of a Parquet-compatible forward index reader, which enables all of Pinot’s indexing strategies to operate directly on Parquet files. Additional optimizations include index pinning, Parquet page-level selective reads, page prefetching for efficient I/O parallelism, and page caching. Together, these enhancements allow Pinot’s indexing and query execution framework to deliver sub-second performance directly on Parquet data, going far beyond conventional metadata-based pruning approaches.

Speaker Bio

Songqiao Su is a Staff Software Engineer at StarTree.AI, working on building tiered storage and improving compute–storage decoupling in Apache Pinot and StarTree Cloud. His work focuses on large-scale, high-performance distributed systems. Before joining StarTree, he worked on network and RPC infrastructure at Facebook and Databricks.

Raghav Yadav is a Staff Software Engineer at StarTree.AI, working on building a low-latency serving layer on Iceberg in Apache Pinot and StarTree Cloud. His expertise spans distributed databases and large-scale systems, with experience in cloud-scale data infrastructure at Microsoft Azure, real-time streaming databases as a founding engineer at Grainite, and now real-time OLAP analytics at StarTree.

The Evolution of Semi-Structured Data Analytics: From Text, JSON to VARIANT

Abstract

Semi-structured data, such as JSON, is gaining widespread adoption due to its flexibility. However, traditional databases and data warehouses are built for structured schemas, creating new challenges in storing and analyzing semi-structured formats. In this session, we’ll explore:

Characteristics and challenges of semi-structured data
Limitations of traditional approaches
Apache Doris’ native solution for semi-structured analytics
Comparison with Snowflake, Iceberg (VARIANT type), and Elasticsearch
Real-world applications in Log Analytics, Distributed Tracing, and IoT

Speaker Bio

Owen Xiao is a co-founder of VeloDB and a PMC member of Apache Doris, where he leads product strategy, observability, and AI-driven R&D for both open-source and enterprise data platforms. With over 10 years of experience in database kernel development and distributed systems architecture, he has helped scale analytical databases for global enterprises.

Location

Adobe Founders Tower

333 W San Fernando St, San Jose, CA 95113, USA

You may park in the underground parking. Tell the gate security staff that your host is Aravind Sriram and that you're there for South Bay Systems. Then head to the lobby, register with reception, and wait there for an Adobe employee to escort you to the talk area.
