

Availability Is Not an Accident and Performance Is Not a Number
Summary
High availability and high performance are often treated as separate goals. In production, they fail together.
Under overload, retries amplify load, queues grow without bound, and delayed control signals create feedback loops that push systems into metastable states. These failures rarely come from a single bug. They emerge from the interaction of many correct components operating under real-world conditions.
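As a rough illustration of how retries amplify load, the sketch below computes the expected number of attempts per request. It assumes each attempt fails independently with the same probability, which is a simplification. The function names `expected_attempts` and `worst_case_fanout` are hypothetical and not taken from the talk.

```python
def expected_attempts(failure_prob: float, max_retries: int) -> float:
    """Expected attempts per request, assuming independent failures."""
    # Attempt k+1 is made only if the first k attempts all failed,
    # which happens with probability failure_prob ** k.
    return sum(failure_prob ** k for k in range(max_retries + 1))


def worst_case_fanout(max_retries: int, layers: int) -> int:
    """Attempts reaching the bottom layer when every call fails
    and each layer retries on its own (assumed topology)."""
    return (max_retries + 1) ** layers


if __name__ == "__main__":
    # Half of all attempts fail, up to 3 retries per request.
    print(expected_attempts(0.5, 3))
    # Three stacked layers, each retrying 3 times, during a full outage.
    print(worst_case_fanout(3, 3))
```

Under these assumptions, a 50% failure rate nearly doubles the offered load, and stacked retry layers multiply rather than add during an outage.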
Traditional approaches struggle here. Formal methods abstract away operational realities, while production debugging comes too late to explore the design space safely.
In this talk, we share lessons from building and operating large distributed systems where latency, throughput, and uptime constantly trade punches. We introduce a new class of semi-formal performance modeling tools that simulate the behavior of complex systems from a specification of their components and interactions.
We will discuss practical ways teams validate and operate large systems using instrumentation that focuses on the right metrics. We will also cover common causes of instability at scale, such as load imbalance and retry storms, and the production techniques used to contain them, including admission control, bounded queues, and load shedding.
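For readers unfamiliar with bounded queues and load shedding, here is a minimal sketch of the idea, not the speakers' implementation. The `SheddingQueue` class and its method names are hypothetical. It rejects new work once a fixed capacity is reached instead of letting the backlog grow.

```python
import queue


class SheddingQueue:
    """Bounded FIFO that sheds new work when full (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self._q = queue.Queue(maxsize=capacity)
        self.shed = 0  # count of rejected items, useful as a metric

    def offer(self, item) -> bool:
        """Try to enqueue without blocking; return False if shed."""
        try:
            self._q.put_nowait(item)
            return True
        except queue.Full:
            self.shed += 1
            return False

    def poll(self):
        """Return the next item, or None if the queue is empty."""
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None


if __name__ == "__main__":
    q = SheddingQueue(capacity=2)
    results = [q.offer(i) for i in range(5)]
    print(results, q.shed)
```

Rejecting at the door keeps queueing delay bounded by the capacity, at the cost of failing some requests fast. Which requests to shed (newest, oldest, lowest priority) is a design choice this sketch does not address.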
About the Speakers
Akshat Vig is a Distinguished Engineer at MongoDB. Previously, he spent 15 years at Amazon Web Services building multiple AWS database services.
Aleksey Charapko is an assistant professor at the University of New Hampshire. He works at the intersection of performance, reliability, and efficiency in distributed systems. Aleksey received the NSF CAREER award for his ongoing work on metastable failures, including the seminal "Metastable Failures in the Wild" paper, which demonstrates how broadly metastable failures affect industry. In addition to his academic career, Aleksey has substantial industrial and consulting experience.
Event Location
Reading room, Labour Temple, 2800 1st Ave, Seattle, WA 98121
Thank you to MongoDB for sponsoring the event.