Distributed Systems in Production: Tactics and Strategy - Lecture 32

Overview

This conference talk covers tactics and strategies for building and running distributed systems in production at scale. Topics include partial failure, metrics, profiling, and deployment strategies, along with the role of back-pressure, partial availability, and data locality in system design. The talk also covers extracting services, implementing a service-oriented architecture, and managing on-call rotations. It closes with the future of distributed systems: the added cost of robust implementations, the scarcity of robust open-source solutions, and the politics of collaboration.

Syllabus

Intro
Distributed Systems in Production (Jeff Hodges, 2014-04)
Why you should listen to me
Why you shouldn't listen to me
Scale-invariant
Building and running Distributed Systems
Quick foundation
What Makes Distributed Systems Different
Garbage collection spiral on a single machine causes requests to time out • A process is overloaded, so too many clients get stuck trying to connect to it, so it gets slower • Socket write succeeds locally, but fails on the remote machine
Partial Failure
"It's slow" is the hardest problem you'll ever debug
Metrics are the only way to get your job done.
On profiling
Deploys should change a metric
Logs are liars
Avoid coordination
If your problem fits in memory, it's probably trivial
Back-pressure
Dropping new messages on the floor • Returning documented overload errors until the system clears • Timeouts and exponential back-offs
Create partial availability
Search
Who to Follow in the monorail
Consider a private messaging database
Separating deploy from release
Roll out infrastructure with feature flags
Slow, dark rollouts
Multiple versions are the norm
Exploit data-locality
Extract services
Stricter boundaries mean even less cheating
Pulling out a service makes deploys easier
Avoids human coordination costs that libraries require.
SOA through standardization
On-call rotations
The Notorious E.O.C.
Increasing the size of my thought leadership
Robust distributed systems cost more than undistributed systems.
Robust open source distributed systems are less common
Collaboration is politics
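
The back-pressure item in the syllabus lists three tactics: dropping new messages, returning documented overload errors, and using timeouts with exponential back-offs. The sketch below is one possible illustration of how they fit together, not code from the talk. The names `BoundedInbox`, `OverloadedError`, `backoff_delays`, and `send_with_retry` are hypothetical, and the default delays are arbitrary assumptions.

```python
import queue
import random


class OverloadedError(Exception):
    """Documented error returned to callers while the inbox is full."""


class BoundedInbox:
    """Hypothetical inbox that applies back-pressure via a fixed-size queue."""

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0  # count of messages rejected because the inbox was full

    def offer(self, message):
        """Accept a message, or reject it immediately if the inbox is full."""
        try:
            self._queue.put_nowait(message)
        except queue.Full:
            # Drop the new message and tell the caller why, instead of blocking.
            self.dropped += 1
            raise OverloadedError("inbox full, retry later") from None

    def take(self):
        """Remove and return the oldest message (raises queue.Empty if none)."""
        return self._queue.get_nowait()


def backoff_delays(base=0.1, factor=2.0, cap=5.0, attempts=5, jitter=random.random):
    """Return capped exponential delays, each scaled by a jitter value in [0, 1)."""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * factor ** attempt)
        delays.append(ceiling * jitter())
    return delays


def send_with_retry(inbox, message, sleep, attempts=5):
    """Try to enqueue, sleeping with exponential back-off after each overload error."""
    for delay in backoff_delays(attempts=attempts):
        try:
            inbox.offer(message)
            return True
        except OverloadedError:
            sleep(delay)
    return False  # caller gives up after the final attempt
```

In real use, `sleep` would typically be `time.sleep`; passing it in as a parameter keeps the sketch testable without waiting. The jitter spreads retries out so that many clients backing off at once do not all return at the same moment.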

Taught by

ChariotSolutions
