Scaling Reliability - So You Want to Add a 9

Overview

Explore strategies for scaling reliability in microservice architectures through this conference talk. Learn about the challenges and benefits of modular systems, understanding system reliability through monitoring and load tests, implementing recovery mechanisms like alerts and rollbacks, and enhancing reliability with techniques such as retries and circuit breakers. Dive into debugging reliability issues using tracing and logs, and gain insights into the Finagle toolkit designed for building reliable systems. Discover how to measure reliability, manage failures, and participate in a collaborative ecosystem for improving overall system performance and resilience.

Syllabus

Introduction
Reliability
Who am I
Who Cares About Reliability
How Do We Measure Reliability
Own Your Own Failures
Background Noise Failures
probabilistic failure
impact reach high
bad day
failure detection
route around
load balancing
error codes
microservice architecture
overall success rate
failures on edges
the caller
the service
latency
observability
rollback
rollback example
rolling failures
more problems
planned events
retry storms
retry budget
cluster overload
request not getting response
bad neighbors
toggles
staging
other teams
failing collaboratively
Finagle
Is Finagle extensible
How would you participate in an ecosystem

Taught by

Devoxx
