Overview
Explore how Equinix Metal implemented OpenTelemetry tracing for bare metal provisioning in this 32-minute conference talk from SREcon22 Americas. Learn from Principal Engineer Amy Tobey and SRE Shelby Spees as they share their journey from frequent, long-lasting incidents to improved reliability through distributed tracing. Discover how the team created on-ramps that let engineers across the organization instrument their own code, which eased knowledge transfer and helped both veterans and newcomers debug issues more efficiently. Gain insight into the system issues that tracing uncovered and the major reliability wins that followed. Understand the challenges of rolling out OpenTelemetry across a globally distributed, multidisciplinary team that manages two dozen software services deployed on 70+ Kubernetes clusters across six continents, and the solutions the team found.
Syllabus
Introduction
Turning on OpenTelemetry
What is the problem
Setting up OpenTelemetry
Testing
Taught by
USENIX