Overview
In this conference talk, Bryan Cantrill, Chief Technology Officer at Joyent, discusses how to debug complex systems while they are failing. He covers how to stay composed and troubleshoot effectively when systems are in chaos, and how software defects have evolved in service-oriented architectures, along with their widespread impact on users. The talk addresses debugging techniques, building a culture of effective troubleshooting, and handling operational failures, drawing on real-world examples that include historical accidents and the Three Mile Island incident. It also examines the role of alerts, monitoring, and postmortems in improving system reliability. The 53-minute presentation was recorded at GOTO Chicago 2017.
Syllabus
Introduction
The genesis of this story
The first operator
Total Unknown Land
We don't need a postmortem
Why were we lucky?
The nature of software
Historical accidents
Microservices
Murder Mystery
Power Systems
Three Mile Island
Alerts and Monitoring
Debugging the Way
The Art of Debugging
The Craft of Debugging
Creating a Culture of Debugging
Debugging During an Outage
We Don't Believe in Recovery
Postmortem
Operational Failure
Taught by
GOTO Conferences