Overview
Explore why resilience matters in software systems in this conference talk from GOTO Copenhagen 2015. The talk argues that without resilience, nothing else about an application matters. Learn about designing for failure, managing complexity, and handling errors robustly. Topics include resilience in social systems, embracing crashes, state management, and preventing cascading failures, along with the roles of supervision, self-healing, and error kernels in building resilient systems. The talk also covers diversity, redundancy, and decoupling in time for distributed systems, and closes with resilient protocols and testing tools.
Syllabus
Introduction
What is resilience
What is complex
Design for failure
Social systems resilience
Meerkats
Resilience is nothing
How to manage failure
Root cause
Embrace crash
Recursive wrist
State management
Failure management
Cascading failure
Supervision
Self-healing
Error Kernel
Diversity and redundancy
Distributed systems
Life beyond the illusion of present
Decoupling in time
Resilient protocols
Testing Tools
Summary
Taught by
GOTO Conferences