Explore a 26-minute conference talk from SREcon17 Americas on the development and implementation of Sloth, a Go tool for inducing network failures in complex infrastructure environments. Learn how Indeed.com approaches building resilient systems that tolerate unreliable networks and anticipate failures. Discover how Sloth runs as a daemon on every host in the infrastructure, using traffic shaping rules and iptables to simulate a slow or lossy network for a specific service without affecting other applications on the same host. Gain insights into the tool's security features, including access control and audit logging, as well as its web UI for manual testing and its API for integration testing. Examine real-world examples of how Sloth helped uncover and resolve issues in monitoring, graceful degradation, and usability, improving system reliability and performance.