Overview
Learn how Wikipedia's infrastructure evolved to handle the massive traffic spikes that follow notable deaths in this 34-minute conference talk from SREcon24 Europe/Middle East/Africa. Follow the journey from severe outages, including a major disruption in 2020 caused by the combination of a notable death and a DDoS attack, to successfully handling unprecedented traffic after the death of Queen Elizabeth II. Discover the technical improvements made by the Wikimedia Foundation team, their approach to teaching new Site Reliability Engineers about platform-specific constraints, and the emotional side of building resilient systems. Gain insight into their open-source solutions and public codebase, which can be adapted to similar high-traffic scenarios on other platforms.
Syllabus
SREcon24 Europe/Middle East/Africa - Finding the Capacity to Grieve Once More
Taught by
USENIX