Explore all talks and presentations from SREcon. Dive deep into the latest insights, research, and trends from the world's leading experts.
Explore strategies for building distributed service ownership in software teams, focusing on documentation, telemetry, and empowering teams to drive improvements in system reliability and performance.
Explore black box monitoring of 25,000 government endpoints, learning strategies to monitor resistant systems and improve digital services across the US government.
Explore GitHub's innovative 1:1 SRE outreach and incident debrief programs, designed to foster a culture of resilience through empathy and psychological safety, enhancing organizational reliability.
Explore best practices for SRE training, including sequential learning, hands-on experiences, and continuous education. Learn from Google's insights on building effective SRE teams and fostering a culture of reliability.
Explore canarying best practices, pitfalls, and strategies for safe production changes. Learn to balance priorities, handle diverse scenarios, and implement effective canary processes in software deployment.
A humorous talk contrasting idealized SRE practices with real-world challenges. The speakers debunk the myth of the perfect environment, offering practical insights and relatable experiences for SRE professionals.
Explore Netflix's multi-region strategy for improved availability and latency, including algebraic models, incident management, and design considerations for efficient failovers and user steering.
Discover how to identify and mitigate hidden vulnerabilities in microservice architectures using OpenTelemetry, with real-world examples from Google Maps' high-risk dependencies.
Explore biases in SRE, their impact on organizations, and strategies for mitigation. Learn to identify, discuss, and address cognitive biases and stereotypes to improve workplace equity and SRE integration.
Explore Wikipedia's server-side architecture, from routers to microservices, and learn how open-source technologies power one of the world's top websites.
Discover strategies to enhance organizational resilience through improved incident learning. Explore research-backed approaches to post-incident reviews and avoid common investigation pitfalls.
Practical guidance on implementing Site Reliability Engineering in smaller organizations, addressing unique challenges, gaining buy-in, and fostering a culture of continuous improvement and experimentation.
Introductory overview of formal verification techniques in industry, focusing on safety-critical systems. Explores tools, applications, and how these techniques adapt to existing infrastructure.
Explore Pinterest's journey in scaling observability tools, from metrics to log search and distributed tracing, as the company grew from startup to web-scale platform.
Explore Adaptive Paging, an innovative alert handler that uses tracing and heuristics to identify and notify the team closest to the problem, reducing alert fatigue in complex distributed systems.