Learn about continuous reliability practices at Grafana Labs in this technical conference talk, which covers real-world challenges and solutions in maintaining observability tools. Explore how the company solved a mystery that cost more than $100,000, scaled Mimir clusters to handle 1.3 billion time series, and optimized Loki clusters to process 324 TB of logs per day. Gain insights into the internal monitoring dashboards used for Grafana Cloud and the lessons learned from production incidents and system failures. Through candid discussion of past challenges and current improvements, understand the practical aspects of implementing observability at scale and maintaining reliability in complex microservices-based systems.