Tame MTTR with Real-Time Anomaly Detection in Cloud-Native Systems
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore real-time anomaly detection techniques for cloud-native systems in this 36-minute conference talk from Apple Inc. engineers. Learn to reduce Mean Time To Resolution (MTTR) and analysis time by applying statistical and machine learning algorithms to Prometheus time series data, so that abnormal application behavior in Kubernetes clusters is flagged proactively. The talk covers metrics selection, visualizing anomalies in Grafana, and handling dynamic metrics where static thresholds fall short. It also shows how anomaly detection helps pinpoint issues, identify missing instrumentation, and keep applications healthy. Demonstrations and examples illustrate how to resolve issues faster when nodes restart or applications become unreachable.
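To make the idea of a statistical check on a dynamic metric more concrete, here is a minimal sketch of a rolling z-score detector. It is not taken from the talk and is not the speakers' method; the function name `rolling_zscore_anomalies`, the window size, the threshold, and the sample values are all assumptions chosen for illustration. In practice the input would be samples pulled from a Prometheus time series rather than a hard-coded list.

```python
from collections import deque
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices of points that deviate from the trailing window.

    Hypothetical helper for illustration: a point is flagged when it lies
    more than `threshold` standard deviations from the mean of the
    previous `window` points. Unlike a static threshold, the baseline
    moves with the metric.
    """
    history = deque(maxlen=window)  # trailing window of recent samples
    anomalies = []
    for i, v in enumerate(values):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            # A perfectly flat window (sigma == 0) is skipped in this sketch.
            if sigma > 0 and abs(v - mu) / sigma > threshold:
                anomalies.append(i)
        history.append(v)
    return anomalies


# Made-up samples (e.g. request latency in ms) with one spike at index 6.
samples = [100, 102, 98, 101, 99, 100, 180, 101, 99, 100]
print(rolling_zscore_anomalies(samples))  # prints [6]
```

A trailing window keeps the check cheap enough to run on every scrape, but a spike stays in the window for a while and inflates the baseline, so real deployments often use more robust statistics or seasonal models.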
Syllabus
Now You See Me: Tame MTTR with Real-Time Anomaly Detect... Kruthika Prasanna Simha & Raj Bhensadadia
Taught by
CNCF [Cloud Native Computing Foundation]