Explore the intersection of Site Reliability Engineering (SRE) and Machine Learning (ML) in this 44-minute conference talk from SREcon22 EMEA. Learn why ML matters for SREs, what makes ML reliability challenging, and how the SRE profession needs to adapt. Take a critical look at the current state of ML automation in production environments. Topics include managing ML in production, how hard ML is to implement, and the difference between hype and reality in the field. The talk also covers ML Ops, model quality, and the data-sensitive nature of ML, and concludes with predictions for the future and recommended further reading.
Overview
Syllabus
Introduction
Lambda
Dave
TMU
ML does matter
Pause and breathe
Managing ML in production
How hard is ML
Hype and Reality
Gartner Hype Cycle
ML vs AI
ML Ops
Model Quality
ML is data sensitive
I can ML
The future
Future predictions
Future reading
Taught by
USENIX