Unsupervised Machine Learning for Scaling Data Quality Monitoring in Databricks
Databricks via YouTube
Syllabus
Intro
Data Quality in the Modern Data Stack
Three Approaches to Data Quality Monitoring
Ticket Sales Data
Set Up Monitoring in Anomalo
Anomalo Monitoring
Chaos Library
Check Log
Visualizations: Severity & Explanation
Visualizations: Distribution
Visualizations: Root Cause Analysis
Encode Features Automatically
Build a Supervised Model
Generate Visualizations Using SHAP Values
Challenges
Testing
Get Started in Databricks
DATA+AI SUMMIT 2022
Taught by
Databricks