Overview
Syllabus
Overview and Importance of Data Quality for Machine Learning Tasks
Acknowledgements
Data Preparation in Machine Learning
Challenges with Data Preparation
Data Quality Analysis can help...
Different personas in an enterprise setting...
To put it all together
To summarize
Data Quality Metrics
Common Data Cleaning Techniques
Is data cleaning always helpful for an ML pipeline?
Insights: Impact of different cleaning techniques
In conclusion
Why does it happen?
Why Is Imbalanced Classification Hard?
Evaluation Metrics for Imbalanced Datasets: Accuracy Paradox
Factors affecting class imbalance
Affecting Factor: Imbalance Ratio
Affecting Factor: Overlap
Affecting Factor: Smaller sub-concepts
Affecting Factor: Dataset Size
Affecting Factor: Combined Effect
Modelling Strategies: Types
Resampling Techniques
Bayes Impact index
Taught by
Association for Computing Machinery (ACM)