When the Magic Wears Off - Flaws in ML for Security Evaluations and What to Do About It
Security BSides London via YouTube
Overview
Explore the flaws in machine learning for security evaluations and learn practical remedies in this conference talk. Delve into the endemic problem of inflated results caused by spatial and temporal bias in academic research on machine-learning-based malware classification. Discover a set of space and time constraints for experiment design, along with a new metric that summarizes classifier performance over time. Examine TESSERACT, an open-source evaluation framework that enables fair comparison of malware classifiers in realistic settings. See how experimental bias distorts reported results and how significant improvements can be achieved by tuning the training ratio. Topics covered include cross-validation, temporally inconsistent datasets, time decay, bias from imbalanced testing, and evaluation constraints.
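As a rough illustration of the temporal constraints discussed in the talk, the sketch below trains only on samples dated before the test period, evaluates on successive later time windows while keeping each window's observed class ratio, and averages the per-window F1 scores as one possible summary of performance over time. The variable names, the choice of classifier, and the averaging step are assumptions made for illustration; this is not the speaker's code or the TESSERACT API.

```python
# Minimal sketch (illustrative, not the talk's implementation) of a
# temporally consistent evaluation: no training samples from the "future",
# test windows kept at their natural malware/goodware ratio, and a simple
# time-aware summary of classifier performance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score


def evaluate_over_time(X, y, timestamps, train_end, window_starts, window_ends):
    """Train on data observed before `train_end`, then test on later windows."""
    order = np.argsort(timestamps)
    X, y, timestamps = X[order], y[order], timestamps[order]

    # Temporal constraint: training data must precede all test windows.
    train_mask = timestamps < train_end
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train_mask], y[train_mask])

    per_window_f1 = []
    for start, end in zip(window_starts, window_ends):
        test_mask = (timestamps >= start) & (timestamps < end)
        if not test_mask.any():
            continue
        # Spatial constraint: the test window keeps its observed class ratio,
        # rather than being artificially rebalanced.
        preds = clf.predict(X[test_mask])
        per_window_f1.append(f1_score(y[test_mask], preds))

    # Averaging per-window F1 is only a stand-in for the talk's metric;
    # it exposes time decay that a random (k-fold) split would hide.
    return per_window_f1, float(np.mean(per_window_f1))
```

Comparing this per-window curve against a conventional cross-validation score on the same data is one way to see how much temporally inconsistent splits inflate reported performance.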
Syllabus
Intro
ML for Security
The magic of cross-validation
The curse of cross-validation
Temporally inconsistent datasets
Time Decay
Bias From Imbalanced Testing
Tuning the Training Ratio
Evaluation Constraints
Discussion (2/2)
Conclusion
Taught by
Security BSides London