Overview
Explore anomaly detection in real-time systems through this conference talk. Learn how Allegro developed a simple yet effective statistical model for detecting anomalies in web traffic, search events, and ad clicks, and trace the journey from initial experiments in R to a final Scala implementation. Gain insights into machine learning, statistics, and real-time processing techniques. Understand the challenges of deploying services to production and the importance of proactive error detection. Follow the speaker's process of testing existing solutions, including Twitter's anomaly detector and HTM algorithms, before creating a custom model. Delve into topics such as simple counts, outliers, EEMA, and soft modeling. Examine the pros and cons of the approach, the handling of aggregated data, and the Druid architecture. Conclude with a demonstration of the implemented solution.
Syllabus
Intro
Who am I
Why are we doing this
What was our motivation
What is an anomaly
Simple counts
First look
The best algorithm
A simple model
First attempt in learning
Outliers
A sad conclusion
Simple input
Scala model
EEMA
What might go wrong
The algorithm
The last problem
The probability
Long lasting anomaly
Soft model
Thank you
Pros and cons
Aggregated data
TopN queries
Druid architecture
Demo
Taught by
Devoxx