Syllabus
Machine Learning for Relevance and
The Discovery Problem
How we do it
People who use the term 'secret sauce' know about neither
Talk Outline
Our Topic Classifier
Interest Scores
Saved Session Data
You are all unique snowflakes
How do we know if we've done a good job?
Formula for a Machine Learning Problem
Collect Data: save session information (described earlier)
Featurize Data
Training the model (numeric optimization)
Classify New Data
Original formulation: weight vector is universal, features are user-specific
The Development Cycle
Data bugs (are the worst)
Presentation bias
Ranking vs Normal ML
Statistical Bleeding
Simpson's Paradox
Summary: we collect information about our users' preferences and dispreferences
Prismatic Backend Team
Taught by
Strange Loop Conference