Overview
Learn how to use cluster analysis, association rules, and anomaly detection algorithms for unsupervised learning.
Syllabus
Introduction
- Clustering and association
- What you should know
- Using the exercise files
- What is unsupervised machine learning?
- Looking at the data with a 2D scatter plot
- Understanding hierarchical cluster analysis
- Running hierarchical cluster analysis
- Interpreting a dendrogram
- Methods for measuring distance
- What is k-nearest neighbors?
- How does k-means work?
- Which variables should be used with k-means?
- Interpreting a box plot
- Running a k-means cluster analysis
- Interpreting cluster analysis output
- What does silhouette mean?
- Which cases should be used with k-means?
- Finding optimum value for k: k = 3
- Finding optimum value for k: k = 4
- Finding optimum value for k: k = 5
- What's the best solution?
- Summarizing cluster means in a table
- Traffic Light feature in Excel
- Line graphs
- How does HDBSCAN work?
- An HDBSCAN example
- Relating clusters to categories statistically
- Relating clusters to categories visually
- Running a multiple correspondence analysis
- Interpreting a perceptual map
- Using cluster analysis and decision trees together
- A BIRCH/two-step example
- A self-organizing map example
- The k = 1 trick
- Anomaly detection algorithms
- Using SOM for anomaly detection
- One-class SVM
- Intro to association rules and sequence analysis
- Running association rules
- Some association rules terminology
- Interpreting association rules
- Putting association rules to use
- Comparing clustering and association rules
- Sequence detection
- Next steps
Taught by
Keith McCormick