Overview
This recorded lecture from the University of Utah's Data Science program covers distance metrics and their applications in data mining. It begins by revisiting Prob[candidate]-similarity curves and (d1, d2, p1, p2)-sensitive families, then motivates distances and gives the formal definition of a metric. The lecture works through Jaccard, Euclidean, Manhattan, Lp, L∞, Mahalanobis, cosine, and angular distances, covering the fact that Jaccard distance is a metric, an exercise on Lp distances, their unit balls, and guidance on how not to use Lp distances. It closes with the connection between angular distance and Locality-Sensitive Hashing (LSH).
Syllabus
Recording starts
Lecture starts
Announcements
Revisit: Prob[candidate]-similarity curves
Revisit: (d1, d2, p1, p2)-sensitive family
Distances motivation
Distance/metric definition
Jaccard distance is a metric
Euclidean distance
Manhattan distance
Lp distances
L∞ distance
Lp distances exercise
Lp distances & unit balls
How to not use Lp distances
Mahalanobis distance
Cosine distance
Angular distance
Angular distance LSH
Lecture ends
Taught by
UofU Data Science