Data Mining: Choosing k for MinHashing - Spring 2023

UofU Data Science via YouTube

Overview

This 21-minute lecture explains how to choose k, the number of hash functions, for MinHashing. It introduces the statistical tools behind that choice: Probably Approximately Correct (PAC) guarantees, the Central Limit Theorem, and the Chernoff-Hoeffding inequality. It then applies them to determine how large k must be for a MinHash estimate of Jaccard similarity (JS) to fall within a chosen error of the true value with a chosen probability.
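As a rough illustration of the kind of calculation the lecture leads up to, here is a minimal sketch that assumes the standard Hoeffding bound for k independent indicators in [0, 1], each equal to 1 with probability JS(A, B): Pr[|estimate - JS| >= eps] <= 2 exp(-2 k eps^2). Solving for k gives k >= ln(2 / delta) / (2 eps^2). The function names choose_k and failure_bound, and the parameters eps and delta, are hypothetical and not taken from the lecture, which may state the bound with different constants.

```python
import math


def choose_k(eps: float, delta: float) -> int:
    """Smallest k with 2 * exp(-2 * k * eps**2) <= delta.

    Assumption (not from the lecture): the two-sided Hoeffding bound for
    the average of k independent [0, 1]-valued MinHash match indicators.
    """
    if not (0.0 < eps < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("eps and delta must both lie strictly between 0 and 1")
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


def failure_bound(k: int, eps: float) -> float:
    """Hoeffding upper bound on Pr[|estimate - JS| >= eps] with k hashes."""
    return min(1.0, 2.0 * math.exp(-2.0 * k * eps ** 2))
```

For example, choose_k(0.05, 0.01) returns 1060: under this bound, about a thousand hash functions are enough for an estimate within 0.05 of the true Jaccard similarity with probability at least 0.99. The required k grows as 1 / eps^2 but only logarithmically in 1 / delta, so tightening the error is much more expensive than tightening the confidence.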

Syllabus

Recording Start
Lecture starts
Course Materials Copyright
Announcements
Choosing k for minhashing motivation
PAC
Central Limit Theorem
Chernoff-Hoeffding Inequality
Choosing k for a good estimate of JS
Recording ends

Taught by

UofU Data Science
