Overview
Explore locality-sensitive hashing (LSH) techniques for speeding up search and similarity comparisons in this EuroPython Conference talk. Learn about two ways to implement LSH in Python: a stateless approach suited to specific data types and a stateful method that adapts to diverse data distributions. Discover practical applications of LSH in image search, collaborative filtering, and data science. Gain insights into the underlying concepts, including marginal space, random lines, and dimensionality. Examine Python packages like RPForest and their performance benefits. Understand how LSH is applied to search functionality at Lyst, including integration with PostgreSQL. The talk concludes with a Q&A session that addresses specific implementation queries and real-world use cases.
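To give a feel for the "random lines" idea mentioned above, here is a minimal sketch of random-hyperplane hashing in plain Python. It is not taken from the talk and is not how RPForest is implemented; the function names (`make_hyperplanes`, `lsh_key`, `build_index`, `query`) are hypothetical and chosen only for illustration. Each vector gets one bit per random direction, depending on which side of that direction it falls, and only vectors that share the same bit pattern are compared at query time.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random directions in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(vector, hyperplanes):
    """Return one bit per hyperplane: the side of it the vector falls on."""
    return tuple(
        sum(v * h for v, h in zip(vector, plane)) >= 0.0
        for plane in hyperplanes
    )


def build_index(vectors, hyperplanes):
    """Group vector ids into buckets keyed by their hash."""
    buckets = {}
    for idx, vec in enumerate(vectors):
        buckets.setdefault(lsh_key(vec, hyperplanes), []).append(idx)
    return buckets


def query(vector, vectors, buckets, hyperplanes):
    """Rank only the candidates in the query's bucket, by dot product."""
    candidates = buckets.get(lsh_key(vector, hyperplanes), [])
    return sorted(
        candidates,
        key=lambda i: -sum(a * b for a, b in zip(vector, vectors[i])),
    )
```

Because the hyperplanes are drawn without looking at the data, this sketch is closest in spirit to the stateless approach; more bits give smaller buckets and faster queries at the cost of missing some true neighbours.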
Syllabus
Introduction
Image search
Similar images
Collaborative filtering
Data scientists
How to make it work
Marginal space
Random line
Dimensionality
Questions
How to generalize
Python packages
RPForest
Speed
How we use it
Postgres
Results
Q&A
Taught by
EuroPython Conference