Nearest Neighbor Normalization for Improving Multimodal Retrieval

Discover AI via YouTube

Overview

Learn about Nearest Neighbor Normalization (NNN), a post-training optimization technique for multimodal AI models, in this 16-minute presentation of research from Stanford University and MIT. Explore how NNN addresses the hubness problem in high-dimensional embedding spaces, where a few candidates are retrieved for far too many queries, by calculating a bias score for each retrieval candidate and adjusting its retrieval score accordingly. Understand the technical implementation details of this efficient, training-free method that operates in sublinear time using k-nearest neighbors. Discover how NNN improves retrieval accuracy across multiple datasets and models like CLIP, BLIP, and ALBEF, while also reducing gender bias in retrieval tasks. Follow along with practical examples, detailed method explanations, and experimental results that compare NNN favorably with previous normalization approaches like DBNorm and QBNorm. Gain insight into future research directions and access the implementation through the provided GitHub repository.
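To make the bias-adjustment idea concrete, here is a minimal sketch in plain Python. It assumes that a candidate's bias is the mean of its k highest similarities to a set of reference queries, scaled by a weight alpha, and that this bias is subtracted from the raw retrieval score. The function names, the toy 2-D embeddings, and the values of k and alpha are all hypothetical, chosen only for illustration; this is not the code from the GitHub repository, and the brute-force sort stands in for the vector index a real system would use for fast neighbor lookup.

```python
def dot(u, v):
    """Inner-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def nnn_bias(candidates, reference_queries, k=2, alpha=0.5):
    """Assumed bias rule: alpha times the mean of each candidate's
    k highest similarities to the reference queries."""
    biases = []
    for cand in candidates:
        # Brute-force neighbor search; a vector index would replace this at scale.
        sims = sorted((dot(cand, q) for q in reference_queries), reverse=True)
        top_k = sims[:k]
        biases.append(alpha * sum(top_k) / len(top_k))
    return biases


def retrieve(query, candidates, biases=None):
    """Return the index of the best-scoring candidate, optionally bias-corrected."""
    scores = [dot(query, cand) for cand in candidates]
    if biases is not None:
        scores = [s - b for s, b in zip(scores, biases)]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data (hypothetical): candidate 2 is a "hub" that scores highly for most queries.
candidates = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.9]]
reference_queries = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [0.6, 0.8]]
query = [1.0, 0.2]  # closest in direction to candidate 0

# Biases are computed once, offline, and reused for every incoming query.
biases = nnn_bias(candidates, reference_queries, k=2, alpha=0.5)

print(retrieve(query, candidates))          # 2: the hub wins on raw scores
print(retrieve(query, candidates, biases))  # 0: the correction demotes the hub
```

Because the biases depend only on the candidates and the reference queries, they can be precomputed, so the per-query cost is a single subtraction per candidate.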

Syllabus

Nearest Neighbor Normalization for MMM
BLIP, ALBEF, SigLIP, CLIP
New solutions to old problems
Simple example
NNN method explained
Reference database Multimodal
Results
Future Research
GitHub repo code
Deep dive in Hubness

Taught by

Discover AI