Training High Quality Embedding Models for Production Vector Search

OpenSource Connections via YouTube

Overview

Learn essential techniques for training high-quality embedding models for production vector search in this 43-minute conference talk from Haystack EU 2024. Discover practical strategies for optimizing model training using historical search-result interactions and business objectives. Explore critical aspects of data quality management, including query-document relationships, duplicate handling, and query coverage. Master methods for leveraging existing search results to keep behavior consistent and to regularize the model. Gain insights into adapting recommendation system strategies, including bias terms, linear re-ranking, and query-result interaction matrices. Examine key considerations in loss functions, base model selection for fine-tuning, and hyperparameter optimization. Understand production-focused training approaches, including vector database optimization, vector fusing, and binary- and truncation-aware training. Learn techniques for updating models efficiently without re-indexing, and strategies for moving from offline evaluation to online A/B testing with novelty-based splits. Presented by Robertson Taylor, a solutions engineer at Marqo specializing in high-volume data solutions, fine-tuned embedding models, and scalable retrieval for product search, content classification, and real-time recommendations.

Syllabus

Haystack EU 2024 - Robertson Taylor: Train high quality embedding models for production vector search

Taught by

OpenSource Connections
