All You Need to Know on Multilingual Sentence Vectors - 1 Model, 50+ Languages
James Briggs via YouTube
Overview
This 40-minute video tutorial covers multilingual sentence vectors: language-agnostic vector representations of text that allow sentences in different languages to be compared directly. It outlines the challenges of training multilingual models and the approaches used to address them, including multi-task training (as in mUSE) and multilingual knowledge distillation. A visual walkthrough follows the whole process, from parallel data preparation to model evaluation: choosing and initializing a student model, working with ParallelSentencesDataset, and fine-tuning with an appropriate loss function. The tutorial also shows how to develop your own multilingual sentence transformers and how to use high-performing pretrained models for applications such as semantic search and topic modeling across multiple languages.
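As a rough illustration of the knowledge distillation idea named above, the sketch below assumes the commonly described setup: a student model is trained so that its vectors for both a source sentence and its translation match a teacher model's vector for the source sentence, with mean squared error as the loss. The function names and toy vectors are hypothetical and chosen for illustration only. This is not the tutorial's own code.

```python
# Minimal sketch of a multilingual knowledge distillation objective.
# All vectors below are made-up toy values, not real model outputs.

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(teacher_src, student_src, student_tgt):
    """Pull both student vectors toward the teacher's source-sentence vector."""
    return mse(teacher_src, student_src) + mse(teacher_src, student_tgt)


# Hypothetical embeddings for one parallel pair (source sentence, translation).
teacher_src = [0.5, -0.25, 1.0]
student_src = [0.5, -0.25, 1.0]  # student already matches the teacher on the source
student_tgt = [0.0, -0.25, 1.0]  # translation vector is still off in one dimension

loss = distillation_loss(teacher_src, student_src, student_tgt)
print(f"distillation loss: {loss:.4f}")
```

The loss reaches zero only when the student places the source sentence and its translation at the same point as the teacher's source vector, which is what makes the resulting vectors comparable across languages.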
Syllabus
Intro
Multilingual Vectors
Multi-task Training (mUSE)
Multilingual Knowledge Distillation
Knowledge Distillation Training
Visual Walkthrough
Parallel Data Prep
Choosing a Student Model
Initializing the Models
ParallelSentencesDataset
Loss and Fine-tuning
Model Evaluation
Outro
Taught by
James Briggs