Explore word embedding techniques using Star Trek Deep Space Nine scripts in this 43-minute conference talk from Devoxx. Dive into common strategies like Bag of Words, Word2Vec, fastText, GloVe, and BERT. Learn how to apply these techniques in production environments, understand their differences, and see practical examples. Discover the journey from data analysis to machine learning, gaining insights from speakers with extensive experience in solving real-world data challenges. Examine the intersection of linguistic theory, cosine similarity, and locality sensitive hashing. Investigate transformer models, pretrained models, and the engineering aspects of implementing these techniques. Gain knowledge on data governance, training pipelines, model serving, and performance measurement. Access accompanying examples on GitHub to enhance your understanding of word embeddings and their applications in machine learning.
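As a taste of two techniques the talk covers, here is a minimal sketch of a Bag of Words representation compared with cosine similarity, in plain Python. The example sentences are illustrative stand-ins, not lines from the talk's actual DS9 dataset, and the helper names are hypothetical:

```python
import math
from collections import Counter

def bag_of_words(text, vocab):
    """Represent text as raw term counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]

def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy corpus standing in for script lines (not from the real dataset).
corpus = [
    "the station is under attack",
    "the station is quiet tonight",
    "profit is the goal of every deal",
]
vocab = sorted({w for line in corpus for w in line.lower().split()})
vectors = [bag_of_words(line, vocab) for line in corpus]

# The first two lines share more words, so they score higher.
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

Bag of Words ignores word order and meaning, which is exactly the limitation that Word2Vec, fastText, GloVe, and BERT address with learned embeddings.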
Syllabus
Intro
Bag of Words
Linguistic Theory
Cosine Similarity
Locality Sensitive Hashing
Transformer Models
Pretrained Models
Transformers
Engineering
Data Governance
Training Pipeline
Serving Your Model
Measuring Your Model
Conclusion
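The syllabus pairs cosine similarity with locality sensitive hashing, which buckets vectors so that near-duplicates can be found without comparing every pair. A minimal random-hyperplane LSH sketch (names, vectors, and parameters are illustrative, not from the talk):

```python
import random

def random_hyperplanes(dim, n_planes, seed=42):
    """Each hyperplane is a random normal vector; sign patterns hash the space."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]

def lsh_signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(int(sum(p * x for p, x in zip(plane, vec)) >= 0)
                 for plane in planes)

def hamming(s, t):
    """Number of differing signature bits; lower suggests higher similarity."""
    return sum(x != y for x, y in zip(s, t))

planes = random_hyperplanes(dim=4, n_planes=8)
a = [1.0, 0.9, 0.0, 0.1]    # points in a similar direction to b
b = [0.9, 1.0, 0.1, 0.0]
c = [-1.0, 0.2, 0.9, -0.5]  # points in a very different direction

sig_a, sig_b, sig_c = (lsh_signature(v, planes) for v in (a, b, c))

# Vectors with high cosine similarity tend to agree on more hyperplane bits,
# so candidate neighbors can be filtered by signature before exact scoring.
print(hamming(sig_a, sig_b), hamming(sig_a, sig_c))
```

In practice the signatures act as hash-table keys, so a query only gets scored exactly against vectors landing in the same bucket.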
Taught by
Devoxx