Overview
Syllabus
Intro
What do we want to know about words?
A Manual Attempt: WordNet
An Answer (?): Word Embeddings!
How to Train Word Embeddings?
Distributional vs. Distributed Representations
Count-based Methods
Distributional Representations (see Goldberg 10.4.1)
• Words appear in a context
Context Window Methods
Count-based and Prediction-based Methods
GloVe (Pennington et al. 2014)
What Contexts?
Types of Evaluation
Non-linear Projection
• Non-linear projections group things that are close in high-dimensional space, e.g. SNE/t-SNE (van der Maaten and Hinton 2008) group things that give each other a high probability according to a Gaussian
t-SNE Visualization can be Misleading! (Wattenberg et al. 2016)
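The t-SNE projection described above can be sketched in a few lines with scikit-learn. This is a minimal illustration, not the lecture's own code: the vocabulary and the random 50-dimensional embeddings are stand-ins for real trained vectors.

```python
# Minimal sketch of projecting word embeddings to 2-D with t-SNE.
# NOTE: vocab and embeddings are illustrative stand-ins, not real trained vectors.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
vocab = ["cat", "dog", "fish", "run", "walk", "swim", "red", "green", "blue"]
emb = rng.normal(size=(len(vocab), 50))  # stand-in 50-dim embeddings

# perplexity must be smaller than the number of points being projected
coords = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(emb)
# coords now holds one 2-D point per word, ready for a scatter plot
```

As the Wattenberg et al. (2016) slide warns, the resulting plot is sensitive to perplexity and random seed, so cluster shapes and distances should be read cautiously.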
Intrinsic Evaluation of Embeddings (categorization from Schnabel et al. 2015)
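One common intrinsic evaluation checks whether cosine similarity between embeddings ranks word pairs the way humans would. A toy sketch, with made-up 3-dimensional vectors standing in for trained embeddings:

```python
# Hedged sketch of intrinsic evaluation by word-pair similarity ranking.
# NOTE: these embeddings are toy stand-ins, not real trained vectors.
import numpy as np

emb = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1]),
    "car": np.array([0.1, 0.9, 0.2]),
}

def cosine(u, v):
    # cosine similarity: 1.0 means same direction, 0.0 means orthogonal
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_cat_dog = cosine(emb["cat"], emb["dog"])
sim_cat_car = cosine(emb["cat"], emb["car"])
# a good embedding space should rank (cat, dog) above (cat, car)
```

Benchmarks like those surveyed by Schnabel et al. correlate such model similarities with human judgments over many pairs (e.g. via Spearman's rank correlation).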
Extrinsic Evaluation: Using Word Embeddings in Systems
How Do I Choose Embeddings?
When are Pre-trained Embeddings Useful?
Limitations of Embeddings
Sub-word Embeddings (1)
Taught by
Graham Neubig