Neural Nets for NLP 2019: Sentence and Contextualized Word Representations
Graham Neubig via YouTube
Syllabus
Intro
Goal for Today
Where would we need/use Sentence Representations?
Sentence Classification
Paraphrase Identification (Dolan and Brockett 2005): identify whether sentences A and B mean the same thing
Textual Entailment (Dagan et al. 2006, Marelli et al. 2014)
Model for Sentence Pair Processing
Types of Learning
Plethora of Tasks in NLP
Rule of Thumb 2
Standard Multi-task Learning
Thinking about Multi-tasking, and Pre-trained Representations
General Model Overview
Language Model Transfer
End-to-end vs. Pre-training
Context Prediction Transfer (Skip-thought Vectors) (Kiros et al. 2015)
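The skip-thought objective named above trains a sentence encoder to predict the neighboring sentences in a document. A minimal sketch of how such context-prediction training pairs are built (the encoder/decoder models themselves are omitted; function names are illustrative, not from the paper):

```python
def context_pairs(sentences):
    """Build skip-thought-style training pairs: each sentence is paired
    with its previous and next sentence as prediction targets."""
    pairs = []
    for i, s in enumerate(sentences):
        if i > 0:
            pairs.append((s, sentences[i - 1]))  # predict previous sentence
        if i < len(sentences) - 1:
            pairs.append((s, sentences[i + 1]))  # predict next sentence
    return pairs

doc = ["I got up.", "I ate breakfast.", "I left for work."]
pairs = context_pairs(doc)
# middle sentence appears as source twice (once per neighbor)
```

In the actual model, an RNN encodes the source sentence into a vector and two decoders are trained to generate the neighbors from it; the pairs above are just the supervision signal.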
Paraphrase ID Transfer (Wieting et al. 2015)
Large Scale Paraphrase Data (ParaNMT-50M) (Wieting and Gimpel 2018)
Entailment Transfer (InferSent) (Conneau et al. 2017)
Bi-directional Language Modeling Objective (ELMo) (Peters et al. 2018)
Masked Word Prediction (BERT) (Devlin et al. 2018)
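The masked word prediction objective listed above hides a fraction of the input tokens and trains the model to recover them. A toy sketch of the masking step only (the 15% rate matches BERT, but the function and the fixed seed are illustrative; BERT additionally replaces some selected tokens with random words or leaves them unchanged):

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=1):
    """Replace roughly mask_prob of the tokens with [MASK] and record
    the originals as prediction targets (simplified BERT-style masking)."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok  # position -> original word the model must predict
            masked.append(MASK)
        else:
            masked.append(tok)
    return masked, targets

sent = "the cat sat on the mat".split()
masked, targets = mask_tokens(sent)
```

The model then receives `masked` as input and is trained to predict the words stored in `targets`, conditioning on context from both directions at once.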