CMU Multilingual NLP 2020 - Data Augmentation for Machine Translation

Graham Neubig via YouTube

Overview

Explore data augmentation techniques for machine translation in this 25-minute lecture from CMU's Multilingual Natural Language Processing course. The lecture covers methods that draw on monolingual data and on high-resource languages (HRLs), including back-translation, iterative back-translation, multilingual training, and pivoting. It also covers English-HRL augmentation, monolingual data copying, and dictionary-based techniques, with an aside on word alignment and a look at word-by-word augmentation with reordering. The focus throughout is on the data challenges of low-resource machine translation and on practical ways to improve translation quality when parallel data is scarce.
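
As a rough illustration of the back-translation idea named above, the Python sketch below pairs real target-language sentences with synthetic source sentences produced by a target-to-source translator. It is a minimal sketch and is not taken from the lecture: the function names (`back_translate`, `augment`, `toy_reverse_translate`) and the tiny English-to-French lexicon are hypothetical stand-ins, and a real system would use a trained reverse translation model in place of the word lookup.

```python
from typing import Callable, List, Tuple

SentencePair = Tuple[str, str]  # (source sentence, target sentence)


def back_translate(
    target_monolingual: List[str],
    reverse_translate: Callable[[str], str],
) -> List[SentencePair]:
    """Pair each real target sentence with a synthetic source sentence."""
    return [(reverse_translate(tgt), tgt) for tgt in target_monolingual]


def augment(
    parallel: List[SentencePair],
    target_monolingual: List[str],
    reverse_translate: Callable[[str], str],
) -> List[SentencePair]:
    """Return the real parallel data followed by the synthetic pairs."""
    return list(parallel) + back_translate(target_monolingual, reverse_translate)


# Hypothetical toy lexicon standing in for a trained target-to-source model.
TOY_LEXICON = {"the": "le", "cat": "chat", "sleeps": "dort"}


def toy_reverse_translate(sentence: str) -> str:
    """Word-by-word lookup; unknown words are copied through unchanged."""
    return " ".join(TOY_LEXICON.get(w, w) for w in sentence.lower().split())


real_parallel = [("le chien dort", "the dog sleeps")]
augmented = augment(real_parallel, ["The cat sleeps"], toy_reverse_translate)

for src, tgt in augmented:
    print(f"{src}\t{tgt}")
```

The synthetic side is the source, so the model being trained still sees genuine target-language text as its output. Passing the translator in as a callable keeps the sketch independent of any particular translation library.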

Syllabus

Intro
Data Challenges in Low-resource MT
Multilingual Training Approaches
Data Augmentation 101: Back Translation
Back Translation Idea
How to Generate Translations
Iterative Back-translation
Back Translation Issues
English - HRL Augmentation
Augmentation via Pivoting
Data w/ Various Types of Pivoting
Monolingual Data Copying
Dictionary-based Augmentation
An Aside: Word Alignment
Word-by-word Data Augmentation
Word-by-word Augmentation w/ Reordering

Taught by

Graham Neubig
