Overview
Syllabus
Introduction
Overview
Word Segmentation
The apostrophe
What is a word
Tokenization
European Languages
Slides
Problem with tokenization
Rule-based tokenization
Sentence boundary
Subword analysis
What is morphology
Rule-based systems
Language typology
Isolating languages
Agglutinative languages
Turkish
English
Other European Languages
Indo-European Languages
Germanic Languages
Chinese
Historical Linguistics
Patterns of Languages
Reduplication
Type-token curves
Recognizing words of a language
Spelling rules
Finite State Automata
Adjectives
Morphology in English
Finite State Transducer
E-insertion
FST
Taught by
Graham Neubig