Overview
COURSE OUTLINE: A major portion of communication today takes place through text, and organizations typically have more than 90% of their content in unstructured form. Natural Language Processing (NLP), an important branch of Artificial Intelligence, is one of the key technologies for tasks such as classification, information retrieval and extraction, and identifying important documents. Students will learn the fundamentals of NLP, its methods and techniques, and gain the skills to apply them in practical situations.
Syllabus
Operations on a Corpus.
Probability and NLP.
Machine Translation.
Statistical Properties of Words - Part 01.
Statistical Properties of Words - Part 02.
Statistical Properties of Words - Part 03.
Vector Space Models for NLP.
Document Similarity - Demo, Inverted index, Exercise.
Contextual understanding of text.
Collocations, Dense word Vectors.
Query Processing.
Topic Modeling.
Introduction.
Sequence Learning.
Vector Representation of words.
Co-occurrence matrix, n-grams.
SVD, Dimensionality reduction, Demo.
Vector Space models.
Preprocessing.
Introduction to Probability in the context of NLP.
Joint and conditional probabilities, independence with examples.
The definition of a probabilistic language model.
Chain rule and Markov assumption.
Out of vocabulary words and curse of dimensionality.
Exercise.
Examples for word prediction.
Generative Models.
Bigram and Trigram Language models - peeking inside the model building.
Naive-Bayes, classification.
Machine learning, perceptron, linearly separable.
Linear Models for Classification.
Biological Neural Network.
Perceptron.
Perceptron Learning.
Logical XOR.
Activation Functions.
Gradient Descent.
Feedforward and Backpropagation Neural Network.
Why Word2Vec?
What are CBOW and Skip-Gram Models?
One word learning architecture.
Forward pass for Word2Vec.
Matrix Operations Explained.
CBOW and Skip Gram Models.
Binary tree, Hierarchical softmax.
Updating the weights using hierarchical softmax.
Sequence Learning and its applications.
ANN as an LM and its limitations.
Discussion on the results obtained from word2vec.
Recap and Introduction.
Mapping the output layer to Softmax.
Reduction of complexity - sub-sampling, negative sampling.
Building Skip-gram model using Python.
Introduction to Recurrent Neural Networks.
Unrolled RNN.
RNN-Based Language Model.
BPTT - Forward Pass.
BPTT - Derivatives for W, V and U.
BPTT - Exploding and vanishing gradient.
LSTM.
Truncated BPTT.
GRU.
Introduction and Historical Approaches to Machine Translation.
What is SMT?
Noisy Channel Model, Bayes Rule, Language Model.
Translation Model, Alignment Variables.
Alignments again!
IBM Model 1.
IBM Model 2.
Introduction to evaluation of Machine Translation.
BLEU - A short discussion of the seminal paper.
BLEU Demo using NLTK and other metrics.
Extraction of Phrases.
Introduction to Phrase-based translation.
Symmetrization of alignments.
Learning/estimating the phrase probabilities using another Symmetrization example.
Encoder-Decoder Model for Neural Machine Translation.
RNN-Based Machine Translation.
Recap and Connecting Bloom's Taxonomy with Machine Learning.
Introduction to Attention-Based Translation.
Neural machine translation by jointly learning to align and translate.
Typical NMT architecture and models for multi-language translation.
Beam Search.
Variants of Gradient Descent.
Introduction to Conversation Modeling.
A few examples in Conversation Modeling.
Some ideas to implement IR-based Conversation Modeling.
Discussion of some ideas in Question Answering.
Hyperspace Analogue to Language - HAL.
Correlated Occurrence Analogue to Lexical Semantics - COALS.
Global Vectors - GloVe.
Evaluation of Word Vectors.
Taught by
NPTEL-NOC IITM