Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
This course introduces you to the Transformer architecture and the Bidirectional Encoder Representations from Transformers (BERT) model. You learn about the main components of the Transformer architecture, such as the self-attention mechanism, and how it is used to build the BERT model. You also learn about the different tasks that BERT can be used for, such as text classification, question answering, and natural language inference.This course is estimated to take approximately 45 minutes to complete.