The NLP Task Effectiveness of Long-Range Transformers

Center for Language & Speech Processing (CLSP), JHU via YouTube

Overview

Explore the effectiveness of long-range Transformer variants in natural language processing tasks through this insightful conference talk from EACL 2023. Delve into a comprehensive study comparing seven Transformer model variants across five challenging NLP tasks and seven datasets. Gain valuable insights into the advantages and previously unrecognized drawbacks of modified attention mechanisms in long-range Transformers. Discover how these models perform in content selection and query-guided decoding, while also learning about their limitations in attending to distant tokens and accumulating approximation errors. Understand the impact of pretraining and hyperparameter settings on model performance, and explore various methods for investigating attention behaviors beyond traditional metric scores. Enhance your knowledge of state-of-the-art NLP models and their real-world applications in this 12-minute presentation by researchers from the Center for Language & Speech Processing (CLSP) at Johns Hopkins University.
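
The modified attention mechanisms discussed in the talk typically restrict which token pairs may interact so that attention scales to long inputs. As a rough illustration only (not code from the talk or the underlying paper), the following minimal NumPy sketch shows one common long-range pattern, sliding-window (local) attention, where each token attends only to tokens within a fixed distance; the function names and window size are illustrative assumptions.

import numpy as np

def sliding_window_mask(seq_len, window):
    # True where position i may attend to position j, i.e. |i - j| <= window.
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= window

def local_attention(q, k, v, window):
    # Scaled dot-product attention restricted to a local window.
    # q, k, v: (seq_len, d) arrays. Out-of-window scores are set to
    # -inf before the softmax, so each output mixes only nearby tokens.
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)  # (seq_len, seq_len) score matrix
    scores = np.where(sliding_window_mask(seq_len, window), scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Usage example: 8 tokens, 4-dim vectors, each attending 2 tokens either side.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
out = local_attention(x, x, x, window=2)
print(out.shape)  # (8, 4)

Note that this sketch still materializes the full seq_len-by-seq_len score matrix for clarity; efficient long-range implementations exploit the banded sparsity so that cost grows with seq_len times window rather than seq_len squared, which is what makes long inputs tractable. It also illustrates the trade-off the talk highlights: a token cannot directly attend to anything beyond its window.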

Syllabus

The NLP Task Effectiveness of Long-Range Transformers - EACL 2023

Taught by

Center for Language & Speech Processing (CLSP), JHU

