Accelerating Transformers with Hugging Face Optimum and Infinity
MLOps World: Machine Learning in Production via YouTube
Overview
Explore how Hugging Face speeds up predictions from Transformer models in this talk from MLOps World: Machine Learning in Production, which focuses on two tools: Optimum and Infinity. Learn about Optimum, an open-source library for optimizing Transformer training and deployment on specific hardware, and Infinity, a containerized solution that achieves millisecond-scale latencies in production environments. Lewis Tunstall and Philipp Schmid, Machine Learning Engineers at Hugging Face, discuss strategies for balancing model accuracy against performance requirements in real-world NLP applications. See how these tools address the model size and speed constraints that make state-of-the-art NLP models difficult to deploy in business settings.
Syllabus
Accelerating Transformers with Hugging Face Optimum and Infinity
Taught by
MLOps World: Machine Learning in Production