Converting a T5 Large Model to ONNX Format and Quantizing to 8-bit with MLflow and Optimum
The Machine Learning Engineer via YouTube
Overview
Learn how to convert a T5 Large model to ONNX format and perform INT8 (8-bit) quantization using the Hugging Face Optimum library in this 48-minute technical video. Follow along with a practical demonstration of converting a fine-tuned text summarization model, with all steps tracked in MLflow. Access the complete implementation through the provided Jupyter notebook to master model optimization techniques for improved deployment efficiency. Gain hands-on experience with MLOps practices while exploring the intersection of model conversion, quantization, and experiment tracking.
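The workflow described above can be sketched in a few lines of Python. This is a minimal, hedged illustration, not the video's exact notebook: it assumes `optimum[onnxruntime]` and `mlflow` are installed, and the model identifier (`t5-large`), directory names, and MLflow parameter names are placeholders for whatever the fine-tuned summarization model actually uses.

```python
# Sketch of the ONNX export + dynamic INT8 quantization flow, tracked in MLflow.
# Assumes `pip install optimum[onnxruntime] mlflow`. Model id and paths are
# illustrative, not the ones used in the video.

# A T5 seq2seq export produces one ONNX graph per component; the exact file
# set can vary by Optimum version (e.g. decoder_with_past_model.onnx).
ONNX_COMPONENTS = ["encoder_model.onnx", "decoder_model.onnx"]


def convert_and_quantize(model_id: str, onnx_dir: str, quant_dir: str) -> None:
    import mlflow
    from optimum.onnxruntime import ORTModelForSeq2SeqLM, ORTQuantizer
    from optimum.onnxruntime.configuration import AutoQuantizationConfig

    with mlflow.start_run():
        mlflow.log_param("model_id", model_id)

        # 1. Export the (possibly fine-tuned) T5 model to ONNX.
        model = ORTModelForSeq2SeqLM.from_pretrained(model_id, export=True)
        model.save_pretrained(onnx_dir)

        # 2. Apply dynamic INT8 quantization to each exported graph.
        qconfig = AutoQuantizationConfig.avx512_vnni(
            is_static=False, per_channel=False
        )
        for component in ONNX_COMPONENTS:
            quantizer = ORTQuantizer.from_pretrained(onnx_dir, file_name=component)
            quantizer.quantize(save_dir=quant_dir, quantization_config=qconfig)

        # 3. Log the quantized artifacts to the MLflow run.
        mlflow.log_artifacts(quant_dir, artifact_path="onnx_int8")
```

A call such as `convert_and_quantize("t5-large", "t5_onnx", "t5_onnx_int8")` would then leave both the export and the quantized model recorded under a single MLflow run.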
Syllabus
MLOps MLFlow: Convert to ONNX and quantize to 8Int with Optimum #datascience #machinelearning
Taught by
The Machine Learning Engineer