

LLMOps: OpenVino Toolkit Quantization 4int LLama 3.2 3B and Inference on CPU

The Machine Learning Engineer via YouTube

Overview

Learn how to convert the Llama 3.2 3B (3-billion-parameter) model to OpenVINO IR format and quantize its weights to 4-bit integer precision. Follow the conversion and quantization process step by step, then see how to run inference on a CPU with Chain-of-Thought (CoT) prompts using the optimized model. An accompanying Jupyter notebook is available for hands-on practice with the LLMOps techniques covered in this 26-minute tutorial on data science and machine learning.
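To illustrate the core idea behind the 4-bit quantization step, here is a minimal pure-Python sketch of symmetric int4 weight quantization. This is a toy version for clarity, not the tutorial's actual pipeline: OpenVINO's NNCF-based compression (e.g. via optimum-intel's `OVModelForCausalLM` with an int4 weight-quantization config) works per-channel or per-group, whereas this example uses a single scale for the whole weight vector.

```python
# Toy sketch of symmetric 4-bit (int4) weight quantization: map each float
# weight to an integer code in [-8, 7] plus one shared float scale.
# Real LLM pipelines (OpenVINO/NNCF) compute scales per channel or group.

def quantize_int4(weights):
    """Quantize a list of float weights to int4 codes plus a scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7.0 if max_abs else 1.0  # 7 = largest positive int4 value
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int4 codes."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.07, -0.21]
q, scale = quantize_int4(weights)
approx = dequantize(q, scale)
print(q)  # int4 codes, each in [-8, 7]
print(max(abs(a - b) for a, b in zip(weights, approx)))  # max quantization error
```

The rounding error per weight is at most half the scale, which is why 4-bit compression trades a small accuracy loss for a roughly 4x reduction in memory versus fp16, making CPU inference on a 3B model practical.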

Syllabus

LLMOps: OpenVino Toolkit quantization 4int LLama3.2 3B, Inference CPU #datascience #machinelearning

Taught by

The Machine Learning Engineer

