
YouTube

Running Llama Models with PyTorch and KleidiAI on Arm Servers - A Step-by-Step Tutorial

Arm Software Developers via YouTube

Overview

Learn to deploy and run the Llama 3.1 and 3.2 language models in a hands-on tutorial that demonstrates implementation on Arm-based AWS instances. Master the process of requesting Meta Llama model access, setting up an Arm-based AWS EC2 instance, and installing the necessary software components. Explore model quantization techniques for Llama 3.1, verify that the model works, and implement a web interface using a torchchat backend with a Streamlit frontend. Discover how to adapt the implementation for Llama 3.2, with all steps accelerated by KleidiAI optimizations. Follow comprehensive instructions for creating an interactive chatbot experience, complete with practical demonstrations of PyTorch integration and web-based user interface deployment.
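The overview above mentions downloading and quantizing Llama 3.1 through torchchat. A minimal sketch of those steps on an Arm-based instance might look like the following; the `llama3.1` model alias and the inline `--quantize` JSON follow torchchat's CLI conventions, but treat the exact flags, group size, and aliases as assumptions rather than the tutorial's exact commands:

```shell
# Sketch only: assumes an Arm-based EC2 instance with Python 3 and git,
# and that your Hugging Face account has been granted Meta Llama access.

# Fetch torchchat, the CLI used throughout the tutorial
git clone https://github.com/pytorch/torchchat.git
cd torchchat
pip install -r requirements.txt

# Authenticate so torchchat can download the gated Llama weights
huggingface-cli login

# Download the Llama 3.1 weights (model alias assumed)
python3 torchchat.py download llama3.1

# Run generation with 4-bit weight quantization; on Arm CPUs,
# KleidiAI-optimized kernels in recent PyTorch builds accelerate this path
python3 torchchat.py generate llama3.1 \
  --quantize '{"linear:int4": {"groupsize": 32}}' \
  --prompt "What is KleidiAI?"
```

Quantizing to int4 is what lets the KleidiAI micro-kernels kick in on Arm servers; the same idea applies to the Llama 3.2 models later in the tutorial.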

Syllabus

Intro
Request access to the Meta Llama models
Create the Arm-based AWS EC2 instance
Update and install the required software
Download and quantize the Llama 3.1 model
Test that the model is working
Run the Torchchat backend and Streamlit frontend
Modify the Learning Path to run Llama 3.2
Outro
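The syllabus steps above can be sketched end-to-end. The `server` mode mirrors torchchat's documented OpenAI-compatible REST backend; the Streamlit invocation and frontend script name are assumptions based on how the learning path wires the chatbot UI, and the Llama 3.2 alias is likewise assumed:

```shell
# Sketch, assuming torchchat is installed and llama3.1 has been downloaded.

# Test that the model is working with a quick prompt
python3 torchchat.py generate llama3.1 --prompt "Hello, are you working?"

# Run the torchchat backend (an OpenAI-compatible REST server on localhost)
python3 torchchat.py server llama3.1 &

# Launch the Streamlit chat frontend against the local backend
# (browser/browser.py is a placeholder name for the frontend script)
streamlit run browser/browser.py

# To modify the flow for Llama 3.2, swap the model alias (alias assumed)
python3 torchchat.py download llama3.2-3b
python3 torchchat.py generate llama3.2-3b --prompt "Hello from Llama 3.2"
```

Keeping the backend and frontend as separate processes is what makes the Llama 3.2 switch a one-line change: only the model alias passed to torchchat differs.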

Taught by

Arm Software Developers

