
YouTube

Optimizing Large Language Model Inference for Arm CPUs

EDGE AI FOUNDATION via YouTube

Overview

Watch a technical talk exploring optimization strategies for running Large Language Models (LLMs) on Arm CPUs, presented by Dibakar Gope, Principal Engineer in Arm's Machine Learning & AI division. Learn about techniques for accelerating LLM inference on commodity Arm processors, focusing on matrix multiplication at low numerical precision and on compression methods that minimize memory traffic. Discover how SDOT and SMMLA instructions, combined with 4-bit quantization schemes, enable efficient LLM deployment on smartphones and edge devices, bringing advanced AI capabilities to billions of compact computing devices.
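To make the idea concrete, here is a minimal sketch (not from the talk) of how block-wise 4-bit weight quantization pairs with integer multiply-accumulate: weights are stored as signed 4-bit values with one float scale per block, and the inner dot product runs in integer arithmetic, which is the part instructions like SDOT and SMMLA accelerate in hardware. The function names and block size below are illustrative assumptions, not the scheme used in the talk.

```python
import numpy as np

def quantize_q4_block(w, block_size=32):
    """Quantize float weights to signed 4-bit ints with one float scale
    per block. (Illustrative only; production schemes differ in detail.)"""
    w = w.reshape(-1, block_size)
    # Symmetric quantization: map [-max|w|, max|w|] onto the 4-bit range [-8, 7].
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dot_q4(q_w, scales, x_int, block_size=32):
    """Dot product with 4-bit weights and integer activations: integer
    multiply-accumulate per block (the SDOT/SMMLA-friendly part), then
    one float rescale per block."""
    xb = x_int.reshape(-1, block_size)
    # Accumulate products in int32, as SDOT does across lanes of int8 values.
    acc = (q_w.astype(np.int32) * xb.astype(np.int32)).sum(axis=1)
    return float((acc * scales.ravel()).sum())

rng = np.random.default_rng(0)
w = rng.standard_normal(64).astype(np.float32)
x = rng.integers(-4, 5, size=64).astype(np.int32)  # pretend pre-quantized activations
q, s = quantize_q4_block(w)
approx = dot_q4(q, s, x)
exact = float(w @ x.astype(np.float32))
print(approx, exact)  # approx should track exact closely
```

Keeping the accumulation in integers is what lets the CPU process several 8-bit products per instruction; the per-block float scale then restores dynamic range at negligible cost.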

Syllabus

GenAI on the Edge Forum: Optimizing Large Language Model (LLM) Inference for Arm CPUs

Taught by

EDGE AI FOUNDATION

