AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

MIT HAN Lab via YouTube

Overview

This 19-minute conference talk from MLSys 2024 presents the Best Paper "AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration." Researchers from MIT HAN Lab describe their approach to compressing and accelerating Large Language Models (LLMs) with the Activation-aware Weight Quantization (AWQ) technique. The talk covers the methodology, results, and implications of the work. Additional resources, including the project website, full paper, and code repository, are available for viewers who want to study the technique further or apply it in their own projects.

Syllabus

MLSys'24 Best Paper - AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Taught by

MIT HAN Lab
