SliceGPT Explained - Compressing Large Language Models

Overview

Watch a 55-minute session in which Saleh Ashkboos, a PhD student at ETH Zurich, presents SliceGPT, an approach for compressing large language models. Learn how the technique can remove up to 25% of model parameters while maintaining high zero-shot task performance for models such as Llama 2 70B, OPT 66B, and Phi-2. See how the method works: it deletes rows and columns of the model's weight matrices, shrinking the model without substantial performance loss. Gain insights into accelerating deep neural network training and developing systems for large-scale graph processing. Additional resources include the original research paper, AI research newsletters, and blogs on AI deployment. You can also connect with the Unify community on various platforms to stay updated on AI optimization, LLM compression, and related topics.
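
For readers who want a feel for what "deleting rows and columns" means before watching, the idea can be sketched on a toy pair of linear layers. This is a minimal NumPy illustration under simplifying assumptions, not the SliceGPT implementation: the real method applies orthogonal transformations inside a transformer, while this sketch uses two plain matrices with no normalization or nonlinearity between them. The function name `slice_pair`, the matrix sizes, and the random data are all hypothetical.

```python
import numpy as np

def slice_pair(w_in, w_out, x, keep):
    """Rotate the hidden dimension with PCA, then keep only `keep` directions.

    Toy setting (an assumption of this sketch): the output is x @ w_in @ w_out.
    """
    h = x @ w_in                                # hidden activations on calibration data
    eigvals, eigvecs = np.linalg.eigh(h.T @ h)  # eigenvectors of the activation covariance
    order = np.argsort(eigvals)[::-1]           # sort directions by descending energy
    q = eigvecs[:, order]                       # orthogonal matrix, so q @ q.T is the identity
    w_in_rot = w_in @ q                         # rotating both sides leaves the output unchanged
    w_out_rot = q.T @ w_out
    # Slicing: delete trailing columns of w_in_rot and the matching rows of w_out_rot.
    return w_in_rot[:, :keep], w_out_rot[:keep, :]

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 32))      # hypothetical calibration inputs
w_in = rng.normal(size=(32, 32))
w_out = rng.normal(size=(32, 16))

dense = x @ w_in @ w_out

# Keeping every direction only rotates the weights; the output is unchanged.
full_in, full_out = slice_pair(w_in, w_out, x, keep=32)

# Keeping 24 of 32 hidden directions removes 25% of that dimension.
small_in, small_out = slice_pair(w_in, w_out, x, keep=24)
sliced = x @ small_in @ small_out

rel_err = np.linalg.norm(dense - sliced) / np.linalg.norm(dense)
print(small_in.shape, small_out.shape, f"relative error: {rel_err:.3f}")
```

The rotation step is what makes the deletion cheap in accuracy terms: after it, the least important directions sit in the trailing columns and rows, so those are the ones that get dropped. How well this carries over to a full model depends on details the session covers and this toy ignores.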