Faster and Cheaper LLMs with Weight and Key-value Cache Quantization

UofU Data Science via YouTube

Guest Lecture by Tianyi Zhang: Faster & Cheaper LLMs with Weight and Key-value Cache Quantization
