Overview
Explore a 40-minute conference talk from the Storage Developer Conference (SDC) 2023 examining innovative solutions for reducing power consumption in generative AI and large language model computations. Dive into the concept of breaking the von Neumann bottleneck through the integration of SRAM memory cells with interleaved programmable processors on a single die. Learn about the challenges of running power-intensive language models in datacenters, discover the novel "In-SRAM" computing architecture, and understand recent developments in compressed data types for large-scale deep learning models. Presented by George Williams of GSI Technology, the talk offers insights into mixed precision mathematics and extreme low-bit quantization techniques for model parameters and activations, all aimed at achieving a lower in-silicon power profile for next-generation AI applications.
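As a rough illustration of the low-bit quantization idea mentioned in the overview (not code from the talk itself), the minimal Python sketch below shows symmetric per-tensor quantization of weights to a small number of bits; the function names and the 4-bit setting are illustrative assumptions.

```python
import numpy as np

def quantize_symmetric(x, n_bits=4):
    """Symmetric uniform quantization of a tensor to n_bits signed integers."""
    qmax = 2 ** (n_bits - 1) - 1          # e.g. 7 for 4-bit
    scale = np.max(np.abs(x)) / qmax      # per-tensor scale factor
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map quantized integers back to approximate real values."""
    return q.astype(np.float32) * scale

# Example: quantize a small weight matrix and measure the reconstruction error
weights = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_symmetric(weights, n_bits=4)
error = np.abs(weights - dequantize(q, scale)).mean()
print(f"mean absolute quantization error: {error:.4f}")
```

Storing parameters and activations as such low-bit integers (rather than 16- or 32-bit floats) reduces memory traffic, which is the dominant power cost the talk targets with in-SRAM compute.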
Syllabus
SDC 2023 - In-SRAM Compute For Generative AI and Large Language Models
Taught by
SNIAVideo