Sub-Linear Algorithms Meets Large Language Models

Overview

Explore the intersection of sub-linear algorithms and large language models in this 41-minute talk by Anshumali Shrivastava of Rice University. Delve into the challenges large language models (LLMs) face as their compute, memory, and energy requirements per input reach into the trillions. Examine the limitations of large-context attention and KV cache blowup, and discover why breaking the linear resource barrier is crucial for advancing LLMs. Learn about emerging ideas and successful trends in applying sub-linear algorithms to future LLMs, and why such algorithms are a necessity rather than an option for pushing the boundaries of language model capabilities.
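To make the KV cache blowup concrete, here is a minimal back-of-the-envelope sketch in Python. It is not taken from the talk: the function name `kv_cache_bytes` and the model dimensions (32 layers, 32 key/value heads, head dimension 128, 16-bit values) are hypothetical, chosen only to show that a standard decoder's cache grows linearly with context length.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_elem=2, batch_size=1):
    """Estimate KV cache size in bytes for a standard transformer decoder.

    Each layer stores one key vector and one value vector per token and
    per key/value head, so memory is linear in seq_len.
    """
    # Factor of 2 covers keys plus values.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len * batch_size


if __name__ == "__main__":
    # Hypothetical model dimensions, for illustration only.
    for seq_len in (4_096, 32_768, 262_144):
        gib = kv_cache_bytes(32, 32, 128, seq_len) / 2**30
        print(f"{seq_len:>8} tokens -> {gib:.1f} GiB")
```

Under these assumed dimensions, every doubling of the context doubles the cache, which is the linear resource barrier the talk argues must be broken.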