Explore the implications of large language model (LLM) commercialization and API restrictions in this 48-minute talk presented by Matt Finlayson from USC Information Sciences Institute. Discover how, with minimal assumptions about model architecture, significant non-public information can be extracted from API-protected LLMs using a relatively small number of queries. Learn about the softmax bottleneck in modern LLMs and how it can be exploited to obtain full-vocabulary outputs, audit model updates, identify source LLMs, and even uncover hidden model sizes. Examine the empirical investigations that led to estimating OpenAI's gpt-3.5-turbo embedding size at approximately 4096. Consider potential safeguards against these techniques and discuss how these capabilities might contribute to greater transparency and accountability in AI development. Gain insights from Finlayson's background in NLP, computer science, and linguistics as he explores the practical consequences of language model architectural design, from security to generation and learning processes.
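The core observation behind the talk, that a model's full-vocabulary outputs live in a low-dimensional subspace whose dimension equals the hidden embedding size, can be illustrated with a minimal sketch. The code below uses synthetic data and raw logits (all names and sizes are illustrative assumptions, not the talk's actual method or API details): it stacks many full-vocabulary output vectors into a matrix and reads off the embedding size as that matrix's numerical rank.

```python
import numpy as np

def estimate_embedding_size(logit_matrix, tol=1e-6):
    """Estimate the hidden (embedding) size as the numerical rank of a
    matrix of full-vocabulary logit vectors.

    Because logits are a linear map from a d-dimensional hidden state
    through the unembedding matrix, any number of vocab-size outputs
    spans at most a d-dimensional subspace -- the softmax bottleneck.
    """
    s = np.linalg.svd(logit_matrix, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

# Synthetic demo: vocabulary 1000, hidden size 64, 200 queries.
rng = np.random.default_rng(0)
d, v, n = 64, 1000, 200
W = rng.normal(size=(v, d))   # hypothetical unembedding matrix (hidden from the attacker)
H = rng.normal(size=(n, d))   # hidden states produced by n different prompts
logits = H @ W.T              # the full-vocab outputs an attacker could observe
print(estimate_embedding_size(logits))  # recovers d = 64
```

In the real setting described in the talk, the attacker sees API outputs rather than raw logits, and recovering full-vocabulary vectors takes additional queries per output, but the rank argument is the same: enough outputs reveal the hidden dimension, which is how an estimate like 4096 for gpt-3.5-turbo becomes possible.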