Challenges in Making Large Language Models Safe and Robust

Overview

Explore the root causes of safety concerns and attacks on large language models in this lecture from the Simons Institute's Special Year on Large Language Models and Transformers, Part 1 Boot Camp. Aditi Raghunathan of Carnegie Mellon University uses a simple illustrative problem to explain the underlying concepts and to examine various defense strategies and their effectiveness. Gain insight into the broader literature on safety and robustness in machine learning, and into why making LLMs safe and robust remains challenging.