

Scalably Understanding AI with AI - Using AI Systems for Model Behavior Analysis

Simons Institute via YouTube

Overview

Watch a 46-minute lecture from UC Berkeley professor Jacob Steinhardt at the Simons Institute exploring how AI can be used to understand and analyze other AI systems. Learn about behavior elicitation techniques that use investigator agents to automatically prompt specific model behaviors through reinforcement learning and supervised fine-tuning. Discover improved methods for neuron description that generate high-quality natural language explanations of neural network activations using 8B-parameter open-weight models. Explore practical applications through the Monitor observability interface to understand puzzling model behaviors, including investigating why language models make certain numerical comparison errors. Gain insights into the complex pipeline from training data to learned representations and observed behaviors in AI systems, with a focus on using AI tools to better understand and steer these systems.

Syllabus

Scalably Understanding AI with AI

Taught by

Simons Institute

