Watch a 45-minute lecture from the Simons Institute in which UC Berkeley researcher Orr Paradise introduces Self-Proving models, a novel class of models that formally prove the correctness of their outputs through Interactive Proof systems. Explore the theoretical foundations and formal definitions of these models, including the per-input guarantees they provide, and learn about the algorithms used to train them and how the complexity of the proof system affects the complexity of the learning algorithm. Examine experiments in which Self-Proving models compute Greatest Common Divisors and prove that their answers are correct. The lecture presents joint research with Noga Amit, Shafi Goldwasser, and Guy N. Rothblum and is framed around alignment, trust, watermarking, and copyright issues in Large Language Models.
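To make the Greatest Common Divisor example concrete, here is a minimal sketch of what checking such an output could look like. It assumes the proof is a pair of Bézout coefficients and that verification takes a single message, which is the simplest special case of an interactive proof; the description above does not specify the protocol used in the lecture. The names `prove_gcd` and `verify_gcd` are hypothetical, and the extended Euclidean algorithm stands in for a trained model.

```python
def prove_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Stand-in for a model: return (g, u, v) with u*a + v*b == g == gcd(a, b).

    Uses the extended Euclidean algorithm; a Self-Proving model would be
    trained to emit the answer and the proof instead (assumption).
    """
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def verify_gcd(a: int, b: int, claimed: int, u: int, v: int) -> bool:
    """Accept only if `claimed` equals gcd(a, b) for positive a and b."""
    if a <= 0 or b <= 0 or claimed <= 0:
        return False
    # `claimed` must be a common divisor, so it divides gcd(a, b).
    if a % claimed != 0 or b % claimed != 0:
        return False
    # An integer combination of a and b is a multiple of gcd(a, b),
    # so this equality forces gcd(a, b) to divide `claimed`.
    return u * a + v * b == claimed


g, u, v = prove_gcd(12, 18)
print(g, verify_gcd(12, 18, g, u, v))
```

The verifier never recomputes the GCD: it trusts nothing about how the answer was produced and accepts on a given input only when the accompanying proof checks out, which is the sense in which the guarantee is per-input.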
Overview
Syllabus
Models that prove their own correctness
Taught by
Simons Institute