
YouTube

Evaluating World Models in Generative AI - A Critique of AI Intelligence Assessment

Discover AI via YouTube

Overview

Learn about groundbreaking research from Harvard, Cornell, and MIT that challenges conventional AI evaluation methods in this 30-minute video presentation. Explore how traditional metrics like next-token prediction may inadequately assess AI models' true understanding of underlying structure. Dive into the application of the Myhill-Nerode theorem for developing new evaluation frameworks that measure sequence compression and distinction capabilities. Follow along through real-world examples including NYC navigation, Othello gameplay, and logic puzzles that demonstrate how AI models can appear competent while lacking coherent internal world models. Examine the timeline of AI investment, emergence theories, world modeling concepts, and the Chomsky hierarchy before delving into finite automaton theory and new evaluation metrics. Consider critical findings about AI limitations, explore counterarguments, and reflect on questions of AI trustworthiness and sufficiency for real-world applications.
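To give a concrete feel for the idea covered in the video, here is a minimal, illustrative Python sketch of Myhill-Nerode-style "compression" and "distinction" checks against a toy deterministic finite automaton (strings with an even number of 'a's). The toy DFA, the model_state stand-in for an LLM's prefix summary, and the bounded-suffix equivalence test are assumptions made purely for illustration; they are not the researchers' actual code or metrics.

# Illustrative sketch (not the paper's implementation): two prefixes are
# Myhill-Nerode equivalent when every continuation leads both to the same
# accept/reject outcome. A "compression" check asks whether a model treats
# equivalent prefixes identically; a "distinction" check asks whether it
# separates non-equivalent prefixes.
from itertools import product

# Hypothetical toy DFA: strings over {'a', 'b'} with an even number of 'a's.
START, ACCEPT = 0, {0}          # state 0 = even count of 'a', 1 = odd count

def step(state, symbol):
    return 1 - state if symbol == 'a' else state

def run(prefix):
    state = START
    for sym in prefix:
        state = step(state, sym)
    return state

def myhill_nerode_equivalent(p, q, alphabet=('a', 'b'), max_len=4):
    # Approximate equivalence: same accept/reject outcome for all short suffixes.
    for n in range(max_len + 1):
        for suffix in product(alphabet, repeat=n):
            s = ''.join(suffix)
            if (run(p + s) in ACCEPT) != (run(q + s) in ACCEPT):
                return False
    return True

def model_state(prefix):
    # Stand-in for an LLM's internal summary of a prefix; here it simply
    # tracks the parity of 'a' correctly. A model that only mimics surface
    # patterns would fail the checks below.
    return run(prefix)

prefixes = ['', 'a', 'b', 'ab', 'aa', 'ba']
pairs = [(p, q) for p in prefixes for q in prefixes if p < q]

compression_ok = sum(1 for p, q in pairs
                     if myhill_nerode_equivalent(p, q) and model_state(p) == model_state(q))
compression_total = sum(1 for p, q in pairs if myhill_nerode_equivalent(p, q))

distinction_ok = sum(1 for p, q in pairs
                     if not myhill_nerode_equivalent(p, q) and model_state(p) != model_state(q))
distinction_total = sum(1 for p, q in pairs if not myhill_nerode_equivalent(p, q))

print(f"compression: {compression_ok}/{compression_total}")
print(f"distinction: {distinction_ok}/{distinction_total}")

Because the stand-in model here tracks the DFA's true state, both checks pass perfectly; the video's point is that a large language model can score well on next-token prediction while failing checks of this kind.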

Syllabus

AI tech companies invest US $1T
AI Emergence?
AI World Models
Noam Chomsky Hierarchy
Finite Automaton
NEW IDEA for AI framework
LLM mimicking sequences only?
Myhill-Nerode theorem
2 new evaluation metrics for LLMs
DFA of Manhattan
LLM internal representation
Main Findings
Counterarguments
Trust in AI?
AI is good enough?

Taught by

Discover AI

Reviews

