Overview
Syllabus
Introduction
Discussion of benchmarking incentives, with reference to Kaggle.
Nuanced challenges of language model overfitting.
How to effectively choose AI models using leaderboards.
Advice on selecting models for company use.
Importance of having a robust model evaluation framework.
How individuals can contribute to AI benchmarking.
Discussion of specialized vs. generalized model performance.
Insights into the complexities of benchmarking AI agents.
Real-world applications and limitations of AI agents.
Percy reflects on his TED talk about open-source vs. closed-source models.
Introduction to Together AI and its mission for open AI development.
Combining AI with music creation.
Taught by
Weights & Biases