Overview
Explore the challenges and solutions in evaluating language models in this 23-minute lightning talk from the AI in Production Conference. Survey the metrics and datasets available for assessment, and examine the difficulties of continuous evaluation in production environments. Learn about common pitfalls to avoid, drawing on insights from Matthew Sharp, author of "LLMs in Production" and a practitioner with over a decade of experience in ML/AI and deploying models to production. Discover why contributing to public evaluation datasets matters, and join the call for a community-wide effort to reduce harmful bias in language models. Come away with practical takeaways for improving language model evaluation in your own projects or organization.
Syllabus
Evaluating Language Models // Matthew Sharp // AI in Production Conference Lightning Talk
Taught by
MLOps.community