Explore the innovative Chatbot Arena platform in this 27-minute conference talk by Wei-Lin Chiang from UC Berkeley and LMSYS. Discover how this open, crowdsourced system evaluates large language models (LLMs) using human feedback, letting users compare anonymous models side-by-side and vote for the better response. Learn how the Elo rating system is applied to rank chatbot performance, and gain insight into the platform's real-world impact: it has processed millions of user requests and collected over 100,000 votes. Delve into the publicly available datasets of user conversations and human preferences, and examine use cases including content moderation model development, safety benchmark creation, instruction-following model training, and challenging benchmark question formulation. For more in-depth information, refer to the associated research paper available at https://arxiv.org/abs/2309.11998.
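
To give a rough sense of how Elo-style ranking from pairwise votes works, here is a minimal sketch in Python. This is an illustration only, not the LMSYS implementation: the K-factor, initial rating, model names, and battle log below are all hypothetical assumptions.

    # Minimal Elo-style rating updates from pairwise votes (illustrative sketch).
    # Constants and the battle log are hypothetical, not from Chatbot Arena.
    from collections import defaultdict

    K = 32            # update step size (assumed)
    SCALE = 400       # standard Elo scale
    INIT_RATING = 1000

    def expected_score(r_a: float, r_b: float) -> float:
        """Expected probability that model A beats model B under the Elo model."""
        return 1.0 / (1.0 + 10 ** ((r_b - r_a) / SCALE))

    def update_elo(ratings, model_a, model_b, outcome):
        """outcome: 1.0 if A wins, 0.0 if B wins, 0.5 for a tie."""
        e_a = expected_score(ratings[model_a], ratings[model_b])
        ratings[model_a] += K * (outcome - e_a)
        ratings[model_b] += K * ((1.0 - outcome) - (1.0 - e_a))

    # Hypothetical votes: (model_a, model_b, outcome for model_a)
    battles = [
        ("model-x", "model-y", 1.0),
        ("model-y", "model-z", 0.5),
        ("model-x", "model-z", 0.0),
    ]

    ratings = defaultdict(lambda: INIT_RATING)
    for a, b, outcome in battles:
        update_elo(ratings, a, b, outcome)

    for model, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
        print(f"{model}: {rating:.1f}")

In this sketch, each crowdsourced vote nudges the winner's rating up and the loser's down in proportion to how surprising the result was; aggregating many such votes yields a leaderboard ordering similar in spirit to the one discussed in the talk.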