Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
LLM Evaluation: Challenges and Best Practices - MLOps Podcast #210
- 1. AI in Production Conference
- 2. Aparna's preferred coffee
- 3. Takeaways
- 4. Shout out to the Arize team for sponsoring the MLOps Community since 2020!
- 5. Please like, share, and subscribe to our MLOps channels!
- 6. The evaluation space
- 7. Chatbots and preventing misinformation
- 8. Evaluating AI responses based on factual retrieval
- 9. Balancing eval quality against impact on speed
- 10. Context length, placement, and information recall study
- 11. GPT-4 excels; prompt iterations affect outcomes
- 12. Multiple sub-steps and the need for visibility into application calls
- 13. Evaluating calls: breakdown, scoring, and application-level evaluation
- 14. Rata classification for effective evaluation research
- 15. Benchmarks on Huggingface and Twitter reliability
- 16. The power of observability and retrieval embeddings
- 17. Tweaking data points
- 18. Hot take
- 19. Bottlenecks and errors from rapid production