Overview
Syllabus
[] AI in Production Conference
[] Aparna's preferred coffee
[] Takeaways
[] Shout out to Arize team for being a sponsor of the MLOps Community since 2020!
[] Please like, share, and subscribe to our MLOps channels!
[] Evaluation space
[] Chatbots prevent misinformation
[] Evaluating AI response based on factual retrieval
[] Balancing eval response and impact on speed
[] Context length, placement, and information recall study
[] GPT-4 excels, prompt iterations affect outcomes
[] Multiple sub-steps requiring visibility into application calls
[] Evaluating calls: breakdown, scoring, and application evaluation
[] Rata classification for effective evaluation Research
[] Benchmarks on Hugging Face and Twitter reliability
[] Power of observability and retrieval embeddings
[] Tweaking data points
[] Hot take
[] Bottlenecks and errors from rapid production
Taught by
MLOps.community