Explore the challenges of real-time monitoring for autonomous agents powered by Large Language Models (LLMs) in this 31-minute conference talk by Alexandre Variengien and Diego Dorn from EffiSciences. Examine the issues specific to deploying LLM agents in real-world scenarios, including indirect prompt injection and strategic deception. Learn why monitoring systems must be robust enough to address unforeseen failures preemptively. Discover the proposed community-driven approach to improving LLM agent supervision, built on a shared database of failure cases and a unified trace format. Gain insight into two metrics for evaluating monitoring systems: accuracy on held-out anomalies and the ability to detect early warning signs. Join the discussion on the future of agent supervision and contribute to the collective effort to anticipate and mitigate unexpected failures in AI deployment.