Leveraging SRE and Observability for Building on LLMs

Overview

Explore the intersection of Site Reliability Engineering (SRE) and observability in the context of Large Language Models (LLMs) in this conference talk from Conf42 Incident Management 2023. Delve into the unique challenges and opportunities presented by LLMs, comparing them to traditional APIs while highlighting their increased unpredictability. Examine the concept of observability and its application to LLM-based systems, including instrumentation techniques and emerging behaviors. Learn about implementing Service Level Objectives (SLOs) for LLM development and gain insights from real-world examples such as Duolingo and Intercom. Discover practical strategies for leveraging SRE principles to build more reliable and observable LLM-powered applications.

Syllabus

intro
preamble
magic of llms
- like apis we know and love
- even more unpredictability
- how do we define "correct"?
about
- what's in the box
- endless feedback loops
why believe me?
- timeline
- goals
laws of building on llms
how do we go forward? instrumentation
instrumentation for llms
emerging behaviors
a truth for llms
service level objectives
slos: a quick definition
slos for developing with llms
from others in the wild
duolingo
intercom
so in the end:
thanks!

Taught by

Conf42
