SRE 2.0: Amplifying Reliability with GenAI

Conf42 via YouTube

Classroom Contents

  1. intro
  2. preamble
  3. sre 2.0: amplifying reliability with genai
  4. agenda
  5. quick intro about myself
  6. gartner sre hype cycle
  7. sre
  8. navigating digital transformation: managing ever-growing complexity
  9. operations is a software problem
  10. genai emerges: unveiling the power of next-gen artificial intelligence
  11. unveiling the potential: the capabilities of llm
  12. navigating challenges: risks associated with llms
  13. addressing model challenges: finding effective solutions
  14. retrieval-augmented generation rag / knowledge bases
  15. llm agents
  16. prompt engineering best practices
  17. prompt engineering properties
  18. sre 2.0
  19. genai in observability
  20. use case - analyze log data to automatically identify root causes of performance issues
  21. genai in sli, slo, and error budgets
  22. use case - recommend optimal error budget allocations based on business priorities and user expectations
  23. genai in system architecture and recovery objectives
  24. use case - predict the impact of different failure scenarios on system availability and performance
  25. genai in release & incident engineering
  26. use case - provide real-time incident response recommendations based on the current situation and historical data
  27. genai in automation
  28. use case - analyze the effectiveness of automation workflows and recommend improvements based on performance metrics
  29. genai in resilience engineering
  30. use case - automate the execution of chaos experiments based on identified risk factors and failure scenarios
  31. genai in blameless postmortems
  32. use case - analyze historical post-mortem data to identify recurring patterns and trends in incidents
  33. measure progress with business outcomes
  34. best practices
  35. pitfalls to avoid
  36. thank you.
