Identifying Hidden Dependencies

USENIX via YouTube

Classroom Contents

  1. Intro
  2. Big data is operationally complex.
  3. Observability is evolving quickly.
  4. Two dozen engineers build Honeycomb.
  5. We make systems humane to run
  6. by ingesting telemetry
  7. enabling data exploration
  8. and empowering engineers.
  9. We deploy with confidence.
  10. Continuous delivery is an investment.
  11. Continuity of operations even more so.
  12. Stable platforms empower innovation.
  13. but stateful services can be scary.
  14. We need velocity and reliability.
  15. Quantify reliability.
  16. Identify potential areas of risk.
  17. Design experiments to probe risk.
  18. Prioritize addressing risks.
  19. How broken is "too broken"?
  20. Service Level Objectives define success.
  21. SLOs are common language.
  22. Think in terms of events in context.
  23. HTTP Code 200? Latency 100ms?
  24. Set a target Service Level Objective.
  25. Use a window and target percentage.
  26. We keep SLOs at Honeycomb.
  27. We store incoming telemetry.
  28. Alerts usually evaluate every minute.
  29. Often, queries come back under 10s.
  30. Error budget: allowed unavailability
  31. Is it safe to do this risky experiment?
  32. Data persistence is tricky.
  33. Experiment using error budgets.
  34. Infrequent changes.
  35. Long-running processes.
  36. Data integrity and consistency.
  37. Delicate failover dances
  38. Restart one server & service at a time.
  39. Bugs are shallow with more eyes.
  40. Monitor for changes using SLIs.
  41. Debug with observability.
  42. Test the telemetry too!
  43. Verify fixes by repeating.
  44. Continuously verify to stop regression.
  45. Save money with flexibility.
  46. Hypothesize, test, and learn.
  47. Celebrate successes and failures.
  48. Be more reliable & scalable.
  49. Sleep easily at night.
  50. You can do this too, step by step.
  51. Read our blog! hny.co/blog
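
Items 20 through 31 of the outline describe classifying events as good or bad, setting a target percentage over a window, and treating the remainder as an error budget that can be spent on risky experiments. The following is a minimal Python sketch of that arithmetic, not the talk's or Honeycomb's implementation. The `Event` type, the function names, the 100 ms threshold, and the 99.9% target are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One request observed during the SLO window (hypothetical shape)."""
    status: int        # HTTP status code
    latency_ms: float  # request latency in milliseconds


def is_good(event, max_latency_ms=100.0):
    """SLI: count an event as good if it returned HTTP 200 within the threshold.

    The 100 ms threshold is an assumed value for illustration.
    """
    return event.status == 200 and event.latency_ms <= max_latency_ms


def error_budget(events, target=0.999):
    """Return (allowed_bad, actual_bad, remaining) for the events in one window.

    `target` is the assumed SLO target, e.g. 0.999 means 99.9% of events
    in the window must be good.
    """
    total = len(events)
    bad = sum(1 for e in events if not is_good(e))
    allowed = total * (1.0 - target)
    return allowed, bad, allowed - bad


def safe_to_experiment(events, expected_bad, target=0.999):
    """True if the remaining budget covers the failures an experiment may cause."""
    _, _, remaining = error_budget(events, target)
    return remaining >= expected_bad
```

Given the events from the current window, `error_budget` reports how many bad events the target allows, how many have already occurred, and how much budget is left. `safe_to_experiment` compares that remainder with an estimate of how many bad events an experiment might cause. Counting events rather than minutes of downtime follows the outline's "think in terms of events in context" framing; a time-based budget would be an equally valid design.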
