How the OOM-Killer Deleted My Namespace, and Other Kubernetes Tales

CNCF [Cloud Native Computing Foundation] via YouTube

Classroom Contents

  1. Intro
  2. Datadog
  3. Symptoms
  4. Investigation
  5. Deletion call, 4d before: Audit logs for the namespace
  6. Spinnaker deploys (v1)
  7. Helm 3 deploys (v2)
  8. Big difference
  9. What happened?
  10. Namespace Controller logs Virtual
  11. Events so far
  12. Metrics-server setup
  13. Metrics-server deployment
  14. Full chain of events
  15. Key take-away: Apiservice extensions are great but can impact your cluster
  16. Context
  17. Runtime is down?
  18. CNI status
  19. Containerd goroutine dump: Blocked goroutines?
  20. Seems CNI related
  21. What about Delete?
  22. CNI plugin
  23. The root cause
  24. What we know
  25. Apiserver requests
  26. Illustration
  27. What about label filters?
  28. Informers instead of List: How do informers work?
  29. Back to the incident
  30. Nodegroup controller?
  31. How did it work?
  32. What we learned
  33. Conclusion
