Debugging Complex Kubernetes Incidents - When It's Not DNS
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Syllabus
Intro
Metrics service errors during rollouts
Applications involved
DNS setup
Too many queries at startup?
Networking issues?
Let's test with network optimized instances
What about bigger instances?
VPC Flow Logs
Zoom on ingress flows to old IP
What about egress?
Routing on nodes
Stable state
What about traffic to old IP?
Let's simulate
Reverse Path filtering
2 questions
RPC setup
DNS propagation time during Rollouts
Reconnection differences
Lessons Learned
Taught by
CNCF [Cloud Native Computing Foundation]