Large-Scale K8s Cluster Operation and Management
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore large-scale Kubernetes cluster operation and management in this conference talk by Lv Jiangzhao from JD. Gain insights into JDOS (JD Datacenter Operation System), a massive container cluster system based on Kubernetes that runs across JD's global datacenters. Learn how JD manages millions of containers in production with only two full-time SREs. Discover strategies for detecting and managing node components, detecting faults in master components and recovering from failures (especially on etcd nodes), and significantly reducing apiserver requests so that larger Kubernetes clusters can be built. Understand the challenges and solutions involved in operating Kubernetes at scale in a real-world enterprise environment.
Syllabus
Large-Scale K8s Cluster Operation and Management - Lv Jiangzhao, JD
Taught by
CNCF [Cloud Native Computing Foundation]