Disaster Recovery: Rebuilding Production in Under 1 Hour Using KOps, ArgoCD and Velero
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore a real-life disaster recovery scenario in this conference talk that details how a production environment was rebuilt from scratch in under an hour using KOps, ArgoCD, and Velero. Learn about the operational incident caused by a misconfiguration, the challenges faced when standard backup and recovery methods failed, and the crucial role of GitOps and infrastructure as code in the recovery process. Discover the unexpected issues encountered during the 51-minute cluster recreation, including tool malfunctions and outdated disaster recovery guides. Gain insights into the workarounds employed, post-incident improvements, and lessons learned. Understand the importance of disaster recovery planning, the benefits of migrating to GitOps and ArgoCD, and how to streamline deployment processes for faster recovery times.
Syllabus
Introduction
Welcome
About me
About Ada
The Incident
Cluster Management Tool
Disaster Recovery Planning
What is Velero
Making a Decision
Argo CD
Total Outage
The Problem
What Did We Learn
Moving to Argo CD
Single Workflow
Deployment Process
Demo
Questions
Git Repo
Taught by
CNCF [Cloud Native Computing Foundation]