Overview
Explore the challenges of operating a large-scale Mesos infrastructure across multiple datacenters, and the solutions Criteo found, in this keynote presentation. Gain insight into Criteo's journey of deploying and managing more than 600 Mesos agents to handle billions of daily user requests with low latency. Discover how the company addressed key operational concerns, including configuration management, application secrets, logging, service discovery, networking, metrology, and SLAs. Learn from the experience of setting up a production-grade Mesos infrastructure from scratch, and understand what is expected of the system in the future. Benefit from the speaker's expertise in automating deployment, troubleshooting, and supporting developers on both Mesos clusters and legacy Windows/C# infrastructure.
Syllabus
Introduction
Company Overview
Transition
Platform
Methods
Revision
Automation
Predictability
Service Discovery
Observability
Networking
Incidents
Collaboration
Outro
Taught by
Linux Foundation