Is Big Data Performance Reproducible in Modern Cloud Networks?

Overview

Explore a 21-minute conference talk from USENIX NSDI '20 on whether big-data performance can be reproduced in modern cloud networks. Examine how network variability affects cloud-based big-data workloads, drawing on extensive data collected from commercial and research clouds. Discover how quality-of-service mechanisms and service provider policies can make that variability worse. Learn about the significant slowdowns and lack of predictability that big-data workloads experience, even when state-of-the-art experimentation techniques are used. Gain insights into reducing performance volatility and improving experiment repeatability in cloud environments. Understand why cloud performance evaluations must account for variability, and why reliable results require running more experiments.

Syllabus

Intro
Big data infrastructure in the cloud
How do we assess performance in the cloud?
Cloud performance is variable
Variability is disregarded in performance evaluations
Experiment Design (1) - Measuring the Cloud
Main Findings - Modern Cloud Networks
Experiment Design (2) - Reproducing App Performance
How to run repeatable experiments?
TL;DR: run more experiments

Taught by

USENIX

Related Courses

Understanding RDMA Microarchitecture Resources for Performance Isolation

Skyplane - Optimizing Transfer Cost and Throughput Using Cloud-Aware Overlays

Collie - Finding Performance Anomalies in RDMA Subsystems

Invisinets - Removing Networking from Cloud Networks

Disaggregating Stateful Network Functions

Configanator - A Data-driven Approach to Improving CDN Performance
