Overview
Explore a conference talk from USENIX ATC '19 on NodeKernel, a distributed storage architecture designed for the efficient exchange of temporary data between tasks in data processing frameworks. Learn how NodeKernel fuses file system and key-value semantics in a common storage kernel and leverages modern networking and storage hardware for high performance and cost-efficiency. Discover how it provides hierarchical naming, high scalability, and near bare-metal performance across the wide range of data sizes and access patterns characteristic of temporary data. Examine Crail, the concrete implementation of NodeKernel, which combines RDMA networking with tiered DRAM/NVMe Flash storage to significantly improve the performance of NoSQL workloads and Spark applications. Understand how spreading data across DRAM and NVMe Flash tiers reduces storage costs compared to DRAM-only systems.
Syllabus
Intro
Temporary Intermediate Data
Shortcomings of Temporary Data Storage
Temporary Data Distribution
Temporary Data Storage Requirements
NodeKernel: Node Types
Crail: System Architecture
Evaluation
Small and Large Data Sets
Flexible Deployment
DRAM / Flash Ratio
Conclusions
Open Source
Crail Data Plane
Taught by
USENIX