Overview
Learn about Nimble, a new file format for large datasets developed and open-sourced by Meta, in this technical conference talk. Discover how Nimble addresses the limitations of established formats such as Apache ORC and Apache Parquet, particularly with the wide tables common in machine learning training data preparation. Explore how it improves efficiency through encodings better suited to parallel decoding on SIMD hardware and GPUs, and how its encoding support is designed to be flexible and extensible. Gain insight into Meta's training data preparation workflows, see how Presto Native integrates with Nimble, and learn about the format's current deployment status at Meta. Review ongoing development and the future roadmap, with a focus on opportunities for collaboration on analytics file formats.
Syllabus
Nimble, a new file format for large datasets
Taught by
Presto Foundation