Boosting Presto Query Speeds with Hudi's Metadata Table and Clustering Service
Apache Hudi via YouTube
Overview
Learn how to optimize Presto query performance using Apache Hudi's data management features in this 53-minute technical talk. Discover how the clustering service reorganizes and resizes data files for efficient retrieval, so that records frequently queried together are co-located instead of being left in the order they were ingested. Explore how Hudi's metadata table maintains file listings for tables on cloud object stores like AWS S3, so queries avoid expensive listing operations. See how combining Hudi's clustering and metadata table with Presto speeds up queries and supports interactive analytics on large-scale data lakes.
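As a rough illustration of the two features described above, the sketch below assembles Hudi writer options that turn on the metadata table and inline clustering. It is not taken from the talk: the helper name `hudi_write_options`, the table name `trips`, the sort columns, and all numeric values are hypothetical placeholders, and the configuration keys should be checked against the Hudi version in use.

```python
# Minimal sketch, assuming a Spark-based Hudi writer. The helper, table name,
# sort columns, and sizes are hypothetical; they are not from the talk.
MB = 1024 * 1024


def hudi_write_options(table_name, sort_columns, target_file_mb=1024, small_file_mb=300):
    """Build a dict of Hudi options enabling the metadata table and inline clustering."""
    return {
        "hoodie.table.name": table_name,
        # Keep file listings in Hudi's metadata table instead of listing object storage.
        "hoodie.metadata.enable": "true",
        # Run clustering as part of the write path, every N commits.
        "hoodie.clustering.inline": "true",
        "hoodie.clustering.inline.max.commits": "4",
        # Files smaller than this limit are candidates for clustering.
        "hoodie.clustering.plan.strategy.small.file.limit": str(small_file_mb * MB),
        # Upper bound on the size of files produced by clustering.
        "hoodie.clustering.plan.strategy.target.file.max.bytes": str(target_file_mb * MB),
        # Sort by commonly filtered columns so related records land in the same files.
        "hoodie.clustering.plan.strategy.sort.columns": ",".join(sort_columns),
    }


options = hudi_write_options("trips", ["city_id", "rider_id"])
```

With a Spark DataFrame, a dict like this would typically be passed as `df.write.format("hudi").options(**options)`, alongside the record key and other required write settings, which are omitted here.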
Syllabus
Boost Presto query speeds with Hudi's metadata table & clustering service
Taught by
Apache Hudi