Presto Optimization with Distributed Caching on Data Lake

Presto Foundation via YouTube

Overview

Learn how to optimize Presto performance with distributed caching in this technical talk, which addresses two common problems of querying data in cloud object storage such as S3: slow query performance and high API costs. Explore distributed caching design patterns and real-world implementations. Discover advanced techniques including segmented data file caching, soft-affinity scheduler policies, cache filtering, time-to-live (TTL) settings, and customized eviction strategies. Examine case studies from Meta, Uber, ByteDance, and Newsbreak to see how they used caching to optimize interactive queries, maximize hit rates, reduce cloud storage costs, and improve query performance. The talk also covers practical strategies for setting up a caching system and measuring the resulting performance improvements with TPC-DS benchmark results.

Syllabus

Presto Optimization with Distributed Caching on Data Lake - Hope Wang & Beinan Wang, Alluxio

Taught by

Presto Foundation
