Optimization: Aggregate Keys in Batches
YouTube videos curated by Class Central.
Classroom Contents
Accelerating Data Processing in Spark SQL with Pandas UDFs - Optimization Techniques
- 1 Intro
- 2 Optimization Tricks
- 3 What are Pandas UDFs?
- 4 Development tips and tricks
- 5 Modeling at Quantcast
- 6 Example Problem
- 7 Naive approach: Use Spark SQL
- 8 Optimization: Use Pandas UDFs for Looping
- 9 Optimization: Aggregate Keys in Batches
- 10 Optimization: Inverted Indexes
- 11 Optimization: Use Python libraries
- 12 Optimization: Summary