Improving Apache Spark Application Processing Time - Configuration and Optimization Techniques
Databricks via YouTube
Overview
Syllabus
Intro
About CSI Group (Cloud Security Intelligence)
Application Architecture and Overview
Input Architecture
Read Phase: Spark Data Source Overview
Spark Data Source Implementation
Partitioning Strategies
Dynamic Number of Tasks
Custom Spark Data Source - Summary
Optimal Number of Partitions
Garbage Collection - Analysis
Garbage First (G1) GC
Garbage Collection - Summary
Taught by
Databricks