Explore the new session-based dependency management system in Spark Connect, introduced in Apache Spark™ 3.5.0, through this 20-minute conference talk by Databricks engineers Akhil Gudesa and Hyukjin Kwon. Dive into the challenges of managing application environments in distributed computing and learn how Spark Connect addresses the limitations of static dependency setups. Discover the power of the Artifact API for dynamic dependency updates during runtime while maintaining strict isolation across sessions. Through practical examples, gain insights on creating, packaging, utilizing, and updating custom isolated environments for seamless execution of both Python and Scala applications. Enhance your understanding of flexible dependency management in distributed computing environments and explore additional resources on Data Lakehouse architecture and Lakehouse Fundamentals Training.
Overview
Syllabus
Dependency Management in Spark Connect: Simple, Isolated, Powerful
Taught by
Databricks