New Features of Apache Spark 3.5 - In-Depth Analysis

Overview

This 33-minute talk by Daniel Tenedorio and Xiao Li of Databricks surveys the new features in Apache Spark 3.5. It covers Spark Connect's improved accessibility, the DeepSpeed integration for more efficient AI workloads, and performance optimizations. On the PySpark and SQL side, it covers new array manipulation functions, the SQL IDENTIFIER clause, expanded API support, and Arrow-optimized Python UDFs. The talk shows how these additions help in building scalable, efficient, and robust data-driven applications. Additional resources available after the talk include the Big Book of Data Engineering and The Data Team's Guide to the Databricks Lakehouse Platform.