Spark RAPIDS ML - GPU Accelerated Distributed Machine Learning in Spark Clusters
Databricks via YouTube
Overview
Explore GPU acceleration for distributed machine learning in Spark clusters in this 38-minute conference talk by Erik Ordentlich and Jinfeng Li of NVIDIA. Learn about Spark RAPIDS ML, an open-source Python package that GPU-accelerates distributed machine learning applications on Spark. See how the package, built on the RAPIDS cuML library, implements GPU-accelerated versions of classical ML algorithms for regression, classification, clustering, and dimensionality reduction. Understand its main benefits: compatibility with the Spark MLlib DataFrame API, and benchmark results showing up to 100x speedup and 50x cost savings over baseline Spark MLlib in compute-intensive scenarios. Gain insight into the evolution of Spark MLlib and how Spark RAPIDS ML uses modern computing accelerators such as GPUs to improve performance.
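As a rough illustration of how the two headline figures can relate, the sketch below treats job cost as hourly cluster price times runtime. It assumes both figures come from the same benchmark run, which the overview does not state, so the derived price ratio is only indicative. The function names are hypothetical and are not part of Spark RAPIDS ML.

```python
def cost_savings(speedup: float, price_ratio: float) -> float:
    """Return how many times cheaper the accelerated job is.

    speedup:     baseline runtime / accelerated runtime
    price_ratio: accelerated cluster hourly price / baseline cluster hourly price
    Cost is modeled as hourly price * runtime, so savings = speedup / price_ratio.
    """
    if speedup <= 0 or price_ratio <= 0:
        raise ValueError("speedup and price_ratio must be positive")
    return speedup / price_ratio


def implied_price_ratio(speedup: float, savings: float) -> float:
    """Invert the same model: the price ratio implied by a speedup and a savings figure."""
    if speedup <= 0 or savings <= 0:
        raise ValueError("speedup and savings must be positive")
    return speedup / savings


# "Up to" figures quoted in the overview; assumed here to describe a single run.
ratio = implied_price_ratio(speedup=100.0, savings=50.0)
print(ratio)                       # 2.0
print(cost_savings(100.0, ratio))  # 50.0
```

Under this simple model, a 100x speedup paired with 50x cost savings would correspond to a GPU cluster priced at about twice the hourly rate of the CPU baseline.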
Syllabus
Spark RAPIDS ML: GPU Accelerated Distributed ML in Spark Clusters
Taught by
Databricks