Advancing GPU Analytics with RAPIDS Accelerator for Apache Spark and Alluxio

Databricks via YouTube

Overview

This 27-minute talk from Databricks shows how to combine the RAPIDS Accelerator for Apache Spark with Alluxio to speed up GPU analytics. RAPIDS is a set of open source libraries that enable GPU-aware scheduling and memory representation for analytics and AI, and Apache Spark 3.0 uses it for GPU computing. Because GPUs process data with massive parallelism, data access has to be accelerated to keep pace, which is the role Alluxio plays here. The talk explains how to run Alluxio and Spark with the RAPIDS Accelerator on NVIDIA GPUs without application changes, and covers trends in analytics and AI, modern data pipelines, data orchestration for GPU clusters, benchmarking results, and key configurations for both Alluxio and the RAPIDS Accelerator. It also covers data accessibility, the Alluxio filesystem namespace, metadata locality, and asynchronous data operations in the context of GPU-accelerated analytics.
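To make the "without application changes" point concrete, here is a minimal sketch of how such a job might be launched: the GPU plugin is switched on through Spark properties and the input is addressed with an `alluxio://` URI, while the application script stays the same. The property names are standard Spark and RAPIDS Accelerator settings, but every value, the host name `alluxio-master`, the dataset path, and the script name `job.py` are hypothetical placeholders, not settings taken from the talk. The sketch also assumes the RAPIDS Accelerator and Alluxio client jars are already on the Spark classpath.

```python
# Sketch only: assemble a spark-submit command line that enables the
# RAPIDS Accelerator and points the job at data served through Alluxio.
# All values below are illustrative assumptions, not tuned settings.

def rapids_conf(gpus_per_executor=1, task_gpu_fraction=0.25, concurrent_gpu_tasks=2):
    """Return Spark properties that turn on GPU SQL/DataFrame acceleration."""
    return {
        # Loads the RAPIDS Accelerator plugin into Spark 3.x.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.rapids.sql.enabled": "true",
        # GPU-aware scheduling: GPUs per executor and the share each task claims.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        "spark.task.resource.gpu.amount": str(task_gpu_fraction),
        # How many tasks may use the GPU at the same time.
        "spark.rapids.sql.concurrentGpuTasks": str(concurrent_gpu_tasks),
    }


def spark_submit_args(app, input_uri, conf):
    """Render the properties as --conf flags followed by the app and its input."""
    args = ["spark-submit"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args + [app, input_uri]


if __name__ == "__main__":
    # Hypothetical Alluxio location; 19998 is Alluxio's default master port.
    uri = "alluxio://alluxio-master:19998/datasets/sales"
    print(" ".join(spark_submit_args("job.py", uri, rapids_conf())))
```

Only the launch configuration and the input URI scheme differ from a CPU-only run against the underlying store; the job code itself is untouched.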

Syllabus

Intro
TRENDS FOR ANALYTICS AND AI DATA PIPELINES
A MODERN DATA ANALYTICS OR AI PIPELINE
DATA ORCHESTRATION FOR GPU CLUSTERS WITHIN A SINGLE DATACENTER OR CLOUD
NVIDIA Brings GPU Acceleration to Apache Spark
Alluxio and RAPIDS Accelerator for Apache Spark
BENCHMARKING ENVIRONMENT
BENCHMARKING RESULTS
ALLUXIO CONFIGURATION
RAPIDS ACCELERATOR CONFIGURATION
DATA ACCESSIBILITY
ALLUXIO FILESYSTEM NAMESPACE
METADATA LOCALITY
ASYNCHRONOUS DATA OPERATIONS
References

Taught by

Databricks
