High Performance Scalable Support for Big Data Stacks with MPI - Leveraging MPI4Spark and MPI4Dask
OpenFabrics Alliance via YouTube
Overview
Explore MPI4Spark and MPI4Dask, MPI-enhanced versions of Spark and Dask for scaling Big Data platforms, in this technical conference talk. Learn how both frameworks overcome network performance bottlenecks by using Message Passing Interface (MPI) libraries on high-speed, low-latency networks such as InfiniBand, Omni-Path, and Slingshot. Discover how MPI4Spark uses MPI launchers to start Spark so that it communicates over MPI, while keeping worker nodes isolated through MPI Dynamic Process Management, and how MPI4Dask implements point-to-point communication as asynchronous I/O coroutines for modern HPC clusters with CPUs and NVIDIA GPUs. Examine performance evaluations of both frameworks on cutting-edge HPC systems, presented by researchers from The Ohio State University, including Aamir Shafi, Dhabaleswar Panda, Jinghan Yao, and Kinan Alattar.
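For readers new to the idea of point-to-point communication coroutines, here is a minimal sketch of the pattern in plain Python. It is not MPI4Dask's API and does not use MPI at all: the `Endpoint` class, its `send` and `recv` coroutines, and the in-memory mailboxes are hypothetical stand-ins chosen so the example runs with only the standard library. In an MPI-backed design, the awaits would presumably wrap non-blocking sends and receives instead of queue operations.

```python
import asyncio


class Endpoint:
    """Hypothetical stand-in for one rank's point-to-point channel."""

    def __init__(self, rank, mailboxes):
        self.rank = rank
        self._mailboxes = mailboxes  # maps rank -> asyncio.Queue

    async def send(self, dest, payload):
        # Assumption: an in-memory queue plays the role of the network.
        await self._mailboxes[dest].put((self.rank, payload))

    async def recv(self):
        # Suspends this coroutine until a message arrives, letting the
        # event loop run other workers in the meantime.
        return await self._mailboxes[self.rank].get()


async def worker(endpoint, peer, values):
    """Compute a local sum, exchange it with the peer, return the total."""
    local = sum(values)
    await endpoint.send(peer, local)
    _source, remote = await endpoint.recv()
    return local + remote


async def run_exchange():
    # Queues are created inside the running event loop.
    mailboxes = {0: asyncio.Queue(), 1: asyncio.Queue()}
    rank0 = Endpoint(0, mailboxes)
    rank1 = Endpoint(1, mailboxes)
    # Both workers run concurrently on a single thread.
    return await asyncio.gather(
        worker(rank0, 1, [1, 2, 3]),
        worker(rank1, 0, [10, 20]),
    )


if __name__ == "__main__":
    print(asyncio.run(run_exchange()))  # prints [36, 36]
```

Each worker yields control while it waits for its peer's message, which is the property that lets a coroutine-based communication layer overlap many transfers without dedicating a thread to each one.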
Syllabus
High Performance Scalable Support for Big Data Stacks with MPI
Taught by
OpenFabrics Alliance