Overview
Explore a 15-minute conference talk from USENIX NSDI '23 that introduces TACCL, a tool for optimizing the collective communication used when training machine learning models across multiple GPUs and servers. Delve into the challenges of efficient collective communication in distributed training environments and discover how TACCL leverages a novel communication sketch abstraction to guide algorithm synthesis. Learn how TACCL generates optimized algorithms for a range of hardware configurations and communication collectives, outperforming the NVIDIA Collective Communication Library (NCCL). Gain insights into the tool's impact on end-to-end training of popular models such as Transformer-XL and BERT, with speedups ranging from 11% to 2.3x depending on batch size.
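For context on what "collective communication" means here, the sketch below shows a minimal all-reduce, the collective whose schedule (rings, trees, and other topologies) libraries like NCCL implement and TACCL synthesizes optimized algorithms for. This is a generic PyTorch illustration, not TACCL's API; the launcher command, backend choice, and tensor contents are assumptions made for the demo.

```python
# Minimal all-reduce sketch (illustrative; not TACCL's API).
# Launch with, e.g.: torchrun --nproc_per_node=2 allreduce_demo.py
# (script name and launcher flags are assumptions for this demo)
import torch
import torch.distributed as dist

def main():
    # torchrun sets RANK, WORLD_SIZE, and MASTER_ADDR/PORT for us;
    # "gloo" keeps the demo runnable on CPU-only machines (real
    # multi-GPU training would use the NCCL backend instead).
    dist.init_process_group(backend="gloo")
    rank = dist.get_rank()
    world_size = dist.get_world_size()

    # Each rank contributes its own gradient-like tensor ...
    local = torch.full((4,), float(rank))

    # ... and all-reduce sums the tensors across all ranks, leaving
    # every rank holding the same result. Choosing how the data moves
    # between ranks is the algorithm-synthesis problem TACCL's
    # communication sketches help solve.
    dist.all_reduce(local, op=dist.ReduceOp.SUM)

    print(f"rank {rank}/{world_size}: {local.tolist()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

With two processes, every rank ends up with the summed tensor [1.0, 1.0, 1.0, 1.0]; the result is identical everywhere, which is exactly the invariant an optimized all-reduce algorithm must preserve while minimizing communication time.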
Syllabus
NSDI '23 - TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
Taught by
USENIX