Explore a conference talk from OSDI '22 that introduces Unity, a system for optimizing distributed Deep Neural Network (DNN) training. See how Unity jointly optimizes algebraic transformations and parallelization by representing both in a unified parallel computation graph (PCG). Learn how the system automatically generates and verifies optimizations, and how its hierarchical search algorithm keeps the search scalable. Review Unity's performance improvements over existing DNN training frameworks, evaluated on seven real-world DNNs using up to 192 GPUs across 32 nodes. Unity is available as part of the open-source FlexFlow framework.
Overview
Syllabus
Introduction
Unity's Goal
Parallelization
Parallel Computation Graph
Data Parallelization
PCG Advantages
Techniques
Results
Conclusion
Taught by
USENIX