Overview
Explore a conference talk from USENIX ATC '23 that introduces Legion, a system designed to accelerate billion-scale Graph Neural Network (GNN) training on multi-GPU machines. Examine the challenges that current cache-based GNN systems face on the massive graphs common in industry applications such as e-commerce product recommendation and financial risk control. Discover Legion's three key contributions: a hierarchical graph partitioning mechanism that improves multi-GPU cache performance, a unified multi-GPU cache that reduces PCIe traffic, and an automatic cache management mechanism that optimizes training throughput based on hardware specifications. Learn how Legion outperforms state-of-the-art cache-based systems and enables efficient training of billion-scale GNNs on a single machine, as demonstrated in evaluations across various GNN models and datasets.
Syllabus
USENIX ATC '23 - Legion: Automatically Pushing the Envelope of Multi-GPU System for Billion-Scale...
Taught by
USENIX