Explore a programming model for efficient spatial accelerator design in this 17-minute conference talk from PLDI 2024. Delve into Allo, a composable approach that decouples hardware customizations from algorithm specifications, enabling more effective design of complex, high-performance accelerator architectures. Learn how Allo preserves hierarchical program structure, facilitates holistic optimizations across function boundaries, and outperforms existing high-level synthesis (HLS) tools and accelerator design languages. Discover its potential through comprehensive experiments on HLS benchmarks and realistic deep learning models, including notable performance gains for the GPT2 model compared to an NVIDIA A100 GPU. Gain insights into the future of hardware acceleration for emerging applications as the benefits of technology scaling diminish.
Syllabus
[PLDI24] Allo: A Programming Model for Composable Accelerator Design
Taught by: ACM SIGPLAN