Learn how to annotate PyTorch code with custom metadata for optimizing model inference on specialized hardware accelerators in this conference talk from the Toronto Machine Learning Series. Discover techniques presented by Groq Compiler Engineer Arash Taheri-Dezfouli for injecting and preserving arbitrary information in PyTorch graphs to improve performance on Language Processing Units (LPUs). Explore methods for maintaining graph semantics while adjusting workload mapping, implementing custom data types, and persisting precision information through PyTorch's compilation pipeline. Master generalizable approaches for annotating PyTorch models at various granularities to maximize inference efficiency on custom hardware targets, working around limitations in standard graph export systems like TorchScript, ONNX, and torch.compile.
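One generic way to inject and preserve arbitrary metadata in a PyTorch graph, of the kind the talk describes, is through the `meta` dict on `torch.fx` nodes. The sketch below is illustrative only: the metadata key `lpu_precision` and its value are hypothetical examples, not a real Groq or PyTorch API, and this does not reproduce the speaker's actual compiler pipeline.

```python
# Minimal sketch: tagging FX graph nodes with custom metadata.
# Assumptions: PyTorch is installed; "lpu_precision" is a made-up key
# standing in for whatever hardware-specific hint a backend might need.
import torch
import torch.fx


class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))


# Trace the model into an FX GraphModule so each op becomes a Node.
gm = torch.fx.symbolic_trace(TinyModel())

# Attach arbitrary metadata via each node's `meta` dict; FX passes that
# copy nodes with their meta carry this information along.
for node in gm.graph.nodes:
    if node.op == "call_module":
        node.meta["lpu_precision"] = "fp8"  # hypothetical precision hint

# A downstream pass (e.g. a custom backend) can read the annotation back.
tagged = [n.name for n in gm.graph.nodes
          if n.meta.get("lpu_precision") == "fp8"]
print(tagged)
```

Keeping annotations in `node.meta` rather than in module attributes is one way to target arbitrary granularities (per-op rather than per-module), though, as the talk notes, export paths such as TorchScript or ONNX do not automatically persist such metadata and need extra handling.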