Deploying GenAI on Edge Devices with ExecuTorch - Technical Overview

Overview

Learn advanced techniques for deploying generative AI models on edge devices in this technical talk from Meta's PyTorch Edge team lead Chen Lai. Explore how ExecuTorch addresses edge deployment challenges, including memory optimization and hardware compatibility across diverse platforms. Dive into technical collaborations with Apple, Arm, Qualcomm, and MediaTek that enable deployment of large language models such as Llama on mobile devices. Follow the process of converting PyTorch models into optimized executable programs using the ExecuTorch ecosystem, including key components such as torch.export for compute graph capture and torchao for quantization. Understand how torchchat enables large language model inference across various devices while maintaining compatibility with Hugging Face models. Gain insights into Meta's commitment to advancing edge computing through community-driven innovation and cross-industry collaboration.