Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed MoE
Overview
Learn about improving weight-sharing supernet training in this 35-minute AutoML seminar introducing the Mixture-of-Supernets formulation. Explore how Mixture-of-Experts (MoE) concepts are used to generate flexible, architecture-specific weights for subnetworks, improving Neural Architecture Search (NAS) efficiency and the quality of the resulting architectures. Discover practical applications in building efficient BERT and machine translation models that satisfy user-defined constraints. Join speaker Ganesh Jawahar as he presents this ACL 2024 research, which demonstrates significant reductions in retraining time and improved overall NAS effectiveness.
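To make the core idea concrete, below is a minimal sketch (not the authors' code) of what "architecture-routed MoE" weight generation could look like: a router maps an encoding of the sampled subnetwork to mixture weights over several expert weight matrices, so each architecture receives its own effective weights instead of a single rigidly shared matrix. The class name `ArchRoutedMoELinear`, the encoding dimension, and the usage values are hypothetical placeholders.

```python
# Illustrative sketch only, assuming a PyTorch setting; not the paper's implementation.
import torch
import torch.nn as nn


class ArchRoutedMoELinear(nn.Module):
    def __init__(self, max_in, max_out, num_experts=4, arch_enc_dim=8):
        super().__init__()
        # One full-size weight matrix per expert, shared across all subnetworks.
        self.experts = nn.Parameter(torch.randn(num_experts, max_out, max_in) * 0.02)
        # Router: architecture encoding -> mixture weights over the experts.
        self.router = nn.Sequential(
            nn.Linear(arch_enc_dim, num_experts),
            nn.Softmax(dim=-1),
        )

    def forward(self, x, arch_enc, in_dim, out_dim):
        # Architecture-specific combination of expert weights.
        alpha = self.router(arch_enc)                          # (num_experts,)
        weight = torch.einsum("e,eoi->oi", alpha, self.experts)
        # Slice to the sampled subnetwork's width (weight-sharing style).
        weight = weight[:out_dim, :in_dim]
        return x[..., :in_dim] @ weight.t()


# Hypothetical usage: a subnetwork with input width 16 and output width 32.
layer = ArchRoutedMoELinear(max_in=64, max_out=64)
x = torch.randn(2, 64)
arch_enc = torch.randn(8)  # placeholder encoding of the sampled architecture
y = layer(x, arch_enc, in_dim=16, out_dim=32)
print(y.shape)  # torch.Size([2, 32])
```

The design point this sketch tries to convey is that the router, conditioned on the architecture itself, decides how expert weights are mixed, which is what gives each subnetwork more flexible weights than plain slicing of one shared matrix.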
Syllabus
Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed MoE
Taught by
AutoML Seminars