Pre-training Mixtral MoE Model with SageMaker HyperPod - Fine-Tuning and Continued Pre-Training
Generative AI on AWS via YouTube
Overview
Learn how to pre-train, fine-tune, and continue pre-training the Mixtral Mixture of Experts (MoE) model using AWS SageMaker HyperPod and SLURM in this webinar series. Begin with an introduction to the Mixtral MoE model architecture and an overview of SLURM, presented by AWS experts Chris Fregly and Antje Barth. Dive deep into training the Mixtral MoE foundation model on SLURM with SageMaker HyperPod through a detailed walkthrough by AWS Applied Scientist Ben Snyder. Explore advanced techniques for instruction fine-tuning and continued pre-training with practical demonstrations from Antje Barth and Chris Fregly. Access additional resources, including the O'Reilly book Generative AI on AWS, related GitHub repositories, and community platforms, to deepen your understanding of running large language models on AWS infrastructure.
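To give a flavor of how SLURM and distributed training fit together on a HyperPod cluster, here is a minimal, illustrative sketch (not taken from the webinar): a training entry point, launched with something like `srun python train_mixtral.py`, that bootstraps torch.distributed from the environment variables SLURM exports. The SLURM variable names are standard; the function name, port, and fallbacks are assumptions for illustration only.

```python
# Illustrative sketch (assumed, not the webinar's code): initialize
# torch.distributed inside a process launched by SLURM/srun.
import os
import torch
import torch.distributed as dist

def init_distributed_from_slurm() -> int:
    """Set up the default process group from SLURM-provided ranks."""
    rank = int(os.environ.get("SLURM_PROCID", "0"))        # global rank
    world_size = int(os.environ.get("SLURM_NTASKS", "1"))  # total processes
    local_rank = int(os.environ.get("SLURM_LOCALID", "0")) # GPU index on this node

    # MASTER_ADDR/MASTER_PORT are typically exported by the sbatch script,
    # e.g. from `scontrol show hostnames $SLURM_JOB_NODELIST`; defaults here
    # only allow a single-process dry run.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    backend = "nccl" if torch.cuda.is_available() else "gloo"
    dist.init_process_group(backend=backend, rank=rank, world_size=world_size)
    if torch.cuda.is_available():
        torch.cuda.set_device(local_rank)
    return local_rank

if __name__ == "__main__":
    local_rank = init_distributed_from_slurm()
    print(f"rank {dist.get_rank()} of {dist.get_world_size()} ready (local rank {local_rank})")
    dist.destroy_process_group()
```

In an actual pre-training job, the model, data loaders, and optimizer would be constructed after this initialization step and sharded across the ranks SLURM allocates.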
Syllabus
Pre-train Mixtral MoE model on SageMaker HyperPod + SLURM + Fine-Tuning + Continued Pre-Training
Taught by
Generative AI on AWS