BigScience BLOOM - 3D Parallelism Explained - Large Language Models - ML Coding Series
Aleksa Gordić - The AI Epiphany via YouTube
Overview
Syllabus
Intro - focusing on the 3D parallelism!
Quick setup
Stepping through the eval script
3D parallelism - model construction
Sharding the embedding table (model parallelism)
Sharding the transformer layer
LayerNorm fused kernels
Sharding the attention layer
ColumnParallel and RowParallel sharding
Synchronizing input and output embedding tables
Building the dataset (data parallelism)
3D parallelism - forward pass
Pipeline parallelism communication
Pass through the sharded embedding table
Pass through the sharded transformer layer
Sharded logit and cross-entropy computation
Recap
Outro
Taught by
Aleksa Gordić - The AI Epiphany