Byte Latent Transformers - Understanding Meta's BLT Model for Efficient Language Processing
Neural Breakdown with AVB via YouTube
Overview
Syllabus
- Intro
- Intro to Transformers
- Subword Tokenizers
- Embeddings
- How does vocab size impact Transformer FLOPs?
- Byte Encodings
- Pros and Cons of Byte Tokens
- Patches
- Entropy
- Entropy model
- Dynamically Allocate Compute
- Latent Space
- BLT Architecture
- Local Encoder
- Latent Transformer and Local Decoder in BLT
- Outro
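The syllabus topics "Entropy", "Entropy model", and "Patches" refer to BLT's core idea: a small autoregressive model scores how predictable each next byte is, and a new patch begins wherever that entropy spikes, so compute is spent where the data is hard. As a minimal sketch of that idea, the toy below uses a bigram count model as a stand-in for the small transformer entropy model the course describes (the threshold value and all function names here are illustrative, not from BLT itself):

```python
import math
from collections import Counter, defaultdict

def byte_entropies(data: bytes) -> list[float]:
    """Entropy (in bits) of the next-byte distribution conditioned on the
    previous byte, from a simple bigram count model. This is a toy stand-in
    for BLT's small autoregressive entropy model, which is a transformer."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(data, data[1:]):
        counts[prev][nxt] += 1
    ents = [8.0]  # no context for the first byte: assume maximum entropy
    for prev in data[:-1]:
        c = counts[prev]
        total = sum(c.values())
        ents.append(-sum((n / total) * math.log2(n / total) for n in c.values()))
    return ents

def entropy_patches(data: bytes, threshold: float = 1.0) -> list[bytes]:
    """Group bytes into variable-length patches, starting a new patch
    whenever the predicted next-byte entropy exceeds the threshold."""
    ents = byte_entropies(data)
    patches, current = [], bytearray()
    for b, e in zip(data, ents):
        if e > threshold and current:
            patches.append(bytes(current))
            current = bytearray()
        current.append(b)
    if current:
        patches.append(bytes(current))
    return patches

text = b"the cat sat on the mat. the cat sat on the mat."
for patch in entropy_patches(text):
    print(patch)
```

Running this, highly predictable spans (repeated phrases) merge into long patches while unpredictable positions start new ones, which is the mechanism the "Dynamically Allocate Compute" section builds on: each patch becomes one unit of work for the latent transformer.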
Taught by
Neural Breakdown with AVB