Join us at our first in-person conference on June 25 all about AI Quality
Classroom Contents
Handling Multi-Terabyte LLM Checkpoints - MLOps Podcast #228
- 1 Simon's preferred beverage
- 2 Takeaways
- 3 Simon's tech background
- 4 Zombie models garbage collection
- 5 The road to LLMs
- 6 Trained models Simon worked on
- 7 LLM Checkpoints
- 8 Confidence in AI Training
- 9 Different Checkpoints
- 10 Checkpoint parts
- 11 Slurm vs Kubernetes
- 12 Storage choices lessons
- 13 Paramount components for setup
- 14 Argo workflows
- 15 Kubernetes node troubleshooting
- 16 Cloud virtual machines have pre-installed monitoring
- 17 Fine-tuning
- 18 Storage, networking, and complexity in network design
- 19 Start simple before advanced; consider model needs
- 20 Join us at our first in-person conference on June 25 all about AI Quality