Overview
Explore Meta's newly released Llama3 family of language models in this video analysis. Examine the key features of the 8B and 70B model variants, including their expanded tokenizer and Grouped-Query Attention (GQA) architecture. Learn about the training process, which involved 7.7 million GPU hours and 15 trillion tokens. Compare Llama3's performance against established models such as Mistral and Gemini through benchmark results and practical demonstrations. Gain insight into Meta's motivations for open-sourcing the model and its potential impact on the AI ecosystem. Watch as the presenter puts the 8B model through a series of tests, including list generation, JSON formatting, environment state analysis, handling incomplete information, and code debugging, to see how Llama3 handles practical tasks.
Syllabus
- Intro
- The Announcement
- Benchmarks
- HumanEval
- Model Architecture
- Training data
- 400B model?
- Llama3 in the ecosystem
- Details from Meta Researcher
- Why did they open source it?
- Trying out Llama3
- List 1-10
- JSON Recipe
- Environment State
- Dealing with incomplete info
- Robbing a bank
- Debugging code pt.1
- Debugging code pt.2
- Conclusion
Taught by
Decoder