Overview
Explore a 16-minute video on Qwen 2.5 Coder, a specialized model that pairs reasoning capabilities with code generation for building coding agents. Learn how it evolved from CodeQwen, how its architecture is designed, and how its training systematically blends several types of data. See how the model compares to competitors such as DeepSeek and Codestral, understand the structure of its base models, and examine the training policies behind its strong performance-to-size ratio. The video also covers the instruction tuning process and the model's improved mathematical reasoning. Links to the official blog, GitHub repository, and interactive demos are provided for further exploration.
Syllabus
- Intro
- Comparison to DeepSeek and Codestral
- From CodeQwen to Qwen Coder
- Qwen 2.5 Base Models
- Model Architecture
- Model Training Data
- Data Mixture
- Model Training Policy
- Instruction Tuned Models
- Best performance-to-size ratio
- Math reasoning
- Outro
Taught by
AI Bites