Overview
Explore Gemini, Google's new family of multimodal AI models, in this video walkthrough of the technical report. Dive into the model architecture, training dataset, and training process that enabled Gemini Ultra to set new state-of-the-art results on 30 of 32 benchmarks. Learn how Gemini advances large-scale language modeling, image and audio processing, and video understanding. Examine the performance overview, the TPU hardware used for training and inference, and the detailed evaluation process. Gain insights into long-context effectiveness and see how Gemini Ultra reaches human-expert performance on the MMLU benchmark for knowledge and reasoning.
Syllabus
- Intro
- AI Bootcamp on MLExpert.io
- Google Gemini Blog Post
- Performance Overview
- Training & Inference Hardware (TPUs)
- Technical Report
- Model Architecture
- Training Infra
- Training Dataset
- Detailed Evaluation
- Long Context Effectiveness
- Conclusion
Taught by
Venelin Valkov