Overview
Learn how to build an image captioning system from scratch in this 36-minute tutorial. Explore the Flickr8k dataset and implement a model using PyTorch. Gain insights into combining convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for image captioning tasks. Follow along with code implementation, training setup, and error fixing. Discover potential improvements like using larger models, extended training, and incorporating attention mechanisms. Conclude with a brief evaluation of the implemented system.
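As a rough illustration of how a CNN and an RNN can be combined for captioning, here is a minimal PyTorch sketch. It is not the tutorial's code: the class names, layer sizes, and the tiny convolutional encoder (standing in for a larger pretrained CNN) are assumptions chosen to keep the example small and self-contained. The encoder turns an image into one feature vector, and the decoder treats that vector as the first step of the sequence fed to an LSTM.

```python
import torch
import torch.nn as nn


class EncoderCNN(nn.Module):
    """Tiny convolutional encoder (hypothetical stand-in for a pretrained CNN)."""

    def __init__(self, embed_size):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # collapse spatial dims: (N, 32, 1, 1)
        )
        self.fc = nn.Linear(32, embed_size)

    def forward(self, images):
        features = self.conv(images).flatten(1)  # (N, 32)
        return self.fc(features)  # (N, embed_size)


class DecoderRNN(nn.Module):
    """LSTM decoder that receives the image feature as its first input step."""

    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        embeddings = self.embed(captions)  # (N, T, embed_size)
        # Prepend the image feature so the LSTM sees it before any word.
        inputs = torch.cat([features.unsqueeze(1), embeddings], dim=1)  # (N, T+1, E)
        hidden, _ = self.lstm(inputs)  # (N, T+1, hidden_size)
        return self.fc(hidden)  # per-step scores over the vocabulary


class CaptioningModel(nn.Module):
    """Encoder and decoder wired together."""

    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.encoder = EncoderCNN(embed_size)
        self.decoder = DecoderRNN(embed_size, hidden_size, vocab_size)

    def forward(self, images, captions):
        return self.decoder(self.encoder(images), captions)


# Random tensors stand in for a batch of images and tokenized captions.
model = CaptioningModel(embed_size=32, hidden_size=64, vocab_size=100)
images = torch.randn(4, 3, 64, 64)        # 4 RGB images, 64x64
captions = torch.randint(0, 100, (4, 7))  # 4 captions, 7 token ids each
logits = model(images, captions)
print(logits.shape)  # torch.Size([4, 8, 100])
```

The output has one more time step than the caption because the image feature occupies the first position. In training, these per-step scores would typically be compared against the target tokens with a cross-entropy loss.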
Syllabus
- Introduction
- Explanation of Image Captioning
- Overview of the code
- Implementation of CNN and RNN
- Setting up the training
- Fixing errors
- Small evaluation and ending
Taught by
Aladdin Persson