Advanced Computer Vision and Deep Learning
via Udacity
Overview
Learn to apply deep learning architectures to computer vision tasks. Discover how to combine CNNs and RNNs to build an automatic image captioning application.
Syllabus
- Advanced CNN Architectures
- Learn about advances in CNN architectures and see how region-based CNNs, like Faster R-CNN, have allowed for fast, localized object recognition in images.
- YOLO
- Learn about the YOLO (You Only Look Once) multi-object detection model and work with a YOLO implementation.
- RNNs
- Explore how memory can be incorporated into a deep learning model using recurrent neural networks (RNNs). Learn how RNNs can learn from and generate ordered sequences of data.
- Long Short-Term Memory Networks (LSTMs)
- Luis explains Long Short-Term Memory networks (LSTMs) and similar architectures, which have the benefit of preserving long-term memory.
- Hyperparameters
- Learn about a number of different hyperparameters that are used in defining and training deep learning models. We'll discuss starting values and intuitions for tuning each hyperparameter.
- Optional: Attention Mechanisms
- Attention is one of the most important recent innovations in deep learning. In this section, you'll learn how attention models work and go over a basic code implementation.
- Image Captioning
- Learn how to combine CNNs and RNNs to build a complex, automatic image captioning model.
- Project: Image Captioning
- Train a CNN-RNN model to predict captions for a given image. Your main task will be to implement an effective RNN decoder for a CNN encoder.
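As a rough illustration of the CNN-encoder / RNN-decoder idea behind the captioning project, here is a minimal sketch in plain Python. It is not the course's implementation: the vocabulary, the layer sizes, the `ToyDecoder` class and the hard-coded `image_features` vector (standing in for the output of a CNN encoder) are all hypothetical, and the weights are random and untrained, so the generated words are meaningless. It only shows the control flow: image features initialise the hidden state, then the decoder emits one word per step until it produces an end token.

```python
import math
import random

# Hypothetical toy vocabulary; a real model would build this from training captions.
VOCAB = ["<start>", "<end>", "a", "dog", "on", "grass"]
FEATURE_SIZE = 4   # size of the stand-in CNN feature vector (assumption)
HIDDEN_SIZE = 8    # size of the RNN hidden state (assumption)


def random_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyDecoder:
    """Plain (Elman-style) RNN decoder with untrained, random weights."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Maps image features to the initial hidden state.
        self.w_init = random_matrix(HIDDEN_SIZE, FEATURE_SIZE, rng)
        # One embedding row per vocabulary word.
        self.w_embed = random_matrix(len(VOCAB), HIDDEN_SIZE, rng)
        # Recurrent weights: previous hidden state -> new hidden state.
        self.w_hidden = random_matrix(HIDDEN_SIZE, HIDDEN_SIZE, rng)
        # Output weights: hidden state -> one score per vocabulary word.
        self.w_out = random_matrix(len(VOCAB), HIDDEN_SIZE, rng)

    def init_hidden(self, features):
        return [math.tanh(v) for v in matvec(self.w_init, features)]

    def step(self, word_index, hidden):
        """Consume one word, return (scores over vocabulary, new hidden state)."""
        embedded = self.w_embed[word_index]
        recurrent = matvec(self.w_hidden, hidden)
        new_hidden = [math.tanh(e + r) for e, r in zip(embedded, recurrent)]
        return matvec(self.w_out, new_hidden), new_hidden


def greedy_caption(decoder, features, max_len=10):
    """Greedy decoding: always pick the highest-scoring next word."""
    hidden = decoder.init_hidden(features)
    word = VOCAB.index("<start>")
    caption = []
    for _ in range(max_len):
        scores, hidden = decoder.step(word, hidden)
        word = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[word] == "<end>":
            break
        caption.append(VOCAB[word])
    return caption


# Stand-in for the feature vector a CNN encoder would produce for one image.
image_features = [0.2, -0.1, 0.7, 0.4]
print(greedy_caption(ToyDecoder(seed=0), image_features))
```

In a trained model, the encoder would typically be a pretrained CNN, the decoder an LSTM rather than this plain RNN, and the weights would be learned from image-caption pairs; greedy decoding is just the simplest of several possible decoding strategies.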
Taught by
Cezanne Camacho (nd891), Luis Serrano, Jay Alammar (nd892), Ortal Arel (nd101) and Kelvin Lwin