Classroom Contents
Generalization to Video Capsules - From Convolutional to Video Capsule Networks
- 1 Intro
- 2 Computational Cost of Capsule Voting
- 3 Conventional Convolutional Layers
- 4 Convolutional Capsule Layers
- 5 Capsule Pooling
- 6 Video Capsule Networks
- 7 Video Action Detection Networks
- 8 VideoCapsuleNet Architecture
- 9 Coordinate Addition
- 10 Capsule Masking
- 11 VideoCapsuleNet Training
- 12 Action Localization Accuracy
- 13 Qualitative Results - Entire Videos
- 14 Synthetic Dataset Experiments
- 15 Summary
- 16 Capsules in multiple modalities
- 17 Combining Video and Text
- 18 Overall Approach
- 19 Multi-modal Capsule Routing Algorithm
- 20 Full Architecture
- 21 Sentence Encoder
- 22 Merging Modalities and Masking
- 23 Upsampling Network
- 24 Quantitative Results - A2D Dataset
- 25 Semi-Supervised Video Object Segmentation
- 26 VOS using Capsules
- 27 Attention Routing
- 28 Video Encoder
- 29 Frame Encoder with Memory Module
- 30 Conv Capsule Layer and Decoder Network
- 31 Objective Function
- 32 Quantitative Results - Speed Analysis
- 33 Qualitative Results - Single Object
- 34 Qualitative Results - Multiple Objects
- 35 Effect of Memory Module
- 36 Effect of the Zooming Module
- 37 Effect of Zooming Module