Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks

USC Information Sciences Institute via YouTube

Classroom Contents

  1. Intro
  2. Single-Task Model vs. Unified Model
  3. Single-Task Model for Vision
  4. Image Output Quantization
  5. Text Input for Different Tasks
  6. Model Details
  7. Objective
  8. Dataset and Implementations
  9. Pre-training Distribution
  10. Evaluation
  11. GRIT requires diverse skills
  12. Results
  13. Semantic Segmentation
  14. Depth Estimation
  15. Object Detection
  16. Image Inpainting
  17. Segmentation-based image generation
  18. Summary
  19. Tasks Distribution
