Overview
Explore the world of self-supervised learning (SSL) in computer vision in this comprehensive lecture. Delve into the motivation behind SSL, its definition, and its applications in NLP and computer vision. Understand the concept of pretext tasks and their role in SSL, with examples drawn from images, videos, and videos with sound. Gain insight into the representations pretext tasks learn and where they fall short. Discover what characterizes good pretrained features and how to obtain them with clustering and contrastive learning techniques. Investigate the ClusterFit method, its steps, and its performance. Dive deep into PIRL (Pretext-Invariant Representation Learning), understanding its working principles and how it is evaluated across a range of tasks. This two-hour lecture, part of a larger course, offers a thorough exploration of SSL in computer vision and valuable knowledge for researchers and practitioners in the field.
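To make the contrastive learning idea concrete before the outline, here is a minimal sketch of an InfoNCE-style loss, the family of objectives that methods such as PIRL build on. The function name, temperature value, and toy embeddings are illustrative assumptions for this sketch, not the lecture's exact implementation:

```python
# A minimal InfoNCE-style contrastive loss: each anchor is pulled toward its
# own transformed view and pushed away from every other sample in the batch.
# Names and the temperature (0.07) are illustrative, not the lecture's code.
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, temperature=0.07):
    """anchor, positive: (N, D) embeddings; row i of `positive` is the
    transformed view of row i of `anchor`. All other rows act as negatives."""
    anchor = F.normalize(anchor, dim=1)         # cosine similarity via dot product
    positive = F.normalize(positive, dim=1)
    logits = anchor @ positive.T / temperature  # (N, N) similarity matrix
    targets = torch.arange(anchor.size(0))      # diagonal entries are positives
    return F.cross_entropy(logits, targets)

# Toy usage: random embeddings standing in for a network's outputs.
a = torch.randn(8, 128)
p = a + 0.1 * torch.randn(8, 128)  # "positive" views close to their anchors
print(info_nce_loss(a, p).item())
```

The cross-entropy over the similarity matrix is what makes the loss contrastive: the batch itself supplies the negatives, so no labels are needed.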
Syllabus
– Week 10 – Lecture
– Challenges of supervised learning, and how self-supervised learning differs from supervised and unsupervised learning, with examples from NLP and the relative-position task for vision (see the sketch after this list)
– Examples of pretext tasks in images, videos, and videos with sound
– Understanding what the "pretext" task learns
– Generalization of pretext tasks and ClusterFit
– Basic idea of PIRL
– Evaluating PIRL on different tasks, and open questions
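As a concrete illustration of the relative-position pretext task referenced in the outline (Doersch et al., 2015), the sketch below samples a centre patch and one of its eight neighbours from an image and frames the pretext problem as 8-way classification. Patch size, image size, and the linear head are illustrative assumptions; in practice a shared convolutional encoder embeds each patch:

```python
# A minimal sketch of the relative-position pretext task: predict which of the
# 8 neighbouring positions a second patch was taken from, relative to a centre
# patch. Patch size and the toy classification head are illustrative only.
import random
import torch
import torch.nn as nn

def sample_relative_patches(image, patch=32):
    """image: (C, H, W) tensor. Returns (centre, neighbour, label in 0..7)."""
    _, H, W = image.shape
    cy = random.randint(patch, H - 2 * patch)   # top-left of the centre patch
    cx = random.randint(patch, W - 2 * patch)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]  # the 8 neighbour positions
    label = random.randrange(8)
    dy, dx = offsets[label]
    centre = image[:, cy:cy + patch, cx:cx + patch]
    ny, nx = cy + dy * patch, cx + dx * patch
    neighbour = image[:, ny:ny + patch, nx:nx + patch]
    return centre, neighbour, label

# Toy usage: a random "image" and a tiny linear head over both flattened patches.
img = torch.randn(3, 256, 256)
c, n, y = sample_relative_patches(img)
head = nn.Sequential(nn.Flatten(0), nn.Linear(2 * 3 * 32 * 32, 8))
logits = head(torch.cat([c, n]))   # in practice: a shared conv encoder per patch
print(logits.shape, "ground-truth position:", y)
```

Solving this task well forces the network to learn about object parts and layout, which is exactly the kind of representation the lecture then probes and improves on with ClusterFit and PIRL.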
Taught by
Alfredo Canziani