Understanding Robotics Transformer 2 (RT-2) - A Deep Dive into DeepMind's Vision-Language-Action Model

AI Bites via YouTube

Overview

Explore an analysis of DeepMind's Robotics Transformer 2 (RT-2) in this 16-minute video on how the reasoning capabilities of language models can be integrated into robotics. Learn about the technical architecture and implementation details through a walkthrough of the RT-2 paper, including the underlying PaLM, PaLM-E, PaLI, and PaLI-X models. Discover how RT-2 translates vision and language into robot actions, understand its co-fine-tuning process, and review the evaluation metrics used to assess its performance. The video is presented by a machine learning researcher with 15 years of software engineering experience and advanced education in computer vision and robotics.

Syllabus

Intro
What is RT-2
Follow us
Models
Actions
Robot Action
Co-Fine-Tuning
Evaluation

Taught by

AI Bites
