Building Generalist Robotics Policies from Scratch

Overview

Dive into a comprehensive video tutorial on building Generalist Robotics Policies (GRPs) from scratch. Learn how to implement the "Octo: An Open-Source Generalist Robot Policy" model step by step, starting with basic transformer code and progressing to training the model on data from the Open X-Embodiment dataset. Explore data exploration, dataset creation, transformer encoder implementation, image patch tokenization, and Vision Transformer (ViT) construction. Discover techniques for incorporating text inputs, handling continuous and discrete actions, standardizing state inputs and action spaces, and integrating goal images into the transformer architecture. Gain insights into scaling training, analyzing results across A100 GPUs, and evaluating the model using the SimpleEnv robotics simulator. Access the accompanying code, project details, and additional resources to deepen your understanding of Generalist Robotics Policies and their applications in robotics.
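To give a flavor of the image patch tokenization step mentioned above, here is a minimal sketch in plain Python. It is not taken from the tutorial's code: the function name `image_to_patches` and the nested-list image layout are assumptions chosen for illustration. It splits an image into non-overlapping square patches and flattens each one into a vector, which a ViT would then project to an embedding.

```python
def image_to_patches(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened, non-overlapping patches.

    Hypothetical helper for illustration; real pipelines usually do this with
    array reshapes or a strided convolution.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    # Walk the patch grid row by row, left to right.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    patch.extend(image[row][col])  # append this pixel's channel values
            patches.append(patch)
    return patches


# Toy 4x4 image with 3 channels; each pixel stores (row, col, 0).
image = [[[r, c, 0] for c in range(4)] for r in range(4)]
patches = image_to_patches(image, patch_size=2)
print(len(patches), len(patches[0]))  # 4 patches, each of length 2 * 2 * 3 = 12
```

Each flattened patch plays the role of one input token; a learned linear layer and positional embeddings would normally follow.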

Syllabus

Intro: ChatGPT, Language Models and the Goals of Generalist Robotics Policies
Reading and exploring the data
Creating a Dataset
Creating the transformer encoder
Creating image patches to tokenize
Putting together the ViT
Training the ViT
Making the GRP, starting with adding text inputs
Modifying the data for training
Converting continuous actions to discrete bins
Standardizing the state inputs
Changing to use continuous actions
Standardizing the action space
Adding goal images to the transformer
Adding block-masked attention to use either goal type (text or image)
Scaling training
Training results across A100s
Evaluation using the SimpleEnv robotics simulator
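
Two of the syllabus steps above, converting continuous actions to discrete bins and standardizing inputs, can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the tutorial's implementation: the helper names (`discretize`, `undiscretize`, `standardize`), the uniform binning scheme, and the action range of [-1, 1] are all assumptions.

```python
import statistics


def discretize(value, low, high, num_bins):
    """Map a continuous value in [low, high] to a bin index in [0, num_bins - 1]."""
    clipped = min(max(value, low), high)       # clamp out-of-range actions
    fraction = (clipped - low) / (high - low)  # position within the range, 0..1
    return min(int(fraction * num_bins), num_bins - 1)  # keep `high` in the last bin


def undiscretize(bin_index, low, high, num_bins):
    """Map a bin index back to the continuous value at the center of that bin."""
    width = (high - low) / num_bins
    return low + (bin_index + 0.5) * width


def standardize(values):
    """Shift and scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    std = std if std > 0 else 1.0  # avoid dividing by zero for constant inputs
    return [(v - mean) / std for v in values], mean, std


# Hypothetical action dimension ranging over [-1, 1], split into 10 bins.
bin_index = discretize(0.0, low=-1.0, high=1.0, num_bins=10)
recovered = undiscretize(bin_index, low=-1.0, high=1.0, num_bins=10)
z_scores, mean, std = standardize([1.0, 2.0, 3.0])
print(bin_index, recovered, z_scores)
```

Discrete bins let a policy treat action prediction as classification, at the cost of quantization error of up to half a bin width. Continuous outputs avoid that error, and standardizing them keeps regression targets on a comparable scale across dimensions.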

Taught by

Montreal Robotics
