

Gradient Boost Part 2 - Regression Details

StatQuest with Josh Starmer via YouTube

Overview

Dive into the second part of a four-part video series on Gradient Boost, focusing on the details of regression. Learn how this popular machine learning algorithm predicts continuous values such as weight. Walk through the original Gradient Boost algorithm step by step: setting up the data and the loss function, initializing the model with a constant value, and building multiple trees. Understand how to calculate residuals, fit a regression tree to them, optimize the leaf output values, and update the predictions with each new tree, then see how the final prediction is produced. Best suited to viewers who have watched Part 1 and are familiar with Regression Trees and Gradient Descent.
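For orientation, the steps named above are commonly written as follows for the squared-error loss. This is a sketch in the standard textbook notation, not a transcription of the video: the symbols $F_m$, $r_{im}$, $\gamma_{jm}$, $R_{jm}$ and the learning rate $\nu$ are assumptions here, and the video may label things slightly differently.

```latex
% Loss for one sample (the factor 1/2 only simplifies the derivative)
L\bigl(y_i, F(x_i)\bigr) = \tfrac{1}{2}\bigl(y_i - F(x_i)\bigr)^2

% Step 1: initialize with the constant that minimizes the total loss;
% for this loss the minimizer is the mean of the observed values
F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{n} L(y_i, \gamma) = \frac{1}{n}\sum_{i=1}^{n} y_i

% Step 2.A: residuals are the negative gradient of the loss
r_{im} = -\left[\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}\right]_{F = F_{m-1}}
       = y_i - F_{m-1}(x_i)

% Step 2.C: output value for leaf R_{jm} of tree m;
% for this loss it is the mean of the residuals in that leaf
\gamma_{jm} = \arg\min_{\gamma} \sum_{x_i \in R_{jm}} L\bigl(y_i, F_{m-1}(x_i) + \gamma\bigr)

% Step 2.D: update the prediction with the new tree
F_m(x) = F_{m-1}(x) + \nu \sum_{j=1}^{J_m} \gamma_{jm}\, I\bigl(x \in R_{jm}\bigr)
```

Because a sample falls into exactly one leaf of a regression tree, the indicator $I(x \in R_{jm})$ is 1 for a single $j$, so the sum in the last line picks out one output value per tree.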

Syllabus

Awesome song and introduction
Step 0: The data and the loss function
Step 1: Initialize the model with a constant value
Step 2: Build M trees
Step 2.A: Calculate residuals
Step 2.B: Fit a regression tree to the residuals
Step 2.C: Optimize leaf output values
Step 2.D: Update predictions with the new tree
Step 2: Summary of step 2
Step 3: Output the final prediction
Correction: The sum on the left-hand side should be in parentheses to make it clear that the entire sum is multiplied by 1/2, not just the first term.
Correction: It should be R_jm, not R_ij.
Correction: The leaf in the script is R_1,2 and it should be R_2,1.
Note: With regression trees, a sample only goes to a single leaf, and this summation simply isolates the one output value of interest from all of the others. However, when I first made this video, I was thinking that because Gradient Boost is supposed to work with any "weak learner", not just small regression trees, this summation was a way to add flexibility to the algorithm.
Correction: The header for the residual column should be r_i,2.
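The steps in the syllabus can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code or data from the video: the toy heights and weights are made up, each tree is restricted to a single split (a stump) on one feature, the loss is squared error, and the names `fit_stump`, `gradient_boost`, and `predict` are hypothetical helpers introduced only for this example.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Steps 2.B and 2.C: fit a one-split regression tree to the residuals.

    Returns (threshold, left_value, right_value). With squared-error loss,
    the optimal output value of a leaf is the mean of the residuals in it.
    Assumes xs contains at least two distinct values.
    """
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_value, right_value = mean(left), mean(right)
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]


def predict_stump(stump, x):
    """A sample lands in exactly one leaf, so one output value is returned."""
    threshold, left_value, right_value = stump
    return left_value if x < threshold else right_value


def gradient_boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Return (initial_constant, learning_rate, list_of_stumps)."""
    f0 = mean(ys)                      # Step 1: constant initial prediction
    predictions = [f0] * len(ys)
    stumps = []
    for _ in range(n_trees):           # Step 2: build M trees
        # Step 2.A: residuals = observed - current prediction
        residuals = [y - p for y, p in zip(ys, predictions)]
        # Steps 2.B and 2.C: fit a tree and set its leaf output values
        stump = fit_stump(xs, residuals)
        # Step 2.D: move each prediction a small step toward the new tree
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for x, p in zip(xs, predictions)]
        stumps.append(stump)
    return f0, learning_rate, stumps


def predict(model, x):
    """Step 3: initial constant plus the scaled output of every tree."""
    f0, learning_rate, stumps = model
    return f0 + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Made-up toy data: height in metres, weight in kilograms.
heights = [1.4, 1.5, 1.6, 1.7, 1.8, 1.9]
weights = [50, 56, 63, 70, 76, 85]

model = gradient_boost(heights, weights)
print(predict(model, 1.65))
```

Stumps keep the sketch short. Real implementations usually grow somewhat larger trees and split on several features, but the loop over steps 2.A to 2.D has the same shape.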

Taught by

StatQuest with Josh Starmer
