Overview
Syllabus
Awesome song and introduction
Step 0: The data and the loss function
Step 1: Initialize the model with a constant value
Step 2: Build M trees
Step 2.A: Calculate residuals
Step 2.B: Fit a regression tree to the residuals
Step 2.C: Optimize leaf output values
Step 2.D: Update predictions with the new tree
Step 2: Summary of step 2
Step 3: Output the final prediction
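The steps listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the video's own code: it assumes the squared-error loss 1/2 (observed - predicted)^2, a single numeric feature, and trees limited to one split (stumps). The function names (`fit_stump`, `stump_predict`, `gradient_boost`, `predict`) and the parameters `n_trees` and `learning_rate` are hypothetical names chosen for this example.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Steps 2.B and 2.C: fit a one-split regression tree to the residuals."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between adjacent distinct feature values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x < t]
        right = [r for x, r in zip(xs, residuals) if x >= t]
        # With squared-error loss, the optimal leaf output is the leaf's mean residual.
        g_left, g_right = mean(left), mean(right)
        sse = sum((r - g_left) ** 2 for r in left) + sum((r - g_right) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, g_left, g_right)
    if best is None:
        # All feature values are identical: fall back to a single constant leaf.
        g = mean(residuals)
        return (float("inf"), g, g)
    return best[1:]


def stump_predict(stump, x):
    """Return the output value of the one leaf that x falls into."""
    t, g_left, g_right = stump
    return g_left if x < t else g_right


def gradient_boost(xs, ys, n_trees=50, learning_rate=0.1):
    # Step 1: the constant that minimizes squared-error loss is the mean of y.
    f0 = mean(ys)
    preds = [f0] * len(ys)
    stumps = []
    for _ in range(n_trees):  # Step 2: build M trees
        # Step 2.A: residuals are the negative gradient of the loss.
        residuals = [y - p for y, p in zip(ys, preds)]
        # Steps 2.B and 2.C: fit a tree and set its leaf output values.
        stump = fit_stump(xs, residuals)
        # Step 2.D: update predictions with the new tree, scaled by the learning rate.
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
        stumps.append(stump)
    return f0, stumps, learning_rate


def predict(model, x):
    """Step 3: the final prediction is the initial constant plus all scaled trees."""
    f0, stumps, learning_rate = model
    return f0 + learning_rate * sum(stump_predict(s, x) for s in stumps)
```

Calling `gradient_boost(xs, ys)` on two equal-length lists of numbers returns a model that `predict(model, x)` can evaluate at a new value of `x`. With `n_trees=0` the prediction is just the mean of `ys`, which is the Step 1 constant.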
The sum on the left-hand side should be in parentheses to make it clear that the entire sum is multiplied by 1/2, not just the first term.
It should be R_jm, not R_ij.
The leaf in the script is R_1,2 and it should be R_2,1.
With regression trees, the sample will only go to a single leaf, and this summation simply isolates the one output value of interest from all of the others. However, when I first made this video, I was thinking that because Gradient Boost is supposed to work with any "weak learner", not just small regression trees, this summation was a way to add flexibility to the algorithm.
The header for the residual column should be r_i,2.
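The note above about the summation over leaves can be made concrete by writing out the Step 2.D update. This is a sketch in the usual gradient boosting notation, and the symbols are assumptions not given in the text: \nu is the learning rate, J_m is the number of leaves in tree m, \gamma_{jm} is the output value of leaf R_{jm}, and I(\cdot) is an indicator that equals 1 when its condition holds and 0 otherwise.

```latex
F_m(x) = F_{m-1}(x) + \nu \sum_{j=1}^{J_m} \gamma_{jm} \, I(x \in R_{jm})

% For a regression tree, x falls into exactly one leaf, say R_{km},
% so every other indicator is 0 and the sum collapses to one term:
F_m(x) = F_{m-1}(x) + \nu \, \gamma_{km}
```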
Taught by
StatQuest with Josh Starmer