Building Makemore - Activations & Gradients, BatchNorm


Andrej Karpathy via YouTube

Now playing (13 of 17): the fully linear case of no non-linearities


Classroom Contents



  1. intro
  2. starter code
  3. fixing the initial loss
  4. fixing the saturated tanh
  5. calculating the init scale: “Kaiming init” (see the first sketch after this list)
  6. batch normalization (see the second sketch after this list)
  7. batch normalization: summary
  8. real example: resnet50 walkthrough
  9. summary of the lecture
  10. just kidding: part2: PyTorch-ifying the code
  11. viz #1: forward pass activations statistics
  12. viz #2: backward pass gradient statistics
  13. the fully linear case of no non-linearities
  14. viz #3: parameter activation and gradient statistics
  15. viz #4: update:data ratio over time (see the third sketch after this list)
  16. bringing back batchnorm, looking at the visualizations
  17. summary of the lecture for real this time
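
For orientation, here is a minimal sketch of what chapters 3 to 5 (“fixing the initial loss”, “fixing the saturated tanh”, “Kaiming init”) deal with: scaling hidden-layer weights by gain / sqrt(fan_in) so tanh does not saturate, and shrinking the output layer so the initial loss is near the uniform baseline. The shapes, seed, and variable names below are assumptions for illustration, not the lecture's actual code.

```python
import math
import torch

# A tanh hidden layer with Kaiming-style init (gain 5/3 for tanh) and a
# deliberately small output layer so the softmax starts close to uniform.
vocab_size, n_embd, block_size, n_hidden = 27, 10, 3, 200
fan_in = n_embd * block_size

g  = torch.Generator().manual_seed(2147483647)
C  = torch.randn((vocab_size, n_embd), generator=g)
W1 = torch.randn((fan_in, n_hidden), generator=g) * (5/3) / fan_in**0.5
b1 = torch.randn(n_hidden, generator=g) * 0.01
W2 = torch.randn((n_hidden, vocab_size), generator=g) * 0.01  # low-confidence output layer
b2 = torch.zeros(vocab_size)

# Forward pass on a dummy batch of character-index contexts.
Xb = torch.randint(0, vocab_size, (32, block_size), generator=g)
Yb = torch.randint(0, vocab_size, (32,), generator=g)
emb = C[Xb].view(-1, fan_in)       # (32, fan_in)
h = torch.tanh(emb @ W1 + b1)      # well-scaled pre-activations, so tanh is not saturated
logits = h @ W2 + b2
loss = torch.nn.functional.cross_entropy(logits, Yb)
print(f"initial loss {loss.item():.3f} vs uniform baseline {math.log(vocab_size):.3f}")
```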
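Chapters 6 and 7 cover batch normalization: normalizing the hidden pre-activations over the batch dimension, applying a learnable gain and bias, and keeping running statistics for inference. The sketch below follows the lecture's naming convention (bngain, bnbias, running mean/std), but the shapes and the function wrapper are assumptions.

```python
import torch

# Manual batchnorm over the batch dimension with running statistics.
n_hidden = 200
bngain = torch.ones((1, n_hidden))
bnbias = torch.zeros((1, n_hidden))
bnmean_running = torch.zeros((1, n_hidden))
bnstd_running  = torch.ones((1, n_hidden))
momentum, eps = 0.001, 1e-5

def batchnorm(hpreact, training=True):
    global bnmean_running, bnstd_running
    if training:
        bnmean = hpreact.mean(0, keepdim=True)
        bnstd  = hpreact.std(0, keepdim=True)
        with torch.no_grad():  # running stats are bookkeeping, not part of the graph
            bnmean_running = (1 - momentum) * bnmean_running + momentum * bnmean
            bnstd_running  = (1 - momentum) * bnstd_running  + momentum * bnstd
    else:  # at inference time, use the accumulated statistics
        bnmean, bnstd = bnmean_running, bnstd_running
    return bngain * (hpreact - bnmean) / (bnstd + eps) + bnbias

hpreact = torch.randn(32, n_hidden) * 3 + 1   # badly scaled pre-activations
h = torch.tanh(batchnorm(hpreact))
print(h.mean().item(), h.std().item())        # roughly zero mean, controlled spread
```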
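Chapter 15's diagnostic, the update:data ratio, tracks log10 of std(lr * grad) / std(param) for every parameter after each step; the lecture's rule of thumb is that healthy training hovers around -3 on this scale. The parameters and gradients below are placeholders rather than the makemore model.

```python
import torch

lr = 0.1
parameters = [torch.randn(30, 200), torch.randn(200, 27)]
ud = []  # one list of per-parameter ratios per step

for step in range(100):
    # ...forward pass and loss.backward() would go here; fake gradients for the sketch
    grads = [torch.randn_like(p) * 0.01 for p in parameters]
    for p, g in zip(parameters, grads):
        p += -lr * g                      # plain SGD update
    ud.append([((lr * g).std() / p.std()).log10().item()
               for p, g in zip(parameters, grads)])

print(ud[-1])  # ideally each entry sits around -3
```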
