The Spelled-Out Intro to Language Modeling - Building Makemore


Andrej Karpathy via YouTube

Currently playing: 13 of 24, creating the bigram dataset for the neural net

Classroom Contents


  1. intro
  2. reading and exploring the dataset
  3. exploring the bigrams in the dataset
  4. counting bigrams in a python dictionary
  5. counting bigrams in a 2D torch tensor ("training" the model) (see sketch 1 after this list)
  6. visualizing the bigram tensor
  7. deleting spurious S and E tokens in favor of a single . token
  8. sampling from the model
  9. efficiency! vectorized normalization of the rows, tensor broadcasting
  10. loss function (the negative log likelihood of the data under our model) (see sketch 2 after this list)
  11. model smoothing with fake counts
  12. PART 2: the neural network approach: intro
  13. creating the bigram dataset for the neural net
  14. feeding integers into neural nets? one-hot encodings
  15. the "neural net": one linear layer of neurons implemented with matrix multiplication (see sketch 3 after this list)
  16. transforming neural net outputs into probabilities: the softmax
  17. summary, preview to next steps, reference to micrograd
  18. vectorized loss
  19. backward and update, in PyTorch
  20. putting everything together
  21. note 1: one-hot encoding really just selects a row of the next Linear layer's weight matrix
  22. note 2: model smoothing as regularization loss
  23. sampling from the neural net
  24. conclusion
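
Sketch 1 (chapters 5, 8, 9): a minimal, stand-alone sketch of the counting approach, not a transcript of the video's notebook. It assumes a `names.txt` file with one lowercase name per line (as used in the lecture) and a 27-symbol vocabulary (26 letters plus a `.` start/end token); it counts bigrams into a 27x27 tensor, normalizes the rows with broadcasting, and samples a few names.

```python
import torch

# Assumes names.txt exists with one lowercase name per line.
words = open('names.txt', 'r').read().splitlines()

# Build the character vocabulary; '.' is the start/end token (index 0).
chars = sorted(set(''.join(words)))
stoi = {s: i + 1 for i, s in enumerate(chars)}
stoi['.'] = 0
itos = {i: s for s, i in stoi.items()}

# Count bigrams into a 27x27 tensor: N[i, j] = how often char j follows char i.
N = torch.zeros((27, 27), dtype=torch.int32)
for w in words:
    chs = ['.'] + list(w) + ['.']
    for ch1, ch2 in zip(chs, chs[1:]):
        N[stoi[ch1], stoi[ch2]] += 1

# Normalize each row into a probability distribution: broadcasting divides
# the 27x27 matrix by a 27x1 column of row sums.
P = N.float()
P /= P.sum(1, keepdim=True)

# Sample a few names by walking the bigram chain until '.' is drawn.
g = torch.Generator().manual_seed(2147483647)
for _ in range(5):
    out, ix = [], 0
    while True:
        ix = torch.multinomial(P[ix], num_samples=1, replacement=True, generator=g).item()
        if ix == 0:
            break
        out.append(itos[ix])
    print(''.join(out))
```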
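
Sketch 2 (chapters 10, 11): a rough sketch of evaluating the model with the average negative log likelihood, reusing `words`, `stoi`, and `P` from sketch 1. Model smoothing (chapter 11) amounts to building `P` from `N + 1` so no bigram has zero probability.

```python
import torch

# Average negative log likelihood of the data under the bigram model.
log_likelihood = 0.0
n = 0
for w in words:
    chs = ['.'] + list(w) + ['.']
    for ch1, ch2 in zip(chs, chs[1:]):
        prob = P[stoi[ch1], stoi[ch2]]
        log_likelihood += torch.log(prob)
        n += 1
nll = -log_likelihood / n  # lower is better; 0 would mean a perfect model
print(nll)

# Smoothing with fake counts: P built from (N + 1) keeps unseen bigrams
# from producing an infinite loss.
# P = (N + 1).float(); P /= P.sum(1, keepdim=True)
```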
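
Sketch 3 (chapters 13 to 22): a compact, assumed implementation of the neural-network approach, again reusing `words` and `stoi` from sketch 1. Names such as `W` and `xenc`, the learning rate, and the regularization strength are illustrative choices, not definitive values. It builds the bigram dataset, one-hot encodes the inputs, pushes them through a single linear layer, applies a softmax, computes the vectorized loss with a small regularization term, and updates the weights with the gradients PyTorch computes.

```python
import torch
import torch.nn.functional as F

# Create the bigram training set: xs holds the current char, ys the next char.
xs, ys = [], []
for w in words:
    chs = ['.'] + list(w) + ['.']
    for ch1, ch2 in zip(chs, chs[1:]):
        xs.append(stoi[ch1])
        ys.append(stoi[ch2])
xs = torch.tensor(xs)
ys = torch.tensor(ys)
num = xs.nelement()

# One "layer" of 27 neurons: one weight row per input character.
g = torch.Generator().manual_seed(2147483647)
W = torch.randn((27, 27), generator=g, requires_grad=True)

# One-hot encode the integer inputs so they can be fed into the linear layer.
xenc = F.one_hot(xs, num_classes=27).float()

for k in range(100):
    # Forward pass: matrix multiply, softmax, vectorized negative log likelihood.
    logits = xenc @ W                               # interpreted as log-counts
    counts = logits.exp()
    probs = counts / counts.sum(1, keepdim=True)    # softmax
    loss = -probs[torch.arange(num), ys].log().mean()
    loss += 0.01 * (W ** 2).mean()                  # smoothing as regularization

    # Backward pass and gradient descent update.
    W.grad = None
    loss.backward()
    W.data += -50 * W.grad

print(loss.item())
```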
