LangChain Data Loaders, Tokenizers, Chunking, and Datasets - Data Prep

James Briggs via YouTube

Why we use chunk overlap

7 of 11

Classroom Contents

  1. Data preparation for LLMs
  2. Downloading the LangChain docs
  3. Using LangChain document loaders
  4. How much text can we fit in LLMs?
  5. Using tiktoken tokenizer to find length of text
  6. Initializing the recursive text splitter in LangChain
  7. Why we use chunk overlap
  8. Chunking with RecursiveCharacterTextSplitter
  9. Creating the dataset
  10. Saving and loading with JSONL file
  11. Data prep is important
