Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Let's Talk About Raw Documents - Extracting Structured Data for ML Pipelines
- 1 Introduction to Crag Wolfe
- 2 Agenda
- 3 Unstructured.io introduction
- 4 The open-source community
- 5 The goal
- 6 Rapidly build custom preprocessing API
- 7 Staging
- 8 Demo
- 9 Developer quick start
- 10 SEC Filing Section Pipeline
- 11 Section 1: Pulling in Raw Documents
- 12 Section 2: Reading the Document
- 13 Section 3: Custom Partitioning Bricks
- 14 Section 4: Cleaning Bricks
- 15 Section 5: Staging Bricks
- 16 Section 6: Define the Pipeline API
- 17 SEC Sentiment Analysis Model notebook
- 18 Stage for transformers
- 19 Training a summarization model with Unstructured + Argilla + Huggingface
- 20 Crag's previous engineering experience
- 21 Deciding what to tackle next
- 22 Editing documents
- 23 Scaling issues
- 24 Moving out of NLP
- 25 Wrap up