Overview
Syllabus
[] Introduction to Crag Wolfe
[] Agenda
[] Unstructured.io introduction
[] Then open-source community
[] The goal
[] Rapidly build custom preprocessing API
[] Staging
[] Demo
[] Developer quick start
[] SEC Filing Section Pipeline
[] Section 1: Pulling in Raw Documents
[] Section 2: Reading the Document
[] Section 3: Custom Partitioning Bricks
[] Section 4: Cleaning Bricks
[] Section 5: Staging Bricks
[] Section 6: Define the Pipeline API
[] SEC Sentiment Analysis Model notebook
[] Stage for transformers
[] Training a summarization model with Unstructured + Argilla + Huggingface
[] Crag's previous engineering experience
[] Deciding what to tackle next
[] Editing documents
[] Scaling issues
[] Moving out of NLP
[] Wrap up
Taught by
MLOps.community