Overview
Syllabus
- What we will be doing!
- scikit-learn Overview
- How do we find training data?
- Download data
- Load our data into Jupyter Notebook
- Cleaning our code a bit (building a data class)
- Using Enums
- Converting text to numerical vectors: bag of words (BOW) explanation
- Training/Test Split (make sure to "pip install scikit-learn"!)
- Bag of words in sklearn (CountVectorizer)
- fit_transform, fit, transform methods
- Model Selection (SVM, Decision Tree, Naive Bayes, Logistic Regression) & Classification
- predict method
- Analysis & Evaluation (using the clf.score method)
- F1 score
- Improving our model (evenly distributing positive & negative examples and loading in more data)
- Let's see our model in action! (qualitative testing)
- TfidfVectorizer
- GridSearchCV to automatically find the best parameters
- Further NLP improvement opportunities
- Saving our model with pickle and reloading it later
- Category Classifier
- Confusion Matrix
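
The bag of words (BOW) step named in the syllabus can be illustrated with a minimal sketch. This is not the course's code and not scikit-learn's CountVectorizer; it is a standard-library approximation, and the helper names `build_vocabulary` and `to_vector` as well as the sample sentences are hypothetical.

```python
from collections import Counter


def build_vocabulary(texts):
    """Map each distinct lowercase word to a column index, in sorted order."""
    words = sorted({word for text in texts for word in text.lower().split()})
    return {word: index for index, word in enumerate(words)}


def to_vector(text, vocabulary):
    """Count occurrences of each vocabulary word; unseen words are ignored."""
    counts = Counter(text.lower().split())
    # Dicts preserve insertion order, so columns follow the sorted vocabulary.
    return [counts.get(word, 0) for word in vocabulary]


# Hypothetical training sentences, made up for illustration only.
train_texts = ["great book", "bad book", "great great read"]

# "Fit": learn the vocabulary from the training texts only.
vocabulary = build_vocabulary(train_texts)

# "Transform": turn each text into a vector of word counts.
train_vectors = [to_vector(text, vocabulary) for text in train_texts]

# A new text reuses the learned vocabulary; "a" and "story" are dropped.
new_vector = to_vector("a great story", vocabulary)
```

Learning the vocabulary from the training texts and then reusing it for new text mirrors the fit and transform split listed in the syllabus: words that never appeared during fitting have no column, so they contribute nothing to the vector.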
Taught by
Keith Galli