Real-World Python Machine Learning Tutorial With Scikit Learn

Overview

Embark on a comprehensive Python machine learning tutorial that guides you through building a sentiment analysis model using scikit-learn. Learn to classify text as positive or negative using Amazon reviews as training data. Explore essential concepts including data preprocessing, text vectorization with Bag of Words and TF-IDF, model selection, and evaluation techniques. Master practical skills such as using CountVectorizer, implementing various classification algorithms, improving model performance, and leveraging GridSearchCV for hyperparameter tuning. Gain hands-on experience with NLP techniques, model saving and loading, and creating a category classifier. By the end of this tutorial, you'll have a solid foundation in applying machine learning to real-world text classification problems using Python and scikit-learn.
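
As a rough illustration of the core idea described above (turning text into Bag of Words vectors and training a classifier on them), here is a minimal sketch. It is not taken from the tutorial: the toy reviews, the labels, and the choice of a linear SVM are assumptions made for this example, standing in for the Amazon review data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Hypothetical toy reviews standing in for the Amazon review data.
train_texts = [
    "I loved this book, great read",
    "Fantastic product, works great",
    "Terrible quality, waste of money",
    "Awful book, I hated it",
]
train_labels = ["POSITIVE", "POSITIVE", "NEGATIVE", "NEGATIVE"]

# Bag of Words: learn the vocabulary and count word occurrences per review.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# A linear SVM is one of several classifiers that could be used here.
clf = SVC(kernel="linear")
clf.fit(X_train, train_labels)

# New text must go through transform (not fit_transform) so that it is
# encoded with the vocabulary learned from the training data.
X_new = vectorizer.transform(["great book", "waste of money"])
predictions = clf.predict(X_new)
print(predictions)
```

With only four training examples the predictions carry little weight; the point is the fit_transform / transform / predict sequence.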

Syllabus

- What we will be doing!
- scikit-learn Overview
- How do we find training data?
- Download data
- Load our data into Jupyter Notebook
- Cleaning our code a bit: building a data class
- Using Enums
- Converting text to numerical vectors: bag of words (BOW) explanation
- Training/Test Split (make sure to "pip install scikit-learn"!)
- Bag of words in sklearn: CountVectorizer
- fit_transform, fit, transform methods
- Model Selection (SVM, Decision Tree, Naive Bayes, Logistic Regression) & Classification
- predict method
- Analysis & Evaluation using the clf.score method
- F1 score
- Improving our model: evenly distributing positive & negative examples and loading in more data
- Let's see our model in action! (qualitative testing)
- TfidfVectorizer
- GridSearchCV to automatically find the best parameters
- Further NLP improvement opportunities
- Saving our model with Pickle and reloading it later
- Category Classifier
- Confusion Matrix
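
The later syllabus items (TfidfVectorizer, GridSearchCV, F1 score, confusion matrix, and saving the model with Pickle) can be sketched together as below. This is an illustrative sketch under stated assumptions, not the tutorial's own code: the toy dataset, the parameter grid, and the use of a Pipeline are choices made for this example. The scores are computed on the training data only to keep the sketch short; a real evaluation would use a held-out test set.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy data, evenly split between the two classes.
texts = [
    "great book, loved it",
    "excellent read, highly recommend",
    "wonderful product, works great",
    "amazing quality, very happy",
    "terrible book, hated it",
    "awful read, do not recommend",
    "horrible product, broke quickly",
    "bad quality, very disappointed",
]
labels = ["POSITIVE"] * 4 + ["NEGATIVE"] * 4

# Chain TF-IDF vectorization and the classifier so the grid search
# re-fits the vectorizer inside each cross-validation fold.
pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("svm", SVC())])

# Assumed parameter grid; the "svm__" prefix targets the pipeline step.
param_grid = {"svm__kernel": ["linear", "rbf"], "svm__C": [1, 4, 8]}
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(texts, labels)

predicted = search.predict(texts)
label_order = ["POSITIVE", "NEGATIVE"]

# Per-class F1 scores and the confusion matrix, in a fixed label order.
f1_per_class = f1_score(labels, predicted, average=None, labels=label_order)
matrix = confusion_matrix(labels, predicted, labels=label_order)

# Round-trip the best model through pickle (in memory here; pickle.dump
# and pickle.load do the same with a file opened in binary mode).
restored = pickle.loads(pickle.dumps(search.best_estimator_))

print(search.best_params_)
print(f1_per_class)
print(matrix)
```

Rows of the confusion matrix correspond to the true labels and columns to the predicted labels, both in the order given by label_order.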