Data Prep for Machine Learning in Python

Corporate Finance Institute via Coursera

Overview

Machine learning models rely on good data to produce meaningful insights, which makes data preparation one of the most critical skills for machine learning. In this course, you’ll learn how to import and clean data, then populate missing values using imputation. You’ll learn how to visualize data with histograms, scatter charts, and box plots to identify trends of interest, then use that analysis to select the most important features. Feature engineering techniques such as one-hot encoding, binning, and scaling will help you transform the structure of your data to produce higher-quality machine learning insights.

This data prep course in Python includes more interactive exercises and challenges than previous BIDA courses. You will also have the opportunity to test your skills on a comprehensive guided Python case study before completing the final exam.

Upon completing this course, you will be able to:

  • Import and clean your data in Python
  • Apply imputation to estimate missing values in the dataset
  • Conduct exploratory data analysis (EDA) to find initial patterns that guide your analysis
  • Select features to focus on the most important variables
  • Apply feature engineering to make datasets machine learning-friendly
  • Select appropriate feature engineering techniques based on the model type

Whether you are a business leader or an aspiring analyst exploring data science, this Data Prep for Machine Learning in Python course will serve as a comprehensive introduction to the subject. You’ll learn the key terminology you need to talk data science with your teams, begin implementing analysis, and understand how data science can help your business.
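To make the techniques named in the overview concrete, here is a minimal sketch of median imputation, one-hot encoding, binning, and min-max scaling. It is an illustration only and not taken from the course materials: the toy records, column names, bin edges, and helper functions are all hypothetical, and it uses only the Python standard library because the listing does not say which libraries the course uses.

```python
from statistics import median

# Hypothetical toy records (not course data); None marks a missing value.
rows = [
    {"age": 25, "income": 40000, "city": "Paris"},
    {"age": None, "income": 52000, "city": "Lyon"},
    {"age": 40, "income": None, "city": "Paris"},
    {"age": 31, "income": 61000, "city": "Nice"},
]


def impute_median(rows, column):
    """Fill missing values in `column` with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new)
    return encoded


def bin_value(value, edges, labels):
    """Return the label of the first bin whose upper edge is >= value.

    `edges` holds the upper edges of every bin except the last, so it has
    one entry fewer than `labels`.
    """
    for edge, label in zip(edges, labels):
        if value <= edge:
            return label
    return labels[-1]


def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Impute first, then engineer features from the completed data.
clean = impute_median(impute_median(rows, "age"), "income")
encoded = one_hot(clean, "city")
age_groups = [bin_value(r["age"], [30, 39], ["young", "mid", "senior"]) for r in clean]
income_scaled = min_max_scale([r["income"] for r in clean])

print(age_groups)
print(income_scaled)
```

Imputation runs before the other steps here because binning and scaling cannot operate on missing values. In practice, the right choice of imputation statistic, bin edges, and scaling method depends on the data and on the model type.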

Syllabus

  • Introduction to Data Prep
    • In this course, we’ll learn how to import and clean data, then populate missing values using imputation. We’ll learn how to visualize data with histograms, scatter charts, and box plots to identify trends of interest, then use that analysis to select the most important features. Feature engineering techniques such as one-hot encoding, binning, and scaling will help us transform the structure of our data to produce higher-quality machine learning insights.
  • Importing & Cleaning Data
  • Exploratory Data Analysis
  • Train-Test Split (Recap)
  • Week 1 Challenge
  • Feature Engineering Part 1 - Encoding & Transformation
  • Feature Engineering Part 2 - Outliers, Binning, and Scaling
  • Feature Selection
  • Course Conclusion
  • Week 2 Challenge
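Two of the syllabus topics, the train-test split and feature selection, can be sketched in a few lines. The example below is a hedged illustration rather than the course's own method: the dataset, the column names (`size`, `noise`, `price`), and the helpers `train_test_split`, `pearson`, and `rank_features` are hypothetical and written with the standard library only. Ranking by absolute Pearson correlation is just one simple selection heuristic, swapped in here for whatever technique the course actually teaches.

```python
import random
from math import sqrt

# Hypothetical dataset: `price` depends linearly on `size`; `noise` is unrelated.
noise_values = [5, 1, 4, 2, 5, 1, 3, 2]
data = [
    {"size": s, "noise": n, "price": 2 * s + 1}
    for s, n in zip(range(1, 9), noise_values)
]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of `rows` and hold out the last portion as the test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[:-n_test], shuffled[-n_test:]


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_features(rows, features, target):
    """Sort feature names by absolute correlation with `target`, strongest first."""
    ys = [r[target] for r in rows]
    scores = {f: abs(pearson([r[f] for r in rows], ys)) for f in features}
    return sorted(features, key=scores.get, reverse=True)


train, test = train_test_split(data, test_fraction=0.25, seed=42)

# Rank features on the training rows only, so the held-out rows
# do not influence which features are kept.
ranked = rank_features(train, ["noise", "size"], "price")

print(len(train), len(test))
print(ranked)
```

Splitting before selecting features keeps information from the test rows out of the selection step, which is the usual reason a train-test split comes early in a data prep workflow.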

Taught by

CFI (Corporate Finance Institute)
