Data Cleaning and Preprocessing in Machine Learning

via CodeSignal

Overview

Explore essential machine learning preparation using the Titanic Dataset. Gain skills in cleaning and preprocessing historical data with Python and Pandas, readying it for ML models and accurate analytics.

Syllabus

  • Lesson 1: Data Preprocessing: The Titanic Dataset Exploration
    • Data Preprocessing with the Titanic Dataset
    • Adjust Filtering to Age and Fare
    • Debug the Titanic Dataset Loading Code
  • Lesson 2: Wrangling Missing Data: Techniques Applied to the Titanic Dataset
    • Handle Missing Data in the Titanic Dataset
    • Update Titanic Dataset Handling Missing Data Code
    • Something is missing
    • Data Cleaning in Titanic Dataset
  • Lesson 3: Outlier Detection and Handling in the Titanic Dataset
    • Should we change the threshold?
    • Detecting Outliers in Titanic Dataset Using Standard Deviation
    • Detecting Outliers in Titanic Dataset Using IQR Method
    • Identifying and Handling Outliers using the IQR Method
  • Lesson 4: Data Transformation with the Titanic Dataset
    • Applying MinMaxScaler to Multiple Features
    • Applying One-Hot Encoding to Categorical Features
    • Applying MinMaxScaler and One-Hot Encoding to Features
  • Lesson 5: Data Preprocessing: Mastering Normalization and Standardization Techniques
    • Normalize the 'age' Column in the Titanic Dataset
    • Standardize the 'fare' Column with NaN values in the Titanic Dataset
    • Normalize and Standardize 'age' and 'fare' Columns with Missing Values in the Titanic Dataset
    • Standardize on your own
  • Lesson 6: Feature Engineering: Enhancing the Titanic Dataset for Survival Predictions
    • Implement Log Transformation on 'fare' Feature
    • Implement Binary Encoding on 'embark_town' Feature
    • Implement Log Transformation on 'fare' Feature
    • Implement One-Hot Encoding on 'class' Feature
  • Lesson 7: Training a Machine Learning Model with the Titanic Dataset
    • Preprocessing Train and Test data with the Titanic Dataset
    • Fix the Titanic Machine Learning Model
    • Evaluating the Titanic Machine Learning Model with a Different Metric
    • Understanding Feature Importance in the Titanic Logistic Regression Model
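
For a rough sense of what the techniques named in the syllabus look like in Pandas, here is a minimal sketch. It is not taken from the course materials: the small data frame is hand-made with invented values, the column names ('age', 'fare', 'class') simply mirror those mentioned in the lesson titles, and the choices of median imputation and a 1.5 × IQR threshold are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Tiny hand-made sample that mimics a few Titanic-style columns.
# The values are invented for illustration only.
df = pd.DataFrame({
    "age": [22.0, 38.0, None, 35.0, 54.0],
    "fare": [7.25, 71.28, 7.92, 53.10, 512.33],
    "class": ["Third", "First", "Third", "First", "First"],
})

# Missing data: fill the missing age with the column median (an assumed choice).
df["age"] = df["age"].fillna(df["age"].median())

# Outliers: flag fares that fall more than 1.5 * IQR outside the quartiles.
q1, q3 = df["fare"].quantile([0.25, 0.75])
iqr = q3 - q1
df["fare_outlier"] = (df["fare"] < q1 - 1.5 * iqr) | (df["fare"] > q3 + 1.5 * iqr)

# Normalization: min-max scale age into the range [0, 1].
age_min, age_max = df["age"].min(), df["age"].max()
df["age_norm"] = (df["age"] - age_min) / (age_max - age_min)

# Standardization: z-score of fare (mean 0, unit sample standard deviation).
df["fare_std"] = (df["fare"] - df["fare"].mean()) / df["fare"].std()

# Log transformation: log1p compresses the long right tail and tolerates zero fares.
df["fare_log"] = np.log1p(df["fare"])

# One-hot encoding: one indicator column per value of 'class'.
df = pd.get_dummies(df, columns=["class"])

print(df)
```

The arithmetic versions of scaling are written out by hand here so each step is visible; scikit-learn's MinMaxScaler and StandardScaler, which the lesson titles mention, cover the same ground with a fit/transform interface (note that StandardScaler uses the population standard deviation, so its output differs slightly from the z-score above).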