This course teaches you to understand and apply data cleaning and preprocessing techniques. It covers handling missing values, removing duplicates and outliers, normalization, categorical encoding, and binning, and aims to equip you with practical skills for preparing data for analysis or machine learning tasks.
Syllabus
- Lesson 1: Understanding and Handling Missing Values in Datasets with Python
  - Applying Missing Values Handling to Clients' Personal Information Dataset
  - Replacing Mean with Median
  - Filling the Missing Values in the Clients Dataset
  - Addressing Missing Values in Client Data
- Lesson 2: Handling Duplicates and Outliers in Datasets
  - Identifying Duplicates and Outliers in Height Dataset
  - Cleaning Duplicates from the Dataset
  - Cleaning Up School Data: Handling Duplicates and Outliers
  - Removing Duplicates and Handling Outliers in Student Data
  - Clean School Data: Handling Duplicates and Outliers
- Lesson 3: Understanding and Implementing Data Normalization Techniques in Python
  - Exploring Normalization of Planet Orbit Speeds
  - Normalization on Planet Diameter
  - Planetary Orbital Speed Normalization Fix
  - Applying Min-Max Normalization to Planetary Distances
  - Normalization of Planetary Orbits
- Lesson 4: Categorical Data Encoding Techniques in Python: An Introduction to Label and One-Hot Encoding
  - Encoding Clothing Categories and Sizes
  - Changing One-Hot Encoding to Label Encoding for Clothing Type Data
  - Fix the Clothing Store Inventory Management System
  - Mapping Clothing Sizes to Numerical Values
  - Applying Categorical Data Encoding in Clothing Store Inventory Management
- Lesson 5: Data Binning Techniques: An Introduction and Implementation with Python and Pandas
  - Binning Student Ages into Grade Levels
  - Altering Labels in Data Binning
  - Categorizing Student Ages with Data Binning
  - Implementing Binning Technique in Data Preparation
  - Categorizing Ages into Groups
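
For a rough sense of what the lessons above involve, here is a minimal pandas sketch with one step per lesson. It is not taken from the course materials: the toy data, column names, size mapping, and bin edges are all assumptions made for illustration, and outlier handling from Lesson 2 is left out to keep it short.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [6.0, 7.0, None, 9.0, 9.0],
    "size": ["S", "M", "L", "M", "M"],
    "height_cm": [110.0, 120.0, 130.0, 140.0, 140.0],
})

# Lesson 1 (missing values): fill gaps with the column median.
df["age"] = df["age"].fillna(df["age"].median())

# Lesson 2 (duplicates): drop rows that are exact copies of an earlier row.
df = df.drop_duplicates().reset_index(drop=True)

# Lesson 3 (normalization): min-max scaling to the range [0, 1].
h = df["height_cm"]
df["height_norm"] = (h - h.min()) / (h.max() - h.min())

# Lesson 4 (encoding): label encoding through an explicit, assumed mapping,
# and one-hot encoding with pandas.get_dummies.
df["size_label"] = df["size"].map({"S": 0, "M": 1, "L": 2})
one_hot = pd.get_dummies(df["size"], prefix="size")

# Lesson 5 (binning): group ages into labeled intervals (0, 7] and (7, 10].
# The bin edges and labels are assumptions.
df["age_group"] = pd.cut(df["age"], bins=[0, 7, 10], labels=["younger", "older"])

print(df)
print(one_hot)
```

The explicit mapping for sizes keeps their natural order (S < M < L), which an automatic label encoder would not guarantee; one-hot encoding suits categories that have no such order.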