Learn to clean and preprocess the diamonds dataset: handle missing values, convert categorical features to ordered types, visualize the price distribution, detect and handle outliers, and standardize numerical features. These data preparation and visualization skills provide a foundation for later analysis and modeling tasks.
Syllabus
- Lesson 1: Basic Data Cleaning with the Diamonds Dataset
  - Change Missing Value Column
  - Common Mistake: Fixing Null Values
  - Handling Missing Values Efficiently
  - Handling Missing Values in Diamonds
  - Write from Scratch: Handle Missing Values
- Lesson 2: Converting Categorical Data to Ordered Types in Python
  - Change Clarity Category Order
  - Fix Conversion of Categorical Data
  - Convert Data Types for Diamonds
  - Categorical Types Conversion Task
  - Convert Categorical Data to Ordered
- Lesson 3: Visualizing Diamond Price Distribution
  - Adjust the Histogram Parameters
  - Fix Histogram Visualization Errors
  - Customize the Histogram
  - Complete the Histogram Visualization
  - Visualize Diamond Price Distribution
- Lesson 4: Detecting and Handling Outliers in the Diamonds Dataset
  - Handle Outliers in Carat Column
  - Fix Issues in Outlier Detection
  - Detect and Remove Outliers
  - Flagging Outliers in the Diamonds Dataset
  - Outliers Handling from Scratch
- Lesson 5: Standardizing Numerical Features in the Diamonds Dataset
  - Selective Standardization Exercise
  - Fix Bugs in Standardization Code
  - Standardize Numerical Features Practice
  - Standardize Specific Features with MinMax
  - Standardizing Numerical Features from Scratch
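
To give a feel for what the lessons above involve, here is a minimal pandas sketch of four of the steps: filling a missing value, converting a column to an ordered categorical, flagging outliers, and standardizing a feature. It is an illustration under stated assumptions, not the course's own code. The tiny DataFrame is hand-made stand-in data rather than the real diamonds dataset, the median fill and the 1.5 × IQR rule are common choices assumed here, and z-score scaling is used where the course may use a different scaler (one exercise mentions MinMax). Plotting from Lesson 3 is left out to keep the example free of extra dependencies.

```python
import pandas as pd

# Hand-made stand-in rows (illustrative values, not the real diamonds data).
df = pd.DataFrame({
    "carat": [0.23, 0.31, None, 0.70, 0.30, 5.01],
    "cut": ["Ideal", "Premium", "Good", "Fair", "Very Good", "Ideal"],
    "price": [326.0, 335.0, 400.0, 2757.0, 339.0, 18018.0],
})

# Lesson 1 idea: fill the missing carat with the column median (assumed strategy).
df["carat"] = df["carat"].fillna(df["carat"].median())

# Lesson 2 idea: make "cut" an ordered categorical, worst to best,
# so that comparisons and sorting respect quality rather than the alphabet.
cut_order = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
df["cut"] = pd.Categorical(df["cut"], categories=cut_order, ordered=True)
premium_or_better = df[df["cut"] >= "Premium"]

# Lesson 4 idea: flag carat values outside 1.5 * IQR of the quartiles.
q1 = df["carat"].quantile(0.25)
q3 = df["carat"].quantile(0.75)
iqr = q3 - q1
df["carat_outlier"] = (df["carat"] < q1 - 1.5 * iqr) | (df["carat"] > q3 + 1.5 * iqr)

# Lesson 5 idea: z-score standardization of price (population std, ddof=0).
df["price_z"] = (df["price"] - df["price"].mean()) / df["price"].std(ddof=0)

print(df)
print(len(premium_or_better), "rows with cut of Premium or better")
```

Because `cut` is ordered, an expression such as `df["cut"] >= "Premium"` is valid; on a plain string column the same comparison would fall back to alphabetical order and give misleading results.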