Solving Real World Data Science Tasks With Python Pandas

Keith Galli via YouTube

Overview

Dive into a comprehensive Python Pandas tutorial that tackles real-world data science tasks using sales data from an electronics store. Learn to clean, explore, and analyze data through practical examples, including merging CSV files, adding columns, and answering business questions. Master essential Pandas and Matplotlib methods like concatenation, groupby operations, and data visualization. Gain hands-on experience in data manipulation, statistical analysis, and creating insightful graphs to extract valuable business insights from raw sales data.
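
As a rough illustration of the workflow described above (merging CSV files, cleaning rows, adding columns, and grouping by month), here is a minimal sketch. It is not taken from the course: the in-memory CSV text, the column names (`Order ID`, `Quantity Ordered`, `Price Each`, `Order Date`), and the sample values are all assumptions made for this example.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two monthly sales files (assumed column names).
jan_csv = io.StringIO(
    "Order ID,Product,Quantity Ordered,Price Each,Order Date\n"
    "1001,USB-C Cable,2,11.95,01/05/19 10:15\n"
    "1002,Monitor,1,149.99,01/20/19 18:40\n"
)
feb_csv = io.StringIO(
    "Order ID,Product,Quantity Ordered,Price Each,Order Date\n"
    "1003,Headphones,1,99.99,02/02/19 09:05\n"
    "Order ID,Product,Quantity Ordered,Price Each,Order Date\n"  # stray header row
    ",,,,\n"  # empty row that becomes all-NaN
)

# Merge the files into one DataFrame; read everything as text first.
frames = [pd.read_csv(f, dtype=str) for f in (jan_csv, feb_csv)]
all_data = pd.concat(frames, ignore_index=True)

# Drop rows that are entirely NaN, then remove repeated header rows.
all_data = all_data.dropna(how="all")
all_data = all_data[all_data["Order Date"].str[:2] != "Or"].copy()

# Add a Month column by slicing the date string, and convert numeric columns.
all_data["Month"] = all_data["Order Date"].str[:2].astype(int)
all_data["Quantity Ordered"] = pd.to_numeric(all_data["Quantity Ordered"])
all_data["Price Each"] = pd.to_numeric(all_data["Price Each"])

# Add a Sales column and total it per month.
all_data["Sales"] = all_data["Quantity Ordered"] * all_data["Price Each"]
monthly_sales = all_data.groupby("Month")["Sales"].sum()
print(monthly_sales)
```

With real files, the `io.StringIO` objects would be replaced by file paths, and `monthly_sales` could be passed to a Matplotlib bar chart.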

Syllabus

- Intro
- Downloading the Data
- Getting started with the code (Jupyter Notebook)
Task #1: Merging 12 CSVs into a single DataFrame
- Read single CSV file
- List all files in a directory
- Concatenating files
- Reading in the updated DataFrame
Task #2: Add a Month column
- Parse string in Pandas cell (.str)
- Drop NaN values from df
- Remove rows based on condition
Task #3: Add a sales column
- Another way to convert a column to numeric (ints & floats)
Question #1: What was the best month for sales?
- Visualizing our results with bar chart in matplotlib
Question #2: What city sold the most product?
- Add a city column
- Using the .apply method (super useful!!)
- Why do we use the lambda x?
- Dropping a column
- Answering the question using groupby
- Plotting our results
Question #3: What time should we display advertisements to maximize the likelihood of purchases?
- Using to_datetime method
- Creating hour & minute columns
- Matplotlib line graph to plot our results
- Interpreting our results
Question #4: What products are most often sold together?
- Finding duplicate values in our DataFrame
- Use transform method to join values from two rows into a single row
- Dropping rows with duplicate values
- Counting pairs of products (itertools, collections)
Question #5: What product sold the most? Why do you think it did?
- Graphing data
- Overlaying a second Y-axis on existing chart
- Interpreting our results
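
Question #4 in the syllabus combines several steps (finding duplicated order IDs, joining products with `transform`, dropping duplicate rows, and counting pairs). The sketch below shows one way those steps might fit together. It is an illustration only, not the instructor's code: the DataFrame contents and the column names `Order ID`, `Product`, and `Grouped` are assumptions.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Hypothetical order lines: one row per product in an order.
orders = pd.DataFrame(
    {
        "Order ID": ["1", "1", "2", "3", "3", "3"],
        "Product": [
            "Phone",
            "Charging Cable",
            "Monitor",
            "Phone",
            "Charging Cable",
            "Headphones",
        ],
    }
)

# Keep only orders that appear on more than one row.
multi = orders[orders["Order ID"].duplicated(keep=False)].copy()

# Join every product of an order into one comma-separated string per row.
multi["Grouped"] = multi.groupby("Order ID")["Product"].transform(
    lambda products: ",".join(products)
)

# Each order now repeats the same string, so keep one row per order.
multi = multi[["Order ID", "Grouped"]].drop_duplicates()

# Count every unordered pair of products within each order.
pair_counts = Counter()
for grouped in multi["Grouped"]:
    products = sorted(grouped.split(","))  # sort so pairs are order-independent
    pair_counts.update(combinations(products, 2))

print(pair_counts.most_common(3))
```

Sorting the products before calling `combinations` is a design choice made here so that the same two products always produce the same tuple, regardless of row order.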

Taught by

Keith Galli
