Generating Mock Data with Python - NumPy, Pandas, and Datetime Libraries

Keith Galli via YouTube

Overview

Learn how to generate realistic mock data for sales analysis with Python in this tutorial video. Using the NumPy, Pandas, and Datetime libraries, you build a sales dataset from scratch: simulating product purchases with weighted probabilities, drawing from normal and geometric distributions, generating realistic timestamps, and creating random addresses. The video also covers speeding up the generation code, making certain items more likely to be sold together, and producing 12 months of data in separate CSV files, giving you practical experience with data manipulation and synthetic dataset creation for data science projects.
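
As a rough illustration of the weighted product selection and geometric quantities mentioned above, here is a minimal sketch. It is not the code from the video: the product names, prices, popularity weights, column names, and the geometric parameter `p=0.8` are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical catalogue: product name -> (price, relative popularity weight).
PRODUCTS = {
    "USB-C Cable": (11.95, 6),
    "Wired Headphones": (11.99, 5),
    "Monitor": (149.99, 2),
    "Laptop": (999.99, 1),
}


def make_orders(n_orders, seed=0):
    """Build a small mock sales DataFrame with weighted product choice."""
    rng = np.random.default_rng(seed)
    names = list(PRODUCTS)
    prices = np.array([PRODUCTS[name][0] for name in names])
    weights = np.array([PRODUCTS[name][1] for name in names], dtype=float)

    # Turn the weights into probabilities so popular items are drawn more often.
    probs = weights / weights.sum()
    idx = rng.choice(len(names), size=n_orders, p=probs)

    # A geometric distribution starts at 1 and decays quickly, so most
    # orders contain a single unit (p=0.8 is an assumed value).
    quantity = rng.geometric(p=0.8, size=n_orders)

    return pd.DataFrame({
        "Order ID": np.arange(1, n_orders + 1),
        "Product": [names[i] for i in idx],
        "Quantity Ordered": quantity,
        "Price Each": prices[idx],
    })


orders = make_orders(1000)
print(orders.head())
```

Passing a fixed `seed` keeps the output reproducible, which makes it easier to compare runs while tuning the weights.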

Syllabus

- Intro & Background Info
- What we're creating in this video!
- Start writing code: generating a simple DataFrame & CSV
- Task: Making our data more realistic, selecting some products with higher probability than others
- Task: Generate 12 months' worth of data in 12 CSVs (calendar library, f-strings)
- Make some months have more purchases than others
- Normal distributions in NumPy
- Improving the speed of our code & making testing easier
- Task: Generate random addresses for our data
- Task: Generate order times for purchases (datetime library overview)
- Using timedelta objects to add & subtract time from dates
- Generate a realistic quantity ordered for each product using NumPy's geometric distribution
- Make certain items more likely to be sold together & clean up the code a bit
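
Several of the syllabus steps (monthly CSVs named with f-strings, the calendar library, normally distributed values, and timedelta arithmetic) can be sketched together as below. This is an assumption-laden sketch, not the video's implementation: the year 2019, the December boost, the 7 p.m. peak for order times, the file name pattern, and the helper names `random_order_times` and `write_monthly_csvs` are all hypothetical.

```python
import calendar
import datetime as dt
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def random_order_times(year, month, n, rng):
    """Return n datetimes inside the given month, clustered around 19:00."""
    days_in_month = calendar.monthrange(year, month)[1]
    start = dt.datetime(year, month, 1)
    day_offsets = rng.integers(0, days_in_month, size=n)
    # Minutes after midnight: normal around 19:00 with a 3 hour standard
    # deviation (assumed values), clipped so the time stays within the day.
    minutes = np.clip(rng.normal(loc=19 * 60, scale=180, size=n), 0, 24 * 60 - 1)
    return [
        start + dt.timedelta(days=int(d), minutes=int(m))
        for d, m in zip(day_offsets, minutes)
    ]


def write_monthly_csvs(out_dir, year=2019, base_orders=100, seed=0):
    """Write one CSV of order timestamps per month and return the paths."""
    rng = np.random.default_rng(seed)
    out_dir = Path(out_dir)
    paths = []
    for month in range(1, 13):
        # Assumed seasonality: December sells about 50% more than other months.
        expected = base_orders * (1.5 if month == 12 else 1.0)
        n = max(1, int(rng.normal(loc=expected, scale=expected * 0.1)))
        times = random_order_times(year, month, n, rng)
        df = pd.DataFrame(
            {"Order Date": [t.strftime("%m/%d/%y %H:%M") for t in times]}
        )
        path = out_dir / f"Sales_{calendar.month_name[month]}_{year}.csv"
        df.to_csv(path, index=False)
        paths.append(path)
    return paths


# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    file_names = [p.name for p in write_monthly_csvs(tmp)]

february_sample = random_order_times(2019, 2, 50, np.random.default_rng(1))
print(file_names[:3])
```

Drawing the monthly order count from a normal distribution around an expected value gives month-to-month variation without hard-coding each total.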

Taught by

Keith Galli
