Learn how to develop data pipelines when real data is unavailable in this 31-minute conference talk from PyCon South Africa. Discover practical techniques for generating and using synthetic data with Python, including statistical methods and packages such as Faker and SDV, to create realistic test data for customer profiles, transactions, and time series. Explore how to use Flyway to load synthetic data into Postgres databases and manage repeatable deployments. Gain insight into the best practices, benefits, and potential challenges of synthetic data testing through code examples and live demonstrations. Designed for intermediate Python developers, the talk covers the skills needed to build and validate robust data pipelines without access to actual production data.
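As a rough illustration of the kind of workflow described above, the sketch below generates a few synthetic customers and transactions and renders them as SQL INSERT statements. It is not taken from the talk: it uses only Python's standard library as a stand-in for Faker and SDV, and the name lists, table names, column names, and distribution parameters are all assumptions chosen for the example.

```python
import random
from datetime import date, timedelta

# Hypothetical name pools; a package such as Faker would normally supply these.
FIRST_NAMES = ["Thandi", "Sipho", "Anna", "Johan", "Lerato"]
LAST_NAMES = ["Nkosi", "Botha", "Naidoo", "Smith", "Dlamini"]


def make_customers(n, rng):
    """Return n synthetic customer rows with sequential ids."""
    customers = []
    for customer_id in range(1, n + 1):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        customers.append({
            "id": customer_id,
            "name": f"{first} {last}",
            "email": f"{first}.{last}{customer_id}@example.com".lower(),
        })
    return customers


def make_transactions(customers, per_customer, rng, start=date(2023, 1, 1)):
    """Return transactions with log-normally distributed amounts (assumed shape)."""
    transactions = []
    for customer in customers:
        for _ in range(per_customer):
            transactions.append({
                "customer_id": customer["id"],
                "day": start + timedelta(days=rng.randrange(365)),
                "amount": round(rng.lognormvariate(4.0, 0.75), 2),
            })
    return transactions


def to_insert_sql(table, rows):
    """Render rows as INSERT statements; numbers stay bare, everything else is quoted."""
    statements = []
    for row in rows:
        columns = ", ".join(row)
        values = ", ".join(
            str(v) if isinstance(v, (int, float))
            else "'" + str(v).replace("'", "''") + "'"
            for v in row.values()
        )
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return "\n".join(statements)


# A fixed seed keeps the generated data identical from run to run.
rng = random.Random(42)
customers = make_customers(5, rng)
transactions = make_transactions(customers, 3, rng)
seed_sql = (
    to_insert_sql("customers", customers)
    + "\n"
    + to_insert_sql("transactions", transactions)
)
```

The resulting `seed_sql` string could be saved in Flyway's configured migrations location under a name that follows Flyway's convention, for example a versioned file such as `V2__seed_synthetic_data.sql` or a repeatable one such as `R__seed_synthetic_data.sql` (both file names are hypothetical). Seeding the generator is one simple way to keep such a migration reproducible across environments.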
How to Build a Data Pipeline Using Synthetic Data Generation and Testing with Python
PyCon South Africa via YouTube
Overview
Syllabus
Time: Oct 05 Thu:
Duration: 31 minutes
Taught by
PyCon South Africa