Overview
Discover how to combat pipeline debt in data systems using the Great Expectations open-source library in this 39-minute talk from the Toronto Machine Learning Series. Learn about the challenges data teams face with untested assumptions and outdated documentation in complex data pipelines. Explore patterns for implementing effective data validation and documentation practices to improve productivity, build trust in data, and boost team morale. Gain insights into leveraging Expectations, a concept for asserting data properties, to validate data, generate human-readable documentation, and profile sample data. Understand how Great Expectations provides flexible tools to address common issues in data pipeline management and maintenance.
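To make the idea of an Expectation more concrete, here is a minimal sketch in plain Python. It does not use the Great Expectations library or its API. The `Expectation` class, its `validate` and `document` methods, and the sample rows are all hypothetical names invented for illustration. The sketch only shows how a single declared assertion about data might serve both validation and human-readable documentation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Expectation:
    """Hypothetical stand-in for the concept: one assertion about one column."""
    column: str
    description: str              # human-readable phrase, reused for documentation
    check: Callable[[Any], bool]  # returns True when a single value is acceptable

    def validate(self, rows: List[Dict[str, Any]]) -> Dict[str, Any]:
        # Collect every value in the column that fails the check.
        values = [row.get(self.column) for row in rows]
        unexpected = [v for v in values if not self.check(v)]
        return {"success": not unexpected, "unexpected_values": unexpected}

    def document(self) -> str:
        # The same declaration doubles as a line of documentation.
        return f"Values in column '{self.column}' must {self.description}."


# Illustrative expectations and sample data (assumed, not from the talk).
not_null = Expectation("user_id", "not be null", lambda v: v is not None)
in_range = Expectation(
    "age", "be between 0 and 120", lambda v: v is not None and 0 <= v <= 120
)

rows = [
    {"user_id": 1, "age": 34},
    {"user_id": None, "age": 29},
    {"user_id": 3, "age": 150},
]

for expectation in (not_null, in_range):
    print(expectation.document(), expectation.validate(rows))
```

Because the description and the check live in one object, the documentation cannot drift away from what is actually tested, which is the kind of outdated-documentation problem the talk addresses.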
Syllabus
Abe Gong - Fighting Pipeline Debt With Great Expectations
Taught by
Toronto Machine Learning Series (TMLS)