Declarative ETL Pipelines with Delta Live Tables - Modern Software Engineering for Data Analysts and Engineers

SQLBits via YouTube

Classroom Contents

  1. Intro
  2. What is a Streaming Live Table? - Based on Spark™ Structured Streaming
  3. Development vs Production - Fast iteration or enterprise-grade reliability
  4. Choosing pipeline boundaries - Break up pipelines at natural external divisions
  5. Pitfall: hard-code sources & destinations - Problem: hard-coding the source & destination makes it impossible to test changes outside of production, breaking CI/CD
  6. Ensure correctness with Expectations - Expectations are tests that ensure data quality in production
  7. Expectations using the power of SQL - Use SQL aggregates and joins to perform complex validations
  8. Using Python - Write advanced DataFrame code and UDFs
  9. Installing libraries with pip - pip is a package installer for Python
  10. Best Practice: Integrate using the event log - Use the information in the event log with your existing operational tools
  11. DLT Automates Failure Recovery - Transient issues are handled by built-in retry logic
  12. Modularize your code with configuration - Avoid hard-coding paths, topic names, and other constants in your code
  13. Workflow Orchestration For Triggered DLT Pipelines
  14. Use Delta for infinite retention - Delta provides cheap, elastic and governable storage for transient sources
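
Chapters 5 and 12 both advise against hard-coding sources, destinations, and other constants. The idea can be sketched in plain Python as below. This is only an illustration under stated assumptions: it does not use the Delta Live Tables API, and the `PipelineSettings` class, the `load_settings` helper, the `pipeline.*` key names, and the example paths are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical configuration keys, chosen only for this illustration.
REQUIRED_KEYS = ("pipeline.source_path", "pipeline.target_schema")


@dataclass(frozen=True)
class PipelineSettings:
    """Hypothetical container for values that would otherwise be hard-coded."""
    source_path: str
    target_schema: str


def load_settings(conf: Mapping[str, str]) -> PipelineSettings:
    """Build settings from key/value configuration instead of literals in code."""
    missing = [key for key in REQUIRED_KEYS if key not in conf]
    if missing:
        # Fail fast rather than silently falling back to a production location.
        raise KeyError(f"missing configuration keys: {missing}")
    return PipelineSettings(
        source_path=conf["pipeline.source_path"],
        target_schema=conf["pipeline.target_schema"],
    )


# The same code targets different environments by swapping the configuration.
dev = load_settings({
    "pipeline.source_path": "/tmp/dev/events",
    "pipeline.target_schema": "dev_events",
})
prod = load_settings({
    "pipeline.source_path": "/data/prod/events",
    "pipeline.target_schema": "prod_events",
})
```

Because the pipeline logic only ever sees a `PipelineSettings` value, a change can be exercised against the development configuration before it reaches production, which is the testing problem chapter 5 describes.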
