Learn how to build and test data engineering pipelines in Python using PySpark and Apache Airflow.
In any data-driven company, you will undoubtedly cross paths with data engineers. Among other things, they facilitate your work by making data readily available to everyone within the organization, and possibly by bringing machine learning models into production. One way to speed up this process is to understand what it means to bring processes into production and what characterizes high-grade code. In this course, we’ll look at various data pipelines that data engineers build, and at how some of the tools they use can help you get your models into production or run repetitive tasks consistently and efficiently.
In this course, we illustrate common elements of data engineering pipelines. In Chapter 1, you will learn how to ingest data. Chapter 2 will go one step further with cleaning and transforming data. In Chapter 3, you will learn how to safely deploy code. Finally, in Chapter 4 you will schedule complex dependencies between applications.
Building Data Engineering Pipelines covers new technologies and material, so we recommend that you have a strong understanding of the prerequisites to get the most out of this course.