Learn the key facets of data engineering, from its place in the data science realm to the specific tasks and skills every data engineer should have.
Syllabus
Introduction
- What is data engineering?
- Introduction to data engineering
- Data engineer vs. data scientist
- Essential tools for data engineering
- Intro to databases and their types
- Understanding database schema
- Distributed computing
- MapReduce and Hadoop
- Hive
- Spark
- Airflow
- Sources of data extraction
- Data extraction from a PostgreSQL database
- Challenge: Data extraction
- Solution: Data extraction
- Transforming data
- Challenge: Transforming data
- Solution: Transforming data
- Loading data into a DB
- Challenge: Loading data
- Solution: Loading data
- Scheduling ETL pipeline using Airflow
- Next steps
Taught by
Harshit Tyagi