Cloud Data Engineering

Pragmatic AI Labs via edX

Overview

  • Discover the principles of data engineering and its role in building scalable, cloud-based systems.
  • Explore the challenges of the end of Moore's Law and learn to develop distributed systems.
  • Gain hands-on experience with big data technologies and best practices for implementing solutions.
  • Learn to build serverless data engineering pipelines and apply effective data governance strategies.
  • Develop expertise in key data engineering tasks, including ETL, cloud databases, and cloud storage.

Syllabus

1. Module 1: Methodologies in Data Engineering (12 hours)

   - Videos:
     - Introduction and Course Overview (4 minutes)
     - The End of Moore's Law and Concurrency in Python (7 minutes)
     - Using CUDA, Numba, and ASICs (13 minutes)
     - Exploring Colab Pro and Colab AI (9 minutes)
     - Distributed Systems Concepts (9 minutes)
     - Debugging Python Code (25 minutes)
     - Exploring Google BigQuery (12 minutes)
     - Introduction to Big Data and Data Lakes (4 minutes)
     - Big Data Processing (3 minutes)
     - AWS Data Engineering Design Principles (20 minutes)
     - Processing Big Data with AWS (25 minutes)
     - Transform Data with Databricks Spark SQL (5 minutes)
   - Readings (22 readings, 220 minutes)
   - Quizzes (5 quizzes, 150 minutes)
   - Discussion Prompts (4 discussion prompts, 40 minutes)
   - Ungraded Labs (3 ungraded labs, 180 minutes)

2. Module 2: Principles of Data Engineering (11 hours)

   - Videos:
     - Introduction to Data Engineering (1 minute)
     - Data Driven Organizations (19 minutes)
     - Batch vs. Streaming vs. Events (1 minute)
     - Ingesting by Batch or Stream (20 minutes)
     - Building CLI Tools with Click (33 minutes)
     - Building Containerized Command-line Tools (12 minutes)
     - Rust and Python (5 minutes)
     - Python Calculator CLI and Caesar Cipher CLI (7 minutes)
     - Advanced Testing with Amazon CodeGuru and AWS CodeBuild (44 minutes)
     - Mapping Functions to CLI (58 minutes)
     - AWS CodeWhisperer CLI and SDK (7 minutes)
   - Readings (10 readings, 100 minutes)
   - Quizzes (4 quizzes, 120 minutes)
   - Discussion Prompts (3 discussion prompts, 30 minutes)
   - Ungraded Labs (4 ungraded labs, 240 minutes)

3. Module 3: Building Data Engineering Pipelines (6 hours)

   - Videos:
     - Introduction to Serverless Data Engineering (0 minutes)
     - Automating Pipelines (21 minutes)
     - Serverless Concepts (17 minutes)
     - AWS Lambda (42 minutes)
     - Build a Serverless Data Pipeline (37 minutes)
     - Serverless Cookbook with AWS and GCP (49 minutes)
     - Introduction to Data Governance (0 minutes)
     - The Principle of Least Privilege (1 minute)
     - Cloud Security with IAM on AWS (30 minutes)
     - Encrypt at Rest and Transit (3 minutes)
   - Readings (7 readings, 70 minutes)
   - Quizzes (3 quizzes, 90 minutes)
   - Discussion Prompts (2 discussion prompts, 20 minutes)

4. Module 4: Applying Key Data Engineering Tasks (10 hours)

   - Videos:
     - Introduction to Extract, Transform, Load (ETL) (0 minutes)
     - Ingesting and Preparing Data on AWS (19 minutes)
     - Using Amazon Athena with AWS Glue (22 minutes)
     - Real-World Problems in ETL (13 minutes)
     - Introduction to Cloud Databases (6 minutes)
     - MySQL Overview and Usage (28 minutes)
     - BigQuery with Prompt Engineering and Colab Pipeline (14 minutes)
     - Introduction to Cloud Storage (0 minutes)
     - Cloud Storage Deep Dive (13 minutes)
     - Using Amazon S3 (4 minutes)
   - Readings (10 readings, 100 minutes)
   - Quizzes (4 quizzes, 120 minutes)
   - Discussion Prompts (3 discussion prompts, 30 minutes)
   - Ungraded Labs (4 ungraded labs, 240 minutes)

Taught by

Noah Gift
