What you'll learn:
- Databricks Clusters, Notebooks, data storage
- Databricks Lakehouse Platform (architecture, descriptions, benefits)
- Delta Lake
- ELT with Spark SQL and Python
- Relational entities (databases, tables, views)
- Accessing Data from Azure Data Lake Storage (ADLS)
- Structured Streaming, Auto Loader
- Delta Live Tables, Multi-hop Architecture
- Databricks Jobs
- Databricks Dashboards
- Data Governance
Welcome to our comprehensive course on the Databricks Certified Data Engineer Associate certification. This course is designed to help you master the skills required to become a certified Databricks data engineer associate.
Databricks is a cloud-based data analytics platform that offers a unified approach to data processing, machine learning, and analytics. With the growing demand for data engineers, Databricks has become one of the most sought-after skills in the industry.
In this course, you'll learn the core concepts of Databricks, including the Databricks Lakehouse Platform, ELT with Spark SQL and Python, incremental data processing, production pipelines, and data governance.
This course is designed by industry experts with years of experience in Databricks and data engineering. It combines theoretical concepts with hands-on labs so you can apply what you learn.
Upon completion of the course, you'll be able to take the Databricks Certified Data Engineer Associate exam with confidence and succeed in your career as a data engineer.
At the end of this course, you should be able to:
- Understand how to use the Databricks Lakehouse Platform and its tools, and the benefits of doing so, including:
  - Data Lakehouse (architecture, descriptions, benefits)
  - Data Science and Engineering workspace (clusters, notebooks, data storage)
  - Delta Lake (general concepts, table management, manipulation, optimizations)
- Build ETL pipelines using Apache Spark SQL and Python, including:
  - Relational entities (databases, tables, views)
  - ELT (creating tables, writing data to tables, cleaning data, combining and reshaping tables, SQL UDFs)
  - Python (facilitating Spark SQL with string manipulation and control flow, passing data between PySpark and Spark SQL)
- Incrementally process data, including:
  - Structured Streaming (general concepts, triggers, watermarks)
  - Auto Loader (streaming reads)
  - Multi-hop Architecture (bronze-silver-gold, streaming applications)
  - Delta Live Tables (benefits and features)
- Build production pipelines for data engineering applications and Databricks SQL queries and dashboards, including:
  - Jobs (scheduling, task orchestration, UI)
  - Dashboards (endpoints, scheduling, alerting, refreshing)
- Understand and follow best security practices, including:
  - Unity Catalog (benefits and features)
  - Entity Permissions (team-based permissions, user-based permissions)
Enroll now and take the first step towards becoming a certified Databricks data engineer associate.