Overview
This is a technical, hands-on course that teaches learners how to build modern, continuous data pipelines with Snowflake. It focuses on the most practical Snowflake concepts and tools, so learners get up and running with building data pipelines quickly.
Learners start with the "Ingestion-Transformation-Delivery" framework for modern data engineering, then dive deeper into each component of the framework by learning how to:
- Ingest data into Snowflake at scale using a variety of powerful techniques
- Perform data transformations with SQL or Snowpark
- Extend data transformations with user-defined functions, stored procedures, streams, and Snowflake Dynamic Tables
- Deliver valuable data products through Snowflake Marketplace, Streamlit in Snowflake, and Snowflake Native Applications
- Orchestrate pipelines using tasks and DAGs
Throughout the course, learners follow along with the instructor using a combination of Snowflake, Visual Studio Code, GitHub, and the command line. The course is supplemented with readings containing plenty of resources to level up the learner's understanding of specific concepts.
Learners come away understanding how to build end-to-end, continuous data pipelines with Snowflake.
Syllabus
- Modern data engineering with Snowflake
- Learners see how the explosion of data in recent years has driven demand for extracting insights from that data, giving rise to data engineering. They compare earlier data engineering approaches with modern approaches built on Snowflake, and contextualize data engineering with the Ingestion-Transformation-Delivery ("ITD") framework. They also prepare their development environment and build a simple data pipeline with Snowflake.
- Batch data ingestion with Snowflake
- Learners ingest data into Snowflake at scale using a variety of powerful techniques. Specifically, they load data into Snowflake using the Snowflake Marketplace, the Snowflake web interface (Snowsight), the Snowflake CLI, and the powerful COPY INTO SQL command (see the COPY INTO sketch after this syllabus). Learners also understand how to ingest data from external systems using Snowflake native connectors, and how to optimize data ingestion by fully utilizing a virtual warehouse.
- Data transformations with Snowflake
- Learners perform data transformations using SQL or Snowpark for Python, and see that Snowpark also supports transformations in Java and Scala. They extend their knowledge by learning about, creating, and using user-defined functions (UDFs), stored procedures, streams, and Snowflake Dynamic Tables (see the transformation sketch after this syllabus). They also learn how to run these transformations outside the Snowflake web interface, specifically in Visual Studio Code using Snowflake's official extension.
- Delivering data products with Snowflake
- Learners understand what data products are and how Snowflake supports delivering them. They deliver valuable data products through the Snowflake Marketplace, Streamlit in Snowflake, and Snowflake Native Applications (see the data sharing sketch after this syllabus).
- Orchestrating continuous data pipelines with Snowflake
- Learners understand what orchestration means and how to add automation to data pipelines using tasks. They specifically learn about user-managed and serverless tasks, and create tasks to automate calls to stored procedures. They also create and link tasks together to form a task graph, or DAG, and execute individual tasks and entire DAGs (see the task sketch after this syllabus).
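
To make the ingestion module concrete, here is a minimal sketch of bulk loading with COPY INTO. It is not code from the course: the table name, stage path, and file layout are hypothetical stand-ins.

```sql
-- Minimal sketch of batch ingestion with COPY INTO.
-- All names (raw_orders, my_stage) and the file layout are hypothetical.
CREATE OR REPLACE TABLE raw_orders (
    order_id   INTEGER,
    customer   VARCHAR,
    amount     NUMBER(10, 2),
    order_date DATE
);

-- Load every CSV under the stage path; Snowflake tracks load metadata
-- per table, so files that were already loaded are skipped on reruns.
COPY INTO raw_orders
FROM @my_stage/orders/
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
PATTERN = '.*[.]csv'
ON_ERROR = 'CONTINUE';  -- load what parses, skip bad rows
```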
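For the transformation module, a compact sketch of three of the tools named above: a SQL UDF, a stream, and a dynamic table. Every object name is hypothetical, and the tax rate is an arbitrary placeholder.

```sql
-- A simple SQL UDF; the 8% tax rate is an arbitrary placeholder.
CREATE OR REPLACE FUNCTION with_tax(amount NUMBER(10, 2))
RETURNS NUMBER(10, 2)
AS 'amount * 1.08';

-- Use the UDF in an ordinary query.
SELECT order_id, with_tax(amount) AS gross_amount
FROM raw_orders;

-- A stream records row-level changes to a table so downstream steps
-- can process only the new or changed rows.
CREATE OR REPLACE STREAM raw_orders_stream ON TABLE raw_orders;

-- A dynamic table re-materializes its query as upstream data changes,
-- staying within the declared target lag.
CREATE OR REPLACE DYNAMIC TABLE daily_revenue
  TARGET_LAG = '5 minutes'
  WAREHOUSE = transform_wh     -- hypothetical warehouse name
AS
  SELECT order_date, SUM(amount) AS revenue
  FROM raw_orders
  GROUP BY order_date;
```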
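The delivery module works through the Snowflake Marketplace, Streamlit in Snowflake, and Native Applications, which are richer mechanisms than a short SQL snippet can show. At the SQL level, though, a Marketplace listing is ultimately backed by a secure share; here is a minimal sketch of creating one, with hypothetical object names.

```sql
-- Minimal sketch of a secure share, the primitive behind listings.
-- The share, database, schema, and table names are all hypothetical.
CREATE SHARE revenue_share;
GRANT USAGE ON DATABASE analytics TO SHARE revenue_share;
GRANT USAGE ON SCHEMA analytics.public TO SHARE revenue_share;
GRANT SELECT ON TABLE analytics.public.curated_orders TO SHARE revenue_share;
```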
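Finally, for the orchestration module, a minimal sketch of two tasks linked into a small DAG. The stored procedures it calls, the task names, and the schedule are hypothetical, not objects from the course.

```sql
-- Root task: serverless (no WAREHOUSE clause), runs at the top of each hour.
CREATE OR REPLACE TASK load_task
  SCHEDULE = 'USING CRON 0 * * * * UTC'
AS
  CALL load_orders();        -- hypothetical stored procedure

-- Child task: runs only after load_task succeeds, forming a DAG.
CREATE OR REPLACE TASK transform_task
  AFTER load_task
AS
  CALL transform_orders();   -- hypothetical stored procedure

-- Tasks are created suspended; resume children before the root
-- so the whole graph is live when the first scheduled run fires.
ALTER TASK transform_task RESUME;
ALTER TASK load_task RESUME;
```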
Taught by
Snowflake Northstar