The Path to Insights: Data Models and Pipelines

Overview

This is the second of three courses in the Google Business Intelligence Certificate. In this course, you’ll explore data modeling and how databases are designed. Then you’ll learn about extract, transform, load (ETL) processes that extract data from source systems, transform it into formats that enable analysis, and drive business processes and goals. Google employees who currently work in BI will guide you through this course by providing hands-on activities that simulate job tasks, sharing examples from their day-to-day work, and helping you build business intelligence skills to prepare for a career in the field.

Learners who complete the three courses in this certificate program will have the skills needed to apply for business intelligence jobs. This certificate program assumes prior knowledge of foundational analytical principles, skills, and tools covered in the Google Data Analytics Certificate.

By the end of this course, you will:

- Determine which data models are appropriate for different business requirements
- Describe the difference between creating and interacting with a data model
- Create data models to address different types of questions
- Explain the parts of the extract, transform, load (ETL) process and tools used in ETL
- Understand extraction processes and tools for different data storage systems
- Design an ETL process that meets organizational and stakeholder needs
- Design data pipelines to automate BI processes

Syllabus

Data models and pipelines

You’ll start this course by exploring data modeling, common schemas, and database elements. You’ll consider how business needs determine the kinds of database systems that BI professionals implement. Then, you’ll discover pipelines and ETL processes, which are tools that move data and ensure that it’s accessible and useful.
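
To make the extract, transform, load idea concrete, here is a minimal sketch of a pipeline using only the Python standard library. It is an illustration, not course material: the sample CSV, the `orders` table, and the function names are all assumptions made for this example.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read from a source system.
RAW_CSV = """order_id,amount,region
1,19.99,west
2,5.00,EAST
3,,west
"""


def extract(text):
    """Extract: read raw rows from the CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, cast types, normalize region."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), float(row["amount"]), row["region"].lower())
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a (hypothetical) target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, region TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(text, conn):
    """Run the three stages in order and return the loaded row count."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Calling `run_pipeline(RAW_CSV, sqlite3.connect(":memory:"))` runs the three stages against an in-memory database. Keeping each stage as its own function makes it easier to test and replace stages independently.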

Dynamic database design

You’ll learn more about database systems, including data marts, data lakes, data warehouses, and ETL processes. You’ll also investigate the five factors of database performance: workload, throughput, resources, optimization, and contention. Finally, you’ll consider how to design efficient queries that get the most from a system.
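
As one hedged illustration of query design, the sketch below uses SQLite's `EXPLAIN QUERY PLAN` to compare how a filter query is executed before and after adding an index. Indexing is only one possible optimization and is chosen here as an example; the `orders` table, the index name `idx_orders_region`, and the helper `query_plan` are assumptions for this sketch.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders "
    "(order_id INTEGER PRIMARY KEY, amount REAL, region TEXT)"
)
conn.executemany(
    "INSERT INTO orders (amount, region) VALUES (?, ?)",
    [(float(i), "west" if i % 2 else "east") for i in range(1000)],
)

QUERY = "SELECT order_id, amount FROM orders WHERE region = ?"

# Without an index, the filter requires scanning the whole table.
plan_before = query_plan(conn, QUERY, ("west",))

# Hypothetical index on the filtered column.
conn.execute("CREATE INDEX idx_orders_region ON orders (region)")
plan_after = query_plan(conn, QUERY, ("west",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, so the comparison to look for is a full table scan before the index and a search that names the index afterward.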

Optimize ETL processes

You’ll learn about optimization techniques including ETL quality testing, data schema validation, business rule verification, and general performance testing. You’ll also explore data integrity and learn how built-in quality checks defend against potential problems. Finally, you’ll focus on verifying business rules and general performance testing to make sure pipelines meet the intended business need.
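
The schema validation and business rule checks described above can be sketched as simple per-row functions. This is a minimal example under stated assumptions: the expected schema, the allowed regions, and the rule that amounts must be positive are all hypothetical and stand in for whatever rules an organization actually defines.

```python
# Hypothetical expected schema: column name -> required Python type.
EXPECTED_SCHEMA = {"order_id": int, "amount": float, "region": str}

# Hypothetical business rule input: the set of valid regions.
ALLOWED_REGIONS = {"east", "west"}


def validate_schema(row, schema=EXPECTED_SCHEMA):
    """Schema validation: check column names and value types."""
    errors = []
    for col, typ in schema.items():
        if col not in row:
            errors.append(f"missing column: {col}")
        elif not isinstance(row[col], typ):
            errors.append(
                f"{col}: expected {typ.__name__}, got {type(row[col]).__name__}"
            )
    for col in row:
        if col not in schema:
            errors.append(f"unexpected column: {col}")
    return errors


def check_business_rules(row):
    """Business rule verification: check values against assumed rules."""
    errors = []
    if row["amount"] <= 0:
        errors.append("amount must be positive")
    if row["region"] not in ALLOWED_REGIONS:
        errors.append(f"unknown region: {row['region']}")
    return errors


def check_row(row):
    """Run schema checks first; apply business rules only to well-formed rows."""
    errors = validate_schema(row)
    if errors:
        return errors
    return check_business_rules(row)
```

For example, `check_row({"order_id": 2, "amount": -5.0, "region": "west"})` passes the schema check but fails the positive-amount rule. Running the schema check first means the business rules can assume every column exists and has the right type.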

Course 2 end-of-course project

You’ll complete an end-of-course project by creating a pipeline process to deliver data to a target table and developing reports based on project needs. You’ll also ensure that the pipeline is performing correctly and that there are built-in defenses against data quality issues.