
Data Management and Storage in the Cloud

Google via Google Cloud Skills Boost

Overview

This is the second of five courses in the Google Cloud Data Analytics Certificate. In this course, you’ll explore how data is structured and organized. You’ll gain hands-on experience with the data lakehouse architecture and cloud components like BigQuery, Google Cloud Storage, and Dataproc to efficiently store, analyze, and process large datasets.

Syllabus

  • Introduction to data management and storage in the cloud
    • Introduction to Course 2
    • Course 2 overview
    • Eric: Data analytics skills translate across industries and roles
    • Helpful resources and tips
    • Lab technical tips
    • Explore your Course 2 scenario: TheLook eCommerce
    • Welcome to module 1
    • Data storage and connections
    • Gerrit: Experience with a variety of tools can help you as an analyst
    • Common ways to store data
    • [Supplemental] Common data storage systems
    • AI-based predictive data management
    • Structured, unstructured, and semi-structured data
    • Overview of data lakehouse architecture
    • Example of a data lakehouse
    • Comparison of data warehouses and data lakehouses
    • Test your knowledge: Data storage options
    • Aspects of table schema
    • Overview of BigQuery's schema editing abilities
    • Components of BigQuery table schema
    • Complex data types in BigQuery
    • Introduction to nested data structure
    • Guide to BigQuery
    • Explore flat and nested data types in BigQuery
    • Test your knowledge: Data types and organization in BigQuery
    • Overview of data processing methods
    • Batch versus streaming data processing
    • Identify different batch and streaming data sources
    • Test your knowledge: Batch and streaming data sources
    • Wrap-up
    • Glossary terms from module 1
    • Module 1 challenge
  • Key components of data organization
    • Welcome to module 2
    • Denormalized data
    • Normalized and denormalized data
    • Test your knowledge: Ways to organize data
    • Data governance for effective data management
    • MK: Risk management in a cloud-first world
    • Components and objectives of data governance
    • Introduction to master data management
    • Test your knowledge: Data governance
    • Introduction to data catalogs
    • Data catalog components
    • Technical and business metadata
    • Test your knowledge: Foundations of accessible data
    • Overview of data lakehouse architecture
    • Components of data lakehouse architecture
    • Data lakehouse implementation best practices
    • Explore a lakehouse
    • Test your knowledge: Data lakehouse architecture
    • Wrap-up
    • Glossary terms from module 2
    • Module 2 challenge
  • Steps to find data
    • Welcome to module 3
    • Ryan: Curiosity can help you understand and connect data
    • How to find data using BigQuery
    • Data lineage and traceability
    • Dataplex's data lineage feature
    • How to use the Dataplex data lineage feature
    • Test your knowledge: Strategies for understanding data sources
    • Introduction to Analytics Hub
    • Analytics Hub enables data sharing
    • How to use Analytics Hub
    • Test your knowledge: Tools for sharing data
    • Data discovery, curation, and unification
    • Overview of Dataplex
    • Benefits of using Dataplex
    • How to search for data with BigQuery
    • Navigate Dataplex
    • Test your knowledge: Dataplex and BigQuery for accessing data
    • Wrap-up
    • Glossary terms from module 3
    • Module 3 challenge
  • Techniques to access data
    • Welcome to module 4
    • Methods for defining BigQuery table schemas
    • Auto-detection of schemas in BigQuery
    • Basic SQL commands for querying data
    • [Supplemental] SQL query terms
    • Compare data analytics with BigQuery and Dataproc
    • Test your knowledge: Data schemas and queries in BigQuery
    • Steps and models for accessing data with machine learning
    • Cloud-based machine learning can train predictive models
    • Introduction to machine learning with Vertex AI and BigQuery
    • Overview of Google Colab
    • Managed notebooks
    • Test your knowledge: Integration of Google Cloud tools
    • Essentials of database partitioning
    • Benefits of data partitioning
    • Methods for partitioning tables
    • Data partitioning reduces cloud costs
    • Create a partitioned table
    • Test your knowledge: Overview of data partitioning
    • Strategies for querying partitioned tables
    • Tips for interacting with partitioned tables
    • Manage a partitioned table in BigQuery
    • Test your knowledge: Techniques for managing partitioned tables
    • Key processes and benefits of Dataproc
    • How to create a Dataproc cluster
    • How to manage Dataproc clusters
    • Test your knowledge: Dataproc for automation and improved data processing
    • Wrap-up
    • Vince and George: Interview role play
    • Interview tip: Provide examples
    • Glossary terms from module 4
    • Module 4 challenge
    • Course wrap-up
    • Course 2 resources and citations
    • Glossary terms from Course 2
  • Your Next Steps
    • Course Badge
