AWS ML Engineer Associate 1.3 Validate Data and Prepare for Modeling

Amazon Web Services and Amazon via AWS Skill Builder

Overview

This course covers part of the data preparation phase of the machine learning (ML) lifecycle. You will learn about data validation strategies, including strategies for bias mitigation and data security, and review Amazon Web Services (AWS) services that can assist with data validation, including AWS Glue DataBrew and AWS Glue Data Quality. You will also learn about the final steps of data preparation: splitting, shuffling, and augmenting a dataset, and configuring the data so it can be loaded into your model training resource.
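
To illustrate what shuffling and splitting a dataset means in practice, here is a minimal sketch in plain Python. It is not taken from the course materials. The function name `shuffle_and_split`, the 80/10/10 proportions, and the fixed seed are assumptions chosen for the example.

```python
import random


def shuffle_and_split(records, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle a copy of `records`, then cut it into train/validation/test lists.

    The fractions and seed are illustrative defaults, not recommended values.
    """
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1:
        raise ValueError("fractions must lie between 0 and 1")
    if train_frac + val_frac >= 1:
        raise ValueError("train_frac + val_frac must leave room for a test set")

    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains becomes the test set
    return train, validation, test


train, validation, test = shuffle_and_split(range(100))
print(len(train), len(validation), len(test))
```

Shuffling before splitting keeps any ordering in the source data (for example, records sorted by date or by label) from concentrating in one subset. Time-series data is a common exception, where a chronological split is usually preferred.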

  • Course level: 300
  • Duration: 45 minutes

Activities

  • Online materials
  • A demonstration
  • Knowledge check questions
  • A course assessment

Course objectives

  • Explain the importance of ensuring data integrity.
  • Identify fundamental pre-training bias metrics.
  • Describe strategies to address class imbalance in datasets.
  • Describe key AWS services for validating data quality.
  • Use AWS tools to identify and mitigate sources of bias in data.
  • Describe techniques for using AWS services to encrypt data.
  • Identify implications of compliance requirements.
  • Describe the value and technique of splitting, shuffling, and augmenting datasets.
  • Identify data formats used in model training.
  • Identify AWS tools and services for model training data configuration.
  • Describe how to configure data to load it into a model training resource.
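
As an example of a pre-training bias metric related to the objectives above, the sketch below computes class imbalance (CI) as documented for Amazon SageMaker Clarify: CI = (n_a - n_d) / (n_a + n_d), where n_a and n_d are the number of records in the advantaged and disadvantaged facet groups. The code is plain Python rather than a call to Clarify, and the function name `class_imbalance` and the sample data are assumptions made for illustration.

```python
from collections import Counter


def class_imbalance(facet_values, disadvantaged_value):
    """Return CI = (n_a - n_d) / (n_a + n_d) for one facet column.

    `facet_values` is a sequence with one facet value per record.
    Every record that does not match `disadvantaged_value` is counted
    as belonging to the advantaged group (an assumption of this sketch).
    """
    counts = Counter(facet_values)
    n_d = counts[disadvantaged_value]
    n_a = len(facet_values) - n_d
    if n_a + n_d == 0:
        raise ValueError("facet_values must not be empty")
    return (n_a - n_d) / (n_a + n_d)


# Hypothetical facet column: 75 records in group "a", 25 in group "d".
facet = ["a"] * 75 + ["d"] * 25
print(class_imbalance(facet, disadvantaged_value="d"))
```

CI ranges from -1 to 1. Values near 0 indicate that the two groups are represented about equally, while values near 1 indicate that the disadvantaged group is underrepresented.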

Intended audience

  • Cloud architects
  • Machine learning engineers

Recommended skills

  • At least 1 year of experience using Amazon SageMaker and other AWS services for ML engineering.
  • At least 1 year of experience in a related role such as backend software developer, DevOps developer, data engineer, or data scientist.
  • A fundamental understanding of programming languages such as Python.
  • Preceding courses in the AWS ML Engineer Associate Learning Plan.

Course outline

  • Section 1: Introduction
    • Lesson 1: How to Use This Course
    • Lesson 2: Course Overview
    • Lesson 3: Fundamentals of Data Validation
  • Section 2: Validate Data
    • Lesson 4: Addressing Class Imbalance
    • Lesson 5: AWS Tools and Services for Data Validation and Bias Mitigation
    • Lesson 6: Identifying and Mitigating Bias with Amazon SageMaker Clarify
    • Lesson 7: Data Security and Compliance
  • Section 3: Final Steps of Data Preparation
    • Lesson 8: Dataset Splitting, Shuffling, and Augmentation
    • Lesson 9: Configure Data for Model Training
  • Section 4: Conclusion
    • Lesson 10: Course Summary
    • Lesson 11: Assessment
    • Lesson 12: Contact Us
