Feature Engineering

Overview

This course explores the benefits of using Vertex AI Feature Store, how to improve the accuracy of ML models, and how to find which data columns make the most useful features. This course also includes content and labs on feature engineering using BigQuery ML, Keras, and TensorFlow.

Syllabus

Module 0: Introduction

This module provides an overview of the course and its objectives.

Module 1: Introduction to Vertex AI Feature Store

This module introduces Vertex AI Feature Store.

Module 2: Raw Data to Features

Feature engineering is often the longest and most difficult phase of building your ML project. In the feature engineering process, you start with your raw data and use your own domain knowledge to create features that will make your machine learning algorithms work. In this module we explore what makes a good feature and how to represent features in your ML model.

Module 3: Feature Engineering

This module reviews the differences between machine learning and statistics, and how to perform feature engineering in both BigQuery ML and Keras. We'll also cover some advanced feature engineering practices.

Module 4: Preprocessing and Feature Creation

In this module you will learn more about Dataflow, a technology complementary to Apache Beam. Both can help you build and run preprocessing and feature engineering.

Module 5: Feature Crosses - TensorFlow Playground

In traditional machine learning, feature crosses don’t play much of a role, but in modern-day ML methods, feature crosses are an invaluable part of your toolkit. In this module, you will learn how to recognize the kinds of problems where feature crosses are a powerful way to help machines learn.
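
As a rough illustration of why a feature cross can help, the sketch below uses a tiny XOR-style dataset of the kind often shown in TensorFlow Playground. The data points, the cross function, and the threshold rule are all assumptions made up for this example, not course material. No single linear rule on x1 and x2 alone separates these labels, but a linear rule on the crossed feature x1 * x2 does.

```python
# Hypothetical XOR-style data: the label is 1 when x1 and x2 share a sign.
points = [(1.0, 1.0), (-1.0, -1.0), (2.0, 0.5), (1.0, -1.0), (-1.0, 1.0), (-0.5, 2.0)]
labels = [1, 1, 1, 0, 0, 0]


def cross(x1, x2):
    """Feature cross of two numeric inputs: their product."""
    return x1 * x2


def predict(x1, x2):
    """Linear rule on the crossed feature only (weight 1, bias 0)."""
    return 1 if cross(x1, x2) > 0 else 0


predictions = [predict(x1, x2) for x1, x2 in points]
print(predictions == labels)
```

The cross turns a problem that is not linearly separable in the original inputs into one that is separable with a single weight, which is the kind of problem this module teaches you to recognize.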

Module 6: Introduction to TensorFlow Transform

TensorFlow Transform (tf.Transform) is a library for preprocessing data with TensorFlow. tf.Transform is useful for preprocessing that requires a full pass over the data, such as:

- normalizing an input value by mean and standard deviation
- integerizing a vocabulary by looking at all input examples for values
- bucketizing inputs based on the observed data distribution

In this module we will explore use cases for tf.Transform.
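
To make the "full pass" idea concrete, here is a minimal sketch in plain Python using only the standard library. It is not the tf.Transform API. The dataset, the field names (fare, city), and the analyze and transform helpers are hypothetical. The sketch only shows the two-phase pattern: first scan the whole dataset to collect statistics, then apply them row by row.

```python
import statistics
from bisect import bisect_right

# Hypothetical raw rows; the field names are assumptions for illustration.
rows = [
    {"fare": 2.0, "city": "paris"},
    {"fare": 4.0, "city": "tokyo"},
    {"fare": 6.0, "city": "paris"},
    {"fare": 8.0, "city": "lima"},
]


def analyze(rows):
    """Full pass over the data: collect the statistics each transform needs."""
    fares = [r["fare"] for r in rows]
    cities = sorted({r["city"] for r in rows})
    return {
        "mean": statistics.fmean(fares),
        "stdev": statistics.pstdev(fares),  # population standard deviation
        "vocab": {city: i for i, city in enumerate(cities)},
        "boundaries": statistics.quantiles(fares, n=4),  # quartile cut points
    }


def transform(row, stats):
    """Per-row step: uses only the precomputed statistics."""
    return {
        # Normalize by mean and standard deviation (assumes stdev > 0).
        "fare_scaled": (row["fare"] - stats["mean"]) / stats["stdev"],
        # Integerize the vocabulary; unseen values map to -1.
        "city_id": stats["vocab"].get(row["city"], -1),
        # Bucketize by the observed distribution.
        "fare_bucket": bisect_right(stats["boundaries"], row["fare"]),
    }


stats = analyze(rows)
transformed = [transform(r, stats) for r in rows]
for t in transformed:
    print(t)
```

Keeping the analysis separate from the per-row transform means the same statistics can be reused on new rows, for example at prediction time, without rescanning the training data.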