Machine Learning Data Lifecycle in Production

Overview

**Starting May 8, enrollment for the Machine Learning Engineering for Production Specialization will be closed. Please enroll in the specialization or in individual courses by then to gain access to this course material.**

In the second course of the Machine Learning Engineering for Production Specialization, you will build data pipelines by gathering, cleaning, and validating datasets and assessing data quality; implement feature engineering, transformation, and selection with TensorFlow Extended to get the most predictive power out of your data; and establish the data lifecycle by leveraging data lineage and provenance metadata tools, and follow data evolution with enterprise data schemas.

Understanding machine learning and deep learning concepts is essential, but if you’re looking to build an effective AI career, you need production engineering capabilities as well. Machine learning engineering for production combines the foundational concepts of machine learning with the functional expertise of modern software development and engineering roles to help you develop production-ready skills.

Week 1: Collecting, Labeling, and Validating Data
Week 2: Feature Engineering, Transformation, and Selection
Week 3: Data Journey and Data Storage
Week 4: Advanced Data Labeling Methods, Data Augmentation, and Preprocessing Different Data Types

Syllabus

Week 1: Collecting, Labeling and Validating Data

This week gives a quick introduction to machine learning production systems. More concretely, you will learn to use the TensorFlow Extended (TFX) library to collect, label, and validate data so that it is production-ready.
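
To give a feel for what validating data means in practice, the sketch below infers a simple schema from training records and checks new records against it. It is a plain-Python illustration of the general idea only, not the TFX API; the helper names `infer_schema` and `validate`, the feature names, and the sample records are all hypothetical.

```python
# Hypothetical, plain-Python illustration of schema-based data validation.
# It does not use TFX; every name and record below is made up for the example.

def infer_schema(records):
    """Record the Python type and the observed string values of each feature."""
    schema = {}
    for record in records:
        for name, value in record.items():
            entry = schema.setdefault(name, {"type": type(value), "domain": set()})
            if isinstance(value, str):
                entry["domain"].add(value)
    return schema


def validate(records, schema):
    """Return a list of (row index, feature, problem) anomalies."""
    anomalies = []
    for i, record in enumerate(records):
        for name, entry in schema.items():
            if name not in record:
                anomalies.append((i, name, "missing feature"))
                continue
            value = record[name]
            if not isinstance(value, entry["type"]):
                anomalies.append((i, name, "unexpected type"))
            elif entry["domain"] and value not in entry["domain"]:
                anomalies.append((i, name, "value outside domain"))
    return anomalies


training = [
    {"age": 34, "country": "US"},
    {"age": 51, "country": "DE"},
]
serving = [
    {"age": 29, "country": "US"},          # clean record
    {"age": "unknown", "country": "DE"},   # age has the wrong type
    {"country": "FR"},                     # age missing, country not seen in training
]

schema = infer_schema(training)
anomalies = validate(serving, schema)
for row, feature, problem in anomalies:
    print(f"row {row}: {feature}: {problem}")
```

A production pipeline would typically also compare feature statistics between training and serving data; this sketch only covers presence, type, and categorical domain checks.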

Week 2: Feature Engineering, Transformation and Selection

Implement feature engineering, transformation, and selection with TensorFlow Extended by encoding structured and unstructured data types and addressing class imbalances.
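
As a rough illustration of two of the ideas named above, the sketch below one-hot encodes a categorical feature and computes per-class weights for an imbalanced label column. It uses only the Python standard library rather than TensorFlow Extended; the function names and sample data are hypothetical, and the weighting formula (total samples divided by number of classes times class count) is one common heuristic among several.

```python
from collections import Counter

# Hypothetical, plain-Python sketch; it does not use the TFX API.

def one_hot(values):
    """Map each categorical value to a 0/1 vector over the sorted vocabulary."""
    vocab = sorted(set(values))
    index = {value: i for i, value in enumerate(vocab)}
    encoded = []
    for value in values:
        row = [0] * len(vocab)
        row[index[value]] = 1
        encoded.append(row)
    return vocab, encoded


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


colors = ["red", "green", "red", "blue"]   # made-up categorical feature
vocab, encoded = one_hot(colors)

labels = [0, 0, 0, 1]                      # made-up imbalanced labels
weights = balanced_class_weights(labels)

print(vocab)
print(encoded)
print(weights)
```

Rare classes receive larger weights, so a loss function that multiplies each example's error by its class weight pays more attention to the minority class.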