
Understand data science for machine learning

Microsoft via Microsoft Learn


Overview

Module 1: A high-level overview of machine learning for people with little or no knowledge of computer science and statistics. You’ll be introduced to some essential concepts, explore data, and interactively go through the machine learning lifecycle, using Python to train, save, and use a machine learning model as we would in the real world.

In this module, you will:

Explore how machine learning differs from traditional software
Create and test a machine learning model
Load a model and use it with new data
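
As a rough illustration of the train, save, and use cycle described above, here is a minimal sketch using only the Python standard library. The toy data, the JSON file name, and the `train` and `predict` helpers are assumptions made for this example; they are not taken from the course exercises, which may use different tools.

```python
import json
import os
import tempfile


def train(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, x):
    """Use a trained model (a dict of parameters) on a new input."""
    return model["slope"] * x + model["intercept"]


# Hypothetical training data that follows y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Train, then save the learned parameters to disk.
model = train(xs, ys)
path = os.path.join(tempfile.gettempdir(), "toy_model.json")
with open(path, "w") as f:
    json.dump(model, f)

# Later: load the saved model and use it with new data.
with open(path) as f:
    loaded = json.load(f)

new_prediction = predict(loaded, 10.0)
print(new_prediction)
```

The point of the sketch is that the saved file contains only the learned parameters, so the loading code does not need the training data at all.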

Module 2: Supervised learning is a form of machine learning where an algorithm learns from examples of data. We progressively paint a picture of how supervised learning automatically generates a model that can make predictions about the real world. We also touch on how these models are tested, and difficulties that can arise in training them.

In this module, you will:

Define supervised and unsupervised learning
Explore how cost functions affect the learning process
Discover how models are optimized by gradient descent
Experiment with learning rates, and see how they can affect training
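
To make the relationship between cost functions, gradient descent, and learning rates concrete, here is a small sketch in plain Python. It assumes a one-feature linear model and mean squared error as the cost; the data and the chosen learning rates are hypothetical and picked only to show one run that converges and one that does not.

```python
def mse_cost(w, b, xs, ys):
    """Mean squared error of the line y = w * x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate, steps):
    """Repeatedly step w and b against the gradient of the MSE cost."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data that follows y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# A modest learning rate settles close to w = 2, b = 1.
w, b = gradient_descent(xs, ys, learning_rate=0.05, steps=2000)
final_cost = mse_cost(w, b, xs, ys)

# A learning rate that is too large overshoots, and the cost grows.
w_bad, b_bad = gradient_descent(xs, ys, learning_rate=0.3, steps=50)
bad_cost = mse_cost(w_bad, b_bad, xs, ys)

print(w, b, final_cost)
print(bad_cost)
```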

Module 3: The power of machine learning models comes from the data that is used to train them. Through content and exercises, we explore how to understand your data, how to encode it so that the computer can interpret it properly, how to clean it of errors, and tips that will help you create models that perform well.

In this module, you will:

Visualize large datasets with Exploratory Data Analysis (EDA)
Clean a dataset of errors
Predict unknown values using numeric and categorical data
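
One common way to encode categorical data so that a model can interpret it is one-hot encoding, which the syllabus covers under one-hot vectors. The sketch below is a minimal illustration; the `one_hot_encode` helper and the color values are made up for this example.

```python
def one_hot_encode(values):
    """Turn each categorical value into a 0/1 vector, one slot per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)

print(categories)  # ['blue', 'green', 'red']
for color, vector in zip(colors, encoded):
    print(color, vector)
```

Each row has exactly one 1, so no artificial ordering is implied between categories, as would happen if they were simply numbered 0, 1, 2.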

Module 4: Regression is arguably the most widely used machine learning technique, commonly underlying scientific discoveries, business planning, and stock market analytics. This learning material takes a dive into some common regression analyses, both simple and more complex, and provides some insight on how to assess model performance.

In this module, you will:

Understand how regression works
Work with new algorithms: Linear regression, multiple linear regression, and polynomial regression
Understand the strengths and limitations of regression models
Visualize error and cost functions in linear regression
Understand basic evaluation metrics for regression
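
Two basic evaluation metrics for regression are mean squared error and R-squared (the latter appears in the syllabus for this module). The following sketch computes both in plain Python; the actual and predicted values are hypothetical, and the course exercises may compute these metrics with a library instead.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Share of the variance in the actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical values: what was observed versus what a model predicted.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

mse = mean_squared_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(mse, r2)
```

An R-squared close to 1 means the predictions track the data closely; a value near 0 means the model does little better than always predicting the mean.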

Module 5: When we think of machine learning, we often focus on the training process. A small amount of preparation before this process can not only speed up and improve learning but also give us some confidence about how well our models will work when faced with data we have never seen before.

In this module, you will:

Define feature scaling
Create and work with test datasets
Articulate how testing models can both improve and harm training
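
The sketch below shows one way the two preparation steps might fit together: holding out a test set, then standardizing features using statistics from the training set only. The helper names, the 80/20 split, and the toy values are assumptions for illustration.

```python
import random


def train_test_split(rows, test_fraction, seed):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def standardize(values, mean, std):
    """Rescale values to zero mean and unit variance, given mean and std."""
    return [(v - mean) / std for v in values]


# Hypothetical single feature with ten values.
data = [float(v) for v in range(10)]
train, test = train_test_split(data, test_fraction=0.2, seed=0)

# Compute scaling statistics on the training set only, so that no
# information about the test set leaks into training.
mean = sum(train) / len(train)
std = (sum((v - mean) ** 2 for v in train) / len(train)) ** 0.5

train_scaled = standardize(train, mean, std)
test_scaled = standardize(test, mean, std)

print(len(train), len(test))
print(train_scaled)
print(test_scaled)
```

Reusing the training mean and standard deviation on the test set keeps the test data as an honest stand-in for data the model has never seen.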

Module 6: Classification means assigning items to categories, and can also be thought of as automated decision-making. Here we introduce classification models through logistic regression, providing you with a stepping-stone toward more complex and exciting classification methods.

In this module, you will:

Discover how classification differs from classical regression
Build models that can perform classification tasks
Explore how to assess and improve classification models
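
As a rough sketch of how logistic regression turns a numeric input into a category, the example below trains a one-feature model with gradient descent in plain Python. The data, learning rate, step count, and 0.5 decision threshold are all assumptions chosen for illustration, not values from the course.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, labels, learning_rate=0.1, steps=5000):
    """Fit p(label = 1) = sigmoid(w * x + b) by gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        probs = [sigmoid(w * x + b) for x in xs]
        grad_w = sum((p - y) * x for p, y, x in zip(probs, labels, xs)) / n
        grad_b = sum(p - y for p, y in zip(probs, labels)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def classify(w, b, x, threshold=0.5):
    """Assign category 1 if the predicted probability reaches the threshold."""
    return 1 if sigmoid(w * x + b) >= threshold else 0


# Hypothetical data: larger x values tend to belong to category 1,
# with some overlap in the middle.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [0, 0, 1, 0, 1, 1]

w, b = train_logistic(xs, labels)
low = classify(w, b, 1.0)
high = classify(w, b, 6.0)
print(w, b, low, high)
```

Unlike classical regression, the model's raw output is a probability, and the final answer is a category chosen by comparing that probability with a threshold.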

Module 7: Explore how altering the architecture of more complex models can bring about more effective results.

In this module, you will:

Discover new model types: decision trees and random forests
Learn how model architecture can affect performance
Practice working with hyperparameters to improve training effectiveness

Module 8: How do we know if a model is good or bad at classifying our data? The way that computers assess model performance can sometimes be difficult for us to comprehend, or can oversimplify how the model will behave in the real world. To build models that work satisfactorily, we need to find intuitive ways to assess them, and understand how these metrics can bias our view.

In this module, you will:

Assess performance of classification models
Review metrics to improve classification models
Mitigate performance issues from data imbalances
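
To illustrate how a single metric can mislead on imbalanced data, the sketch below builds a confusion matrix by hand and derives accuracy, precision, and recall from it. The labels and predictions are hypothetical, with 8 negative and 2 positive examples.

```python
def confusion_matrix(actual, predicted):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


# Hypothetical imbalanced dataset: 8 negatives, 2 positives.
actual = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
predicted = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_matrix(actual, predicted)

accuracy = (tp + tn) / len(actual)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(tp, fp, fn, tn)                # 1 1 1 7
print(accuracy, precision, recall)   # 0.8 0.5 0.5
```

Accuracy looks respectable at 0.8 because negatives dominate, yet the model finds only half of the positive cases, which is what recall exposes.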

Module 9: Receiver operating characteristic (ROC) curves are a powerful way to assess and fine-tune trained classification models. We introduce and explain the utility of these curves through learning content and practical exercises.

In this module, you will:

Understand how to create ROC curves
Explore how to assess and compare models using these curves
Practice fine-tuning a model using characteristics plotted on ROC curves
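
An ROC curve can be built by sweeping the decision threshold and recording the false positive rate and true positive rate at each step; the area under the curve (AUC) then summarizes the curve in one number. The sketch below does this in plain Python with hypothetical labels and scores, using the trapezoidal rule for the area.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step so more items are classified positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical true labels and model scores (predicted probabilities).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(labels, scores)
auc = area_under_curve(points)

print(points)
print(auc)  # 0.75
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect ranking, so comparing AUC values is one way to compare models independently of any single threshold.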

Syllabus

Module 1: Introduction to machine learning

Introduction
What are machine learning models?
Exercise - Create a machine learning model
What are inputs and outputs?
Exercise - Visualize inputs and outputs
How to use a model
Exercise - Use machine learning models
Knowledge check
Summary

Module 2: Build classical machine learning models with supervised learning

Introduction
Define supervised learning
Exercise - Implement supervised learning
Minimize model errors with cost functions
Exercise - Optimize a model by using cost functions
Optimize models by using gradient descent
Exercise - Implement gradient descent
Knowledge check
Summary

Module 3: Introduction to data for machine learning

Introduction
Good, bad, and missing data
Exercise - Visualize missing data
Examine different types of data
Exercise - Work with data to predict missing values
One-hot vectors
Exercise - Predict unknown values using one-hot vectors
Knowledge check
Summary

Module 4: Train and understand regression models in machine learning

Introduction
What is regression?
Exercise - Train a simple linear regression model
Multiple linear regression and R-squared
Exercise - Train a multiple linear regression model
Polynomial Regression
Exercise - Polynomial regression
Knowledge check
Summary

Module 5: Refine and test machine learning models

Introduction
Normalization and standardization
Exercise - Feature scaling
Test and training datasets
Exercise - Test and train datasets
Nuances of test sets
Exercise - Test set nuances
Knowledge check
Summary

Module 6: Create and understand classification models in machine learning

Introduction
What are classification models?
Exercise - Build a simple logistic regression model
Assessing a classification model
Exercise - Assessing a logistic regression model
Improving classification models
Exercise - Improving classification models
Knowledge check
Summary

Module 7: Select and customize architectures and hyperparameters using random forest

Introduction
Decision trees and model architecture
Exercise - Decision trees and model architecture
Random forests and selecting architectures
Exercise - Selecting random forest architectures
Hyperparameters in classification
Exercise - Hyperparameter tuning with random forests
Knowledge check
Summary

Module 8: Confusion matrix and data imbalances

Introduction
Confusion matrices
Exercise - Building a confusion matrix
Data imbalances
Exercise - Resolving biases in a classification model
Cost functions versus evaluation metrics
Exercise - Multiple metrics and ROC curves
Knowledge check
Summary

Module 9: Measure and optimize model performance with ROC and AUC

Introduction
Analyze classification with receiver operator characteristic curves
Exercise - Evaluate ROC curves
Compare and optimize ROC curves
Exercise - Tune the area under the curve
Knowledge check
Summary
