What you'll learn:
- Apply Databricks AutoML to different ML problems, such as regression and classification
- Use MLflow to track the complete ML lifecycle inside the Databricks environment
- Register models and deploy them to production with MLflow and Databricks
- Store model features in the Feature Store
Welcome to our comprehensive course on the Databricks Certified Machine Learning Engineer Associate certification. This course is designed to help you master the skills required to pass the exam and become a certified Databricks ML engineer associate.
Databricks is a cloud-based data analytics platform that offers a unified approach to data processing, machine learning, and analytics. With the growing demand for data and ML engineers, Databricks expertise has become one of the most sought-after skills in the industry.
The minimally qualified candidate should be able to:
- Use Databricks Machine Learning and its capabilities within machine learning workflows, including:
  - Databricks Machine Learning (clusters, Repos, Jobs)
  - Databricks Runtime for Machine Learning (basics, libraries)
  - AutoML (classification, regression, forecasting)
  - Feature Store (basics)
  - MLflow (Tracking, Models, Model Registry)
- Implement correct decisions in machine learning workflows, including:
  - Exploratory data analysis (summary statistics, outlier removal)
  - Feature engineering (missing value imputation, one-hot encoding)
  - Tuning (hyperparameter basics, hyperparameter parallelization)
  - Evaluation and selection (cross-validation, evaluation metrics)
- Implement machine learning solutions at scale using Spark ML and other tools, including:
  - Distributed ML concepts
  - Spark ML modeling APIs (data splitting, training, evaluation, estimators vs. transformers, pipelines)
  - Hyperopt
  - Pandas API on Spark
  - Pandas UDFs and Pandas Function APIs
- Understand advanced scaling characteristics of classical machine learning models, including:
  - Distributed linear regression
  - Distributed decision trees
  - Ensembling methods (bagging, boosting)
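
To give a feel for the feature engineering topics listed above (missing value imputation and one-hot encoding), here is a minimal sketch in plain Python. It is only an illustration of the two ideas, not the course's or the exam's reference implementation. The toy records, the column names `age` and `plan`, and the helper functions `impute_median` and `one_hot` are all hypothetical. On Databricks you would more likely reach for Spark ML's `Imputer` and `OneHotEncoder` transformers, which apply the same ideas to distributed DataFrames.

```python
from statistics import median

# Hypothetical toy records (not from the course); None marks a missing value.
rows = [
    {"age": 34, "plan": "basic"},
    {"age": None, "plan": "premium"},
    {"age": 52, "plan": "basic"},
    {"age": 41, "plan": "trial"},
]


def impute_median(rows, column):
    """Fill missing values in `column` with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        # Keep every column except the one being encoded.
        out = {k: v for k, v in r.items() if k != column}
        for c in categories:
            out[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(out)
    return encoded


# Impute first, then encode, so no record is dropped along the way.
features = one_hot(impute_median(rows, "age"), "plan")
for row in features:
    print(row)
```

The median is used here rather than the mean because it is less sensitive to outliers. That is a design choice for this sketch; the right strategy depends on the data.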