Overview
Dive deep into Azure Databricks, a fast and collaborative Apache® Spark™-based analytics platform optimized for Azure. Get a technical overview of Spark, then explore Azure Databricks' key collaboration features, cluster management, and tight integration with Azure data sources. Follow a detailed walkthrough of an advanced analytics pipeline built with Spark and Azure Databricks. Learn about the Apache Spark APIs, the internals of a Spark application, and hidden technical debt in ML systems. Discover Databricks core concepts, anomaly detection techniques applied to the KDD Cup 1999 network intrusion dataset, and how to create custom transformers and estimators. Gain insights into productionizing machine learning workloads, Spark Structured Streaming, and Databricks developer tooling. By the end of this 49-minute conference talk, you will have the knowledge to build and deploy sophisticated analytics pipelines using Azure Databricks.
Syllabus
Intro
Apache Spark: a unified computing engine
Apache Spark: APIs
Inside a Spark Application
Azure Databricks: managed Apache Spark platform optimized for Azure (first-party service)
Hidden Technical Debt in ML Systems
Azure Integration
Databricks Core Concepts
Anomaly Detection - Network Intrusion KDD Cup 1999 Data
Demo Architecture
Estimators and Transformers
Custom Transformers and Estimators
Productionizing Machine Learning Workloads
Spark Structured Streaming
Databricks Developer Tooling
Try the demo!
Taught by
NDC Conferences