Pluralsight

Getting Started with the Databricks Lakehouse Platform

via Pluralsight

Overview

This course will teach you how the data lakehouse architecture brings together the best of both data lakes and data warehouses, allowing you to meet your needs for big data processing, SQL analytics, and machine learning on a single platform.

Organizations have long collected data in a variety of formats: structured, unstructured, and semi-structured. However, working with data in different formats for different use cases requires multiple platforms: data warehouses for the structured data needed for business intelligence, and data lakes for the unstructured data needed for data science and machine learning. The Databricks data lakehouse architecture is an innovative paradigm that combines the flexibility and cost efficiency of a data lake with the reliability and features of a data warehouse.

In this course, Getting Started with the Databricks Lakehouse Platform, you will learn the importance of storing data in a centralized repository and how data lakes and data warehouses solve different data-related problems. First, you'll explore a variety of technologies in the analytics space and how the lakehouse platform encompasses their strengths while mitigating their limitations. Next, you will learn the basic components that make up the architecture of a data lakehouse and how the Databricks Lakehouse Platform uses Delta Lake to enable SQL analytics, data science, and machine learning on the same underlying data lake storage. Finally, you will explore the Databricks Data Lakehouse on Microsoft Azure: you will build the data lakehouse, store data in Delta tables, and access the same data using Apache Spark and SQL queries. When you are finished with this course, you will be able to clearly articulate how the data lakehouse platform helps mitigate challenges with current data architectures, and you will have hands-on experience setting up and using the lakehouse on Databricks.

Syllabus

  • Course Overview 1min
  • Introducing the Lakehouse Platform 40mins
  • An Architectural Overview of the Lakehouse Platform 21mins
  • Using a Lakehouse on Databricks 34mins

Taught by

Janani Ravi

Reviews

4.8 rating at Pluralsight based on 46 ratings
