Rapid PySpark Custom Processing on Time Series Big Data in Databricks

Databricks via YouTube

Overview

Discover how Sleep Number used PySpark and Databricks to process massive time series data from smart bed sensors in this 28-minute conference talk. Learn about the challenges of analyzing noisy sensor readings and how custom entropy calculations were implemented on rolling windows. Explore the move from a memory-constrained Pandas approach to a scalable PySpark solution that, according to the talk, processed 50 million records in just 0.3 seconds. Gain insight into optimizing big data processing so that run time stays constant regardless of data size. Presented by Gary Garcia Molina and Megha Rajam Rao of Sleep Number, the talk demonstrates advanced techniques for complex time series analysis in a distributed computing environment.
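
To make the idea of an entropy calculation on rolling windows concrete, here is a minimal sketch in plain Python. It is not the speakers' implementation: this listing does not give the talk's entropy definition, window length, or PySpark code, so the histogram-based Shannon entropy, the bin count, the value range, the sample readings, and the function names `shannon_entropy` and `rolling_entropy` are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=4, lo=0.0, hi=1.0):
    """Shannon entropy (in bits) of values grouped into equal-width bins.

    Assumption: histogram-based entropy; the talk's definition may differ.
    """
    if not values:
        return 0.0
    width = (hi - lo) / bins
    # Map each value to a bin index, clamped to [0, bins - 1].
    idx = [min(bins - 1, max(0, int((v - lo) / width))) for v in values]
    counts = Counter(idx)
    n = len(values)
    # Sum of p * log2(1 / p) over the occupied bins.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def rolling_entropy(series, window, **kwargs):
    """Entropy of every full trailing window of length `window`."""
    return [
        shannon_entropy(series[i - window + 1 : i + 1], **kwargs)
        for i in range(window - 1, len(series))
    ]


# Hypothetical normalized sensor readings, invented for this example.
readings = [0.1, 0.1, 0.1, 0.1, 0.9, 0.1, 0.6, 0.3]
entropies = rolling_entropy(readings, window=4)
print(entropies)
```

A flat window yields an entropy of zero, and the value rises as readings spread across more bins. In a distributed setting, the same per-window function would typically be applied per device over a window specification rather than in a Python loop.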

Syllabus

Rapid PySpark Custom Processing on Time Series Big Data in Databricks

Taught by

Databricks
