Pandas 2, Dask or Polars? Quickly Tackling Larger Data on a Single Machine

GAIA via YouTube

Overview

This 28-minute conference talk compares Pandas 2, Dask, and Polars for handling large datasets on a single machine. Pandas 2 adds Arrow data types, faster calculations, and better scalability. Dask scales Pandas across cores and has recently gained an "expressions" optimization. Polars is a newer competitor designed around Arrow with native multicore support. The talk works through a "just about fits in RAM" data task with all three tools and weighs their pros and cons to help you choose one for research workflows. It also examines whether Pandas operations still need 5x working RAM, how much faster Pandas string operations have become, and how well Polars works with tools such as Scikit-learn and matplotlib. The presenter, Ian Ozsvald, is an experienced Chief Data Scientist and author.
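As a rough illustration of the Arrow data types mentioned above, the sketch below measures how much memory the same strings occupy as a classic object column and as an Arrow-backed string column. It is not taken from the talk: the helper name `string_column_memory` and the sample city names are hypothetical, and the Arrow measurement assumes pandas 2 with the optional `pyarrow` package installed (it is skipped otherwise).

```python
import pandas as pd


def string_column_memory(values):
    """Return the bytes used by the same strings under different pandas dtypes."""
    report = {}

    # Classic pandas storage: one Python string object per row.
    as_object = pd.Series(values, dtype="object")
    report["object"] = int(as_object.memory_usage(deep=True, index=False))

    try:
        # Arrow-backed strings; pandas raises ImportError without pyarrow.
        as_arrow = pd.Series(values, dtype="string[pyarrow]")
        report["string[pyarrow]"] = int(
            as_arrow.memory_usage(deep=True, index=False)
        )
    except ImportError:
        # pyarrow is optional, so only the object measurement is reported.
        pass

    return report


# Hypothetical sample data: 300,000 short strings.
values = ["London", "Paris", "Berlin"] * 100_000
report = string_column_memory(values)
for dtype, nbytes in report.items():
    print(f"{dtype}: {nbytes / 1e6:.1f} MB")
```

The numbers depend on the data and the installed versions, so treat them as a quick check on your own machine rather than a benchmark.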

Syllabus

Pandas 2, Dask or Polars? Quickly Tackling Larger Data on a Single Machine by Ian Ozsvald

Taught by

GAIA
