Parallelizing Your ETL with Dask on Kubeflow

MLOps World: Machine Learning in Production via YouTube

Overview

This conference talk shows how to parallelize ETL workloads with Dask on Kubeflow. Dask is a Python library for parallel and distributed computing; Kubeflow is an MLOps platform built on Kubernetes. The talk covers using Dask's parallelism from Kubeflow's notebook service and from pipeline workflows, and introduces the new Dask Operator for Kubernetes, which lets users launch Dask clusters from Jupyter sessions and pipeline steps. It also explains how Dask's distributed execution handles larger-than-memory datasets and can speed up machine learning pipelines. The speaker walks through installation and practical examples of combining the two tools.
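As a rough illustration of the kind of parallel ETL the talk describes, the sketch below builds a small extract-transform-load graph with dask.delayed. It is not taken from the talk: the extract, transform, load, and run_etl functions and their toy data are hypothetical, and the graph runs on Dask's local threaded scheduler so that it works without a cluster.

```python
from dask import delayed


def extract(partition_id):
    # Hypothetical source: each partition yields ten integer records.
    start = partition_id * 10
    return list(range(start, start + 10))


def transform(records):
    # Keep even values and square them.
    return [r * r for r in records if r % 2 == 0]


def load(chunks):
    # Stand-in for a write step: return the row count and a checksum.
    rows = [r for chunk in chunks for r in chunk]
    return len(rows), sum(rows)


def run_etl(n_partitions=4):
    # Build a lazy task graph; nothing executes until compute() is called.
    transformed = [
        delayed(transform)(delayed(extract)(i)) for i in range(n_partitions)
    ]
    result = delayed(load)(transformed)
    # Assumption: the local threaded scheduler stands in for a real cluster.
    return result.compute(scheduler="threads")


if __name__ == "__main__":
    print(run_etl(4))
```

On Kubeflow, the same graph would presumably be submitted through a dask.distributed Client attached to a cluster created by the Dask Operator (for example via dask_kubernetes.operator.KubeCluster) rather than run on the local scheduler. The cluster name, image, and worker count would depend on the deployment and are not specified here.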

Taught by

MLOps World: Machine Learning in Production
