Pluralsight

Transform Data Using the Pandas API in Apache Spark

via Pluralsight

Overview

Learn to transform data with the Pandas API in Apache Spark. This course will teach you practical techniques for data manipulation, performance optimization, and using advanced window functions in Spark workflows.

Efficient data manipulation is essential in large-scale data processing. In this course, Transform Data Using the Pandas API in Apache Spark, you'll learn how to use the Pandas API to transform data in Spark. First, you'll cover essential techniques such as filtering, grouping, and merging. Next, you'll optimize workflows with Arrow. Finally, you'll work with rolling and expanding window functions. When you're finished with this course, you'll understand how to integrate the Pandas API with Apache Spark to handle complex data manipulation tasks more efficiently.

Syllabus

  • Transform Data Using the Pandas API in Apache Spark (12 mins)

Taught by

Bismark Adomako
