Faster pandas

Overview

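Learn how to make your pandas code faster and more memory-efficient. This course covers measuring and profiling performance, vectorization, common mistakes, built-in pandas performance features, saving memory, fast serialization, Numba, Cython, and alternative DataFrame libraries such as Dask and Vaex.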

Syllabus

Introduction

pandas and performance
What you should know
Working with the files on GitHub

1. Overview

Why performance matters
Setting goals
Measuring performance
Profiling
Challenge: Identify bottleneck
Solution: Identify bottleneck

2. Vectorization

What is vectorization?
Boolean indexing
Understanding ufuncs
Challenge: Selecting and manipulating data
Solution: Selecting and manipulating data

3. Common Mistakes

The limitations of appending
The limitations of object dtype
The limitations of row iteration
Understanding the isin function
Parsing time once
Challenge: Query a DataFrame
Solution: Query a DataFrame

4. pandas Performance

Using built-in functions
Understanding eval and query
Understanding the join function
Challenge: Join and query
Solution: Join and query

5. Saving Memory

Why is memory important?
Measuring memory
Loading parts of data
Categorical data
Challenge: Reducing memory
Solution: Reducing memory

6. Fast Serialization

Various formats and why not CSV
Optimizing with SQL
Optimizing with HDF5
Challenge: Bike ride duration
Solution: Bike ride duration

7. Numba and Cython

What is Numba?
Using Numba
What's Cython?
Writing Cython code
Compiling Cython
%%cython magic
Challenge: Cython speedup
Solution: Cython speedup

8. Alternative DataFrames

Overview of alternative DataFrames
Using Dask
Using Vaex
Challenge: Vaex vs. pandas
Solution: Vaex vs. pandas

Conclusion

Next steps

Taught by

Miki Tebeka

Reviews

4.6 rating at LinkedIn Learning based on 82 ratings

