Overview
Explore the fascinating journey of CPython's list sorting function in this 30-minute PyCon US talk. Delve into the algorithmic ideas, engineering tricks, and trivia behind the latest updates. Discover how Tim Peters' Timsort, a clever Mergesort variant, replaced Quicksort and became widely adopted in various languages and frameworks. Learn about the two flaws discovered in Timsort's algorithm, including a potential stack overflow issue and suboptimal merge order performance. Understand how the Powersort merge policy, based on optimal alphabetic trees, addresses these challenges and improves efficiency. Follow the evolution from Quicksort to Timsort and finally to the new implementation in Python 3.11.0. Gain insights into stable sorting, CPython sorting history, merge policies, and the connection between Mergesort and Binary Search Trees. Suitable for algorithm enthusiasts, performance-oriented programmers, and Python users curious about the inner workings of list sorting.
Syllabus
Intro
Outline
Stable Sorting
CPython Sorting History
Timsort merge policy (original)
Invariant trouble
Timsort merge policy (patched)
Timsort bad case
Merge policies from first principles
Mergesort meets Binary Search Trees
Run-Boundary Powers are Local
Some performance data
Bonus: Multiway powersort
Conclusion
Taught by
PyCon US