Overview
Explore PySpark's capabilities for analyzing large datasets using World of Warcraft auction house data in this EuroPython 2015 conference talk. Learn how Apache Spark compares to NumPy, pandas, and scikit-learn in terms of performance, and when to use it for data analysis. Discover the architecture behind Spark and how it speeds up iterative processes on both Hadoop clusters and local machines. Follow along with a live demonstration that uses an IPython notebook to analyze a 22GB JSON dataset of World of Warcraft auction house information from multiple servers. Investigate whether basic economic principles apply in massively multiplayer online games through this unique data science application. Gain insights into big data processing, simple queries, and cost analysis using PySpark, while enjoying an engaging presentation that combines data science concepts with video game economics.
Syllabus
Intro
Overview
Warcraft
Big Data
Many Small Bits
Getting Started
Simple Queries
Cost
Results
Conclusion
Taught by
EuroPython Conference