Overview
Explore the evolution of Hadoop at Spotify in this 52-minute conference talk from GOTO Copenhagen 2015. Follow the challenges and lessons learned as Spotify's Hadoop cluster grew from a small office setup into a large-scale infrastructure. See how the company managed explosive growth and improved its data processing capabilities, including the move from Python to the JVM, the adoption of Crunch, and the introduction of tools such as Inviso and Apache Spark with Zeppelin. Gain insight into Hadoop's scalability, availability, and performance in a real-world, high-growth environment, and take away the key lessons from Spotify's use of big data technologies to power its music streaming service.
Syllabus
Introduction
Overview
What is Spotify?
Powered by Data
Moving Data to Hadoop
LogArchiver
Workflow Management Fail!
Hadoop Availability
How did we do?
What happened in the last quarter?
Lessons Learned
Going from Python to JVM
Moving from Python to Crunch
Crunch vs Hadoop Streaming Benchmark
Let's Review
Growth of Hadoop vs. Spotify Users
Explosive Growth
Inviso
Hadoop Report Card
Apache Spark with Zeppelin
Takeaways
Taught by
GOTO Conferences