Overview
Explore the architecture and capabilities of Druid, an analytics data store designed for OLAP queries on event data, in this 45-minute conference talk from Strange Loop. See how Druid addresses the challenge of powering interactive data applications at scale by delivering the millisecond-level query latencies that user-facing analytic applications require. Learn how Druid draws on ideas from Google's Dremel and PowerDrill, and how it uses columnar storage, a plugin architecture, and approximate algorithms. Survey the solution space for analytics, including relational databases, key/value stores, and general compute engines, and understand why many large technology companies are adopting Druid. The talk also covers Druid's batch and streaming ingestion, how the two combine in a lambda architecture, and where Druid fits in an end-to-end data stack. You will leave with a clear picture of Druid's strengths in powering interactive analytics.
Syllabus
Intro
History & Motivation
Use Cases
Business Intelligence Queries
Solution Space
Relational Database
Key/Value Stores
General Compute Engine
Column Stores
Raw Data
Summarization
Segmentation
Columnar Storage
Plugin Architecture
Approximate Algorithms
Architecture (Batch Ingestion)
Real-time Nodes
Architecture (Streaming Ingestion)
Architecture (Lambda)
End-to-end Data Stack
Integration
Takeaway
Taught by
Strange Loop Conference