Building an Open Source Streaming Analytics Stack with Kafka and Druid

Overview

Explore how to construct a streaming analytics stack using Kafka and Druid in this 41-minute conference talk from the Linux Foundation. Learn about the challenges of batch processing systems and discover how combining Kafka and Druid can create a robust data pipeline supporting real-time and batch ingestion with flexible, low-latency queries. Delve into topics such as event handling, data delivery problems, stream processing challenges, and approximation algorithms. Gain insights into Druid's architecture and understand how this open-source technology combination can guarantee system availability, maintain data integrity, and support fast, flexible queries for deriving insights from vast quantities of data.

Syllabus

Introduction
Overview
The Problem
Events
Example
Problems
Models
Data Delivery
Data Delivery Problems
Kafka Summary
Stream Processing
Stream Processing Challenges
Stream Processing System
Challenges
Subheading Queries
Technical Overview
Approximation Algorithms
Druid Architecture
Rules of Example
Raw Data
Shuffle
Join
Joint
Conclusions

Taught by

Linux Foundation

Related Courses

Building a High-Performance Real-Time Analytics Database with Apache Kafka and Druid

Embrace the Anarchy - Apache Kafka’s Role in Modern Data Architectures

Apache Kafka and KSQL in Action - Let’s Build a Streaming Data Pipeline

Building Real-time Systems with Open Source Technologies

Building a Real-Time AI Data Platform with Apache Kafka

Kafka Based Microservices with Akka Streams and Kafka Streams
