Overview
Explore a conference talk on DuckDB, an analytical relational database management system that runs in-process, inside the host application, rather than as a separate server. Learn how it supports multiple programming languages, including C/C++, Python, R, and Java, while offering full SQL capabilities and native support for the CSV, Parquet, and JSON formats. Discover the modern system architecture that enables parallel query execution and spilling to disk for larger-than-memory workloads. Through demonstrations, see how to process hundreds of gigabytes of data on a laptop or scale up to terabytes on a single server. The talk covers the in-process architecture, storage and execution internals, portability, and ergonomics, then surveys DuckDB's current state, available extensions, real-world use cases, limitations, and business model.
Syllabus
Intro
What is DuckDB?
Demo
What is DuckDB? (continued)
Demo
In-process architecture
Internals: Storage & execution
Portability
Ergonomics
State of DuckDB
Extensions
Use cases
Limitations
Business model
Summary
Outro
Taught by
GOTO Conferences