Learn advanced techniques for analyzing large data sets with SQL. Find out how to build sophisticated data models, optimize queries, extend SQL with user-defined functions, and more.
Overview
Syllabus
Introduction
- Advanced SQL techniques for data science
- What you should know
- Rules of normalization
- Denormalization
- Partitioning data
- Materialized views
- Read replicas
- Challenge: Design a data model for analytics
- Solution: Design a data model for analytics
- B-tree indexes
- Bitmap indexes
- Hash indexes
- GiST and SP-GiST indexes
- GIN and BRIN indexes
- Challenge: Choosing an optimal indexing strategy
- Solution: Choosing an optimal indexing strategy
- EXPLAIN and ANALYZE commands
- Generating data with generate_sequence
- Generating time series data
- Analyzing a query with WHERE clauses and indexes
- Analyzing a query with a join
- Challenge: Optimize a query using an explain plan
- Solution: Optimize a query using an explain plan
- Extending SQL with user-defined functions
- SQL query functions
- Function overloading
- Function volatility
- PL/Python functions
- Challenge: Write a user-defined function
- Solution: Write a user-defined function
- Federated queries
- Bloom filters
- Hstore for key-value pairs
- JSON for semi-structured data
- Hierarchical data and ltrees
- Challenge: Design a table to support unstructured data
- Solution: Design a table to support unstructured data
- Next steps
Taught by
Dan Sullivan