Overview
Explore techniques for designing and implementing efficient data pipelines with Celery, Redis, and signal-based triggering in this 25-minute conference talk from EuroPython 2024. Learn how to split pipelines into smaller, manageable components to improve fault tolerance and modularity and to simplify testing and debugging. See how Redis can serve as a data store and how Celery's signals can drive self-triggering pipelines that process data in batches while staying within API rate limits and system resource constraints. Compare this approach with traditional periodic tasks and see how it can increase data throughput and completeness. The talk also covers secondary benefits such as result persistence and reporting, which support data analysis and optimization in budget-sensitive environments. Come away with practical techniques for building more effective and maintainable data pipelines in your own Celery projects.
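The self-triggering, batch-limited pattern mentioned in the overview can be illustrated with a minimal, library-free sketch. This is not code from the talk and it does not use the Celery or Redis APIs: the `deque` named `queue` is a hypothetical stand-in for a Redis-backed store, `handlers` stands in for a task-success signal, and `BATCH_SIZE` is an assumed rate limit. All names are illustrative.

```python
from collections import deque

# Hypothetical stand-ins (assumptions, not Celery/Redis calls):
# `queue` plays the role of a Redis list of pending items, and
# `handlers` plays the role of receivers connected to a success signal.
BATCH_SIZE = 3  # assumed per-run limit, e.g. imposed by an API rate limit

queue = deque(range(8))  # pending work items
processed = []           # persisted batch results, kept for later reporting
handlers = []            # callables invoked after each successful batch


def process_batch():
    """Take at most BATCH_SIZE items, process them, then emit 'success'."""
    batch = [queue.popleft() for _ in range(min(BATCH_SIZE, len(queue)))]
    processed.append([item * 2 for item in batch])  # placeholder "work"
    for handler in handlers:
        handler(remaining=len(queue))


def on_success(remaining):
    """Re-trigger the pipeline only while data is still waiting."""
    if remaining > 0:
        process_batch()


handlers.append(on_success)
process_batch()  # one initial trigger; later runs are started by the handler
```

In this sketch the pipeline stops on its own once the store is empty, whereas a periodic task would run on a fixed schedule regardless of how much data is waiting. In a real deployment the handler would presumably enqueue a new task rather than call the function directly.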
Syllabus
Data pipelines with Celery: modular, signal-driven and manageable — Marin Aglić Čuvić
Taught by
EuroPython Conference