Overview
Explore the intricacies of IoT data management on AWS in this 20-minute conference talk. Dive into two primary use cases for captured metric data: real-time analysis and ad-hoc analysis. Learn about robust streaming frameworks, with a focus on Apache Flink and a brief discussion of Apache Beam. Examine best practices for data persistence, comparing various serialization formats and their suitability for different analysis scenarios. Gain insights into fully managed solutions like AWS Data Lake, weighing their pros and cons. Cover topics such as data volumes, short-term vs. long-term storage, Avro and Parquet formats, Flink UI and API, and AWS Data Lake implementation.
Syllabus
Intro
Data volumes
Short-term vs long-term
Why Avro
Why Parquet
Flink
UI
API
AWS Data Lake
Taught by
ChariotSolutions