Explore incremental change data capture (CDC) in this conference talk. See how to iterate on incremental ingestion from SaaS applications, relational databases, and event streams into a centralized data lake. Learn to make architecture decisions based on evidence and specific use cases, with long-term stewardship and developer happiness in mind. Follow the speaker's experience sourcing data from Salesforce, using Overwatch's insights to load-balance connectors and achieve significant cost savings. Compare three flavors of CDC, from naive to feature-rich, including batch polling and log streaming. Understand how query-based CDC and Lakehouse Federation can reduce maintenance overhead and eliminate bugs. See how Liquid Clustering addresses data skew across customers and improves write performance. Gain insights on streamlining maintenance and improving reliability with the latest Delta Lake features.
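For readers unfamiliar with query-based CDC by batch polling, the idea can be sketched in a few lines of Python. This is an illustration, not the speaker's implementation: the `accounts` table, its `updated_at` column, the integer timestamps, and the helper names `poll_changes` and `apply_changes` are all hypothetical, and an in-memory SQLite database and a plain dictionary stand in for the real source system and the data lake table.

```python
import sqlite3


def poll_changes(conn, watermark):
    """Return rows modified after `watermark`, plus the advanced watermark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM accounts "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Advance the watermark to the newest change seen; keep it if nothing changed.
    new_watermark = rows[-1][2] if rows else watermark
    return rows, new_watermark


def apply_changes(target, rows):
    """Upsert changed rows into the target, keyed by primary key."""
    for row_id, name, updated_at in rows:
        target[row_id] = {"name": name, "updated_at": updated_at}


# Hypothetical source table standing in for a SaaS or relational source.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
source.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "Acme", 10), (2, "Globex", 20)],
)

target = {}    # stand-in for the data lake table
watermark = 0  # nothing ingested yet

# First poll picks up the initial rows.
rows, watermark = poll_changes(source, watermark)
apply_changes(target, rows)

# A later update on the source is picked up by the next poll only.
source.execute("UPDATE accounts SET name = 'Acme Corp', updated_at = 30 WHERE id = 1")
rows, watermark = poll_changes(source, watermark)
apply_changes(target, rows)
```

The second poll reads only the row changed since the previous watermark, which is what makes the load incremental. A sketch like this also shows why polling is the naive end of the spectrum: it cannot see deleted rows, and a strict `>` comparison can miss rows that share a timestamp with the current watermark but are committed later.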