Overview
Explore a 47-minute conference talk from SDC2022 on building data platforms that democratize AI systems. Learn why running data processing and AI on separate platforms is a problem, and see a proposed architecture that converges software and hardware infrastructure so that data processing and training share one platform. Understand how distributed compute and storage platforms, parallel data processing, and deep learning framework connectors combine into more efficient AI solutions. Examine real-world examples in which this architecture improved pipeline efficiency for recommender system workloads such as DLRM, DIEN, and WnD, with significant performance gains on commodity CPU clusters. Presenter Jack Zhang of Intel also covers the intersection of big data and AI, the fundamentals of AI democratization, and strategies for building scalable data platforms as the foundation for accessible AI systems.
Syllabus
SDC2022 – Data Platform for End-to-End AI Democratization
Taught by
SNIAVideo