Overview
Explore a technical conference talk on how Vivvix and Databricks collaborated to scale video ad classification. Dong-Hwi Kim (Vivvix) and Puneet Jain (Databricks) walk through the implementation, which uses Ray on Databricks and Spark Structured Streaming to process video content across millions of categories. Learn how the team expanded its classification capabilities from 30,000 product classes to six million while addressing challenges in training time and limited data through open-source large language models (LLMs). See how integrating optimized Llama 2 models yielded a 15% accuracy improvement, and how the team uses LLMs as a pre-processing step for machine learning analysis. Gain practical insights for scaling AI classification systems to handle massive datasets and complex categorization requirements.
Syllabus
How Databricks and Vivvix Scale Video Ad Classification with Ray | Ray Summit 2024
Taught by
Anyscale