Explore a streaming algorithm for Euclidean k-median and k-means clustering in this 31-minute lecture by Samson Zhou from Texas A&M University. The algorithm uses space that is independent of both the stream size n and the aspect ratio Delta, and it processes each stream item in poly(log log(n*Delta)) time in the unit-cost RAM model. Learn about the technique behind these bounds, a new way of compressing merge-and-reduce trees, and how it also yields improved space and update time for approximate subspace embeddings in the streaming setting. The lecture also touches on sublinear graph simplification and its implications for efficient processing of high-dimensional data.
Overview
Syllabus
Fast Streaming Euclidean Clustering with Constant Space
Taught by
Simons Institute