Sketching Algorithms: Making Sense of Big Data in a Single Stroke

Conf42 via YouTube

Overview

Explore sketching algorithms for big data analysis in this conference talk from Conf42 Python 2024. The talk introduces sketches as approximate data structures and covers their characteristics, components, and advantages over exact computation. It examines the challenges of distributed processing and explains why sublinear data structure growth and mergeability matter in sketch design. It surveys several types of sketches, with a focus on the Count-Min Sketch algorithm, and introduces open-source sketching libraries such as Apache DataSketches and their extensions. You will learn how sketches handle non-additive problems such as unique counts, and why they answer big data questions faster than exact methods.
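For readers who want a concrete picture before watching, here is a minimal Count-Min Sketch in Python. It is an illustrative sketch only, not code from the talk and not the Apache DataSketches API. The class name `CountMinSketch`, the default `width` and `depth`, and the use of salted BLAKE2b hashes are all assumptions made for this example. It shows the two design properties named above: the table has a fixed size regardless of input volume, and two sketches built separately can be merged by adding their cells.

```python
import hashlib


class CountMinSketch:
    """Hypothetical fixed-size frequency sketch: may overcount, never undercounts."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        # depth rows of width counters; size does not grow with the input.
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Salting with the row number makes each row behave like a separate hash.
        digest = hashlib.blake2b(
            item.encode("utf-8"), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best estimate.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )

    def merge(self, other):
        # Mergeability: cell-wise addition equals sketching the combined input.
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must share width and depth to merge")
        for row in range(self.depth):
            for col in range(self.width):
                self.table[row][col] += other.table[row][col]


# Two workers sketch their own partitions, then the results are combined.
left, right = CountMinSketch(), CountMinSketch()
for word in "the quick brown fox jumps over the lazy dog".split():
    left.add(word)
for word in "the dog barks".split():
    right.add(word)
left.merge(right)
print(left.estimate("the"))  # at least 3, the true count
```

Because merging is plain addition, each partition can be sketched where the data lives and only the small tables need to be sent over the network.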

Syllabus

intro
preamble
hello
quix
quix streams
quix cloud
what is a sketch?
approximate answers
sketch characteristics
sketch components
why exact == slow
distributed processing
unique word count
massively parallel processing (MPP)
shuffling is slow
latency numbers every programmer should know
why sketches == fast
sketch design
sublinear data structure growth
mergeability
non-additive challenges are everywhere
unique counts are non-additive
non-additive challenges solved
types of sketches
count min sketch
open source sketches
apache datasketches (java, c++, python)
datasketch extensions
thank you

Taught by

Conf42
