Overview
This conference talk covers Bayesian nonparametric clustering, focusing on the mathematical and computational aspects of the Chinese Restaurant Process (CRP), a probabilistic generative model for clustering. The first part presents the theory and applications through equations and graphical examples. The second part gives a detailed account of optimizing the model for big data through multiple rounds of performance improvements in Scala, and highlights time-honored strategies for speeding up complex calculations that apply to many statistical models in big data settings. The speakers, Ryan Richt and Marisa Gioioso, are experts in software engineering, data science, and mathematics. Topics include mixture models, Dirichlet processes, functional programming, and performance optimization techniques, along with the challenges and successes of implementing advanced Bayesian methods in real-world applications. The talk is aimed at data scientists, engineers, and researchers interested in probabilistic machine learning.
Syllabus
Intro
About us
Why this topic
Overview
What is CRP
Clustering tree
No categories
Why not K-means
General Models
Mixture Model
Chinese Restaurant
Dinosaur Diamonds
Dirichlet Compound
Plate Diagram
Beta
Standard Trick
Why Markov
Priors
How to do it
Why it exploded
Reseeding
Profiler
Functional Programming
Fast Care
Immutable
C Function
Super Abstraction
Who are Bayesian Data Scientists
Performance Expectations
Performance Issues
Code Base
Taught by
Open Data Science