Overview
Syllabus
Key ideas of the paper
Abstract
A note on k-NN and non-parametric machine learning
Data and NPT setup explained
NPT loss is inspired by BERT
A high-level architecture overview
NPT jointly learns imputation and prediction
Architecture deep dive: input embeddings, etc.
More details on the stochastic masking loss
Connections to Graph Neural Networks and CNNs
NPT achieves strong results on tabular data benchmarks
NPT learns the underlying relational, causal mechanisms
NPT does rely on other datapoints
NPT attends to similar vectors
Conclusions
Taught by
Aleksa Gordić - The AI Epiphany