SQL for Efficient Data Organization in Machine Learning

Snorkel AI via YouTube

Overview

Explore how SQL can improve data organization for machine learning in this 11-minute video presentation by Columbia PhD student Zachary Huang. Learn about JoinBoost, a lightweight Python library that rewrites tree training algorithms over normalized databases as pure SQL queries. See how this approach addresses the mismatch between the way ML expects data to be organized and the way databases traditionally structure it, offering a simplified, all-in-one data stack. The talk also covers JoinBoost's compatibility with various DBMSs and data stacks, and how it outperforms specialized ML libraries like LightGBM in speed and scalability for random forests and gradient boosting.
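To make the idea of "tree training as SQL" more concrete, here is a minimal sketch of how one step of decision tree training, choosing a split threshold for a regression tree, might be pushed into a SQL aggregate over a join of normalized tables. This is not JoinBoost's API or its actual algorithm. The tables `sales` and `items`, their columns, and the helpers `build_demo_db` and `best_split` are hypothetical names invented for illustration, and SQLite stands in for whatever DBMS is in use.

```python
import sqlite3


def build_demo_db():
    """Create a tiny normalized database (hypothetical schema, made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE items (item_id INTEGER PRIMARY KEY, price REAL);
        CREATE TABLE sales (item_id INTEGER, y REAL);
        INSERT INTO items VALUES (1, 10.0), (2, 20.0), (3, 30.0);
        INSERT INTO sales VALUES (1, 1.0), (1, 1.0), (2, 1.0), (3, 5.0), (3, 5.0);
        """
    )
    return conn


def best_split(conn):
    """Return (threshold, gain) for the best split of the form price <= threshold.

    The database computes per-value counts and target sums over the join;
    Python only scans the small aggregated result.
    """
    rows = conn.execute(
        """
        SELECT i.price, COUNT(*) AS cnt, SUM(s.y) AS total
        FROM sales AS s
        JOIN items AS i ON s.item_id = i.item_id
        GROUP BY i.price
        ORDER BY i.price
        """
    ).fetchall()

    n = sum(cnt for _, cnt, _ in rows)
    total = sum(t for _, _, t in rows)

    best = None
    left_n, left_sum = 0, 0.0
    # The largest value is skipped: splitting there leaves the right side empty.
    for value, cnt, subtotal in rows[:-1]:
        left_n += cnt
        left_sum += subtotal
        right_n = n - left_n
        right_sum = total - left_sum
        # Reduction in squared error, written in terms of counts and sums only.
        gain = left_sum**2 / left_n + right_sum**2 / right_n - total**2 / n
        if best is None or gain > best[1]:
            best = (value, gain)
    return best


if __name__ == "__main__":
    threshold, gain = best_split(build_demo_db())
    print(f"split at price <= {threshold} (gain {gain:.2f})")
```

The design point in this sketch is that the joined table is never materialized outside the database: only one aggregated row per distinct feature value leaves the DBMS, and the split criterion needs nothing more than those counts and sums.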

Syllabus

Introduction
Background
Example
Problem Statement

Taught by

Snorkel AI
