

OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Models

USENIX via YouTube

Overview

Explore an approach to parallelizing embedding tables (EMTs) for large-scale recommendation models in this 20-minute conference talk from USENIX ATC '24. The talk presents OPER, an algorithm-system co-design that addresses the challenges of deploying Deep Learning Recommendation Models (DLRMs) across multiple GPUs. Learn how OPER's optimality-guided EMT parallelization improves on existing methods by accounting for input-dependent behavior, which yields a more balanced workload distribution and less inter-GPU communication. The talk covers the heuristic search algorithm used to find a near-optimal EMT parallelization and the distributed shared memory-based system that supports fine-grained EMT parallelization. OPER is reported to achieve average speedups of 2.3× in training and 4.0× in inference compared to state-of-the-art DLRM frameworks.
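To make the idea of input-dependent workload balancing concrete, here is a minimal Python sketch. It is not OPER's algorithm, which the overview does not spell out; it swaps in a plain greedy "heaviest table first" heuristic. The function names `estimate_table_costs` and `greedy_place`, the batch layout, and the use of lookup counts as the cost are all assumptions made for illustration.

```python
import heapq


def estimate_table_costs(batch):
    """Estimate per-table cost as the number of lookups in one batch.

    `batch` (hypothetical layout) maps a table name to the list of row
    indices looked up in that table, so the cost depends on the input.
    """
    return {table: len(indices) for table, indices in batch.items()}


def greedy_place(costs, num_gpus):
    """Place whole tables on GPUs, heaviest first, onto the least-loaded GPU.

    Illustrative stand-in only: a simple greedy heuristic, not the
    heuristic search described in the talk.
    """
    heap = [(0, gpu) for gpu in range(num_gpus)]  # entries are (load, gpu_id)
    heapq.heapify(heap)
    placement = {}
    # Sort by descending cost; break ties by name for deterministic output.
    for table, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, gpu = heapq.heappop(heap)
        placement[table] = gpu
        heapq.heappush(heap, (load + cost, gpu))
    # Recompute the final per-GPU load from the placement.
    loads = [0] * num_gpus
    for table, gpu in placement.items():
        loads[gpu] += costs[table]
    return placement, loads


if __name__ == "__main__":
    batch = {
        "user": list(range(8)),
        "item": list(range(5)),
        "ad": list(range(4)),
        "geo": list(range(3)),
    }
    placement, loads = greedy_place(estimate_table_costs(batch), num_gpus=2)
    print(placement, loads)
```

Because the costs come from the batch itself, a different input mix can change the placement. This sketch only moves whole tables; the fine-grained parallelization mentioned above goes further than that.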

Syllabus

USENIX ATC '24 - OPER: Optimality-Guided Embedding Table Parallelization for Large-scale...

Taught by

USENIX

