How Roblox Scaled Machine Learning by Leveraging Ray for Efficient Batch Inference

Overview

Explore a Ray Summit 2024 conference talk in which Roblox engineers Steve Han, Wei Zeng, and Yiqing Wang show how they scaled machine learning operations using Ray's batch inference capabilities. Gain insight into how Roblox approached expanding its ML infrastructure, with particular emphasis on recent work with multimodal language models. Learn how multimodal model support was implemented in vLLM, an open-source LLM inference and serving project that has attracted substantial community attention. Discover practical solutions to the scaling challenges encountered during integration, and see how these lessons apply to large-scale ML deployments on gaming platforms. Real-world examples of running these technologies at scale illustrate best practices for similar infrastructure scaling efforts.

Syllabus

How Roblox Scaled Machine Learning by Leveraging Ray for Efficient Batch Inference | Ray Summit 2024

Taught by

Anyscale


Related Courses

Roblox's Journey to Supporting Multimodality on vLLM - Ray Summit 2024

The Evolution of Multi-GPU Inference in vLLM

Databricks' vLLM Optimization for Cost-Effective LLM Inference - Ray Summit 2024

Scaling Generative AI Models for Millions of Users in Roblox

High-Performance AI Model Serving with Ray Serve - A Rubrik Case Study

The State of vLLM - Advancements in LLM Inference and Serving
