High-Performance AI Model Serving with Ray Serve - A Rubrik Case Study

Overview

This Ray Summit 2024 conference talk features Rubrik engineers Shaikh Ismail and Shivanshu Agrawal explaining how they used Ray Serve to run high-performance AI inference at scale. The talk follows their adoption of Ray's model serving library to handle millions of evaluations per day while meeting strict scalability and throughput requirements. It covers the features that led Rubrik to choose Ray Serve for online inference, along with practical lessons on fault tolerance, robustness, and Kubernetes deployment. The material is relevant to teams looking to improve their AI serving infrastructure for high-stakes, real-time applications.

Syllabus

How Rubrik Unlocked AI at Scale with Ray Serve | Ray Summit 2024

Taught by

Anyscale

Related Courses

Klaviyo's Journey to Robust Model Serving with Ray Serve

Faster Model Serving with Ray and Anyscale - Ray Summit 2024

Scaling Inference Deployments with NVIDIA Triton Inference Server and Ray Serve

Building Intelligent AI Infrastructure with ORI - Dynamic Query Routing and Model Management

Introduction to Model Deployment with Ray Serve

Scaling LLM Inference - AWS Inferentia Meets Ray Serve on EKS
