Exploring ML Model Serving with KServe - Features and Use Cases
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore machine learning model serving with KServe in this conference talk, illustrated with fun drawings. Dive into the fundamentals of KServe, an easy-to-use platform built on Kubernetes for deploying ML models. Learn about its high-level abstraction interfaces, performant solutions to common infrastructure problems, and features like GPU scaling and ModelMesh serving. Discover how KServe simplifies model deployment for data scientists and engineers, freeing them to focus on building new models. Examine key components such as the Predictor, Control Plane, Data Plane, and Inference Graph. Understand KServe's standard inference protocol, data plane plugins, and monitoring capabilities. Explore use cases, multi-model serving, and the project's roadmap toward its v1.0 release. Gain insights into KServe's evolution since 2019 and the new functionality that addresses the needs of ML practitioners.
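The talk covers KServe's standard inference protocol. As a rough sketch of what a request under the v2 (Open Inference Protocol) REST API looks like — the input name, data values, and host below are placeholders, not taken from the talk:

```python
import json

def build_v2_infer_request(input_name, data):
    """Build a request body for KServe's v2 (Open Inference Protocol) REST API.

    The payload is a list of named tensors, each with a shape,
    datatype, and data array.
    """
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [len(data), len(data[0])],
                "datatype": "FP32",
                "data": data,
            }
        ]
    }

# Example: two rows of a hypothetical 4-feature input.
payload = build_v2_infer_request(
    "input-0",
    [[6.8, 2.8, 4.8, 1.4],
     [6.0, 3.4, 4.5, 1.6]],
)
print(json.dumps(payload, indent=2))
# A real client would POST this JSON to
#   http://<host>/v2/models/<model-name>/infer
# on the deployed InferenceService.
```

The same protocol is served by multiple runtimes, which is what lets KServe swap serving backends without changing clients.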
Syllabus
Introduction
Features
Predictor
Control Plane
Replicas
CoopTree
Data Plane
Inference Graph
Standard Inference Protocol
Data Plane Plugins
Monitor Logger
Serving Runtime
Multi-Model Serving
Use Cases
Inference Service
Conclusion
Questions
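The syllabus item on the InferenceService resource — KServe's central custom resource for deploying a model — can be sketched as a minimal manifest. Field names follow KServe's v1beta1 API; the service name, model format, and storage URI below are illustrative placeholders, not values from the talk:

```python
import json

# A minimal InferenceService manifest, expressed as a Python dict for
# illustration. Applying the equivalent YAML with kubectl would ask
# KServe to deploy a predictor serving the model at storageUri.
inference_service = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "sklearn-iris"},  # hypothetical service name
    "spec": {
        "predictor": {
            "model": {
                "modelFormat": {"name": "sklearn"},
                # Placeholder model location; any supported storage
                # backend (gs://, s3://, pvc://, ...) works here.
                "storageUri": "gs://example-bucket/models/sklearn/model",
            }
        }
    },
}

print(json.dumps(inference_service, indent=2))
```

This high-level interface is the abstraction the talk highlights: the data scientist declares the model format and location, and KServe's control plane picks a serving runtime and wires up scaling, networking, and monitoring.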
Taught by
CNCF [Cloud Native Computing Foundation]