Serving Machine Learning Models at Scale Using KServe
CNCF [Cloud Native Computing Foundation] via YouTube
Syllabus
Introduction
Background about KServe
Milestones
Model Development
Challenges
KServe
KServe Components
Standard Inference Protocol
HTTP Protocol
gRPC Protocol
New Scalability Problem
Current Approach
Problem
Compute resource limitations
Maximum pod limitations
Maximum IP address limitations
ModelMesh Solution
Performance Test
Latency Test
ModelMesh
Roadmap
Questions
Original Design
Taught by
CNCF [Cloud Native Computing Foundation]