Building an ML Inference Platform with Knative - Serverless Containers on Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore the development of a machine learning inference platform using Knative in this conference talk. Learn how Bloomberg LP and IBM leveraged Knative's serverless capabilities to simplify and accelerate the deployment and scaling of ML-driven applications in production environments. Discover the advantages of Knative for running serverless containers on Kubernetes, including automated networking, request-volume-based autoscaling, and revision tracking. Gain insights into the evolution of the KServe project and how Knative enables blue/green/canary rollout strategies for safe ML model updates. Understand how to improve GPU utilization with scale-to-zero functionality and build Apache Kafka event-based inference pipelines. Examine testing benchmarks comparing Knative to the Kubernetes HPA and learn performance optimization tips for running numerous Knative services in a single cluster.
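As a rough illustration of two features the talk covers, a Knative Service manifest can combine a canary traffic split across revisions with scale-to-zero for idle GPU workloads. This is a minimal sketch; the service name, image, and revision names are hypothetical:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: iris-predictor        # hypothetical model service name
spec:
  template:
    metadata:
      annotations:
        # Allow the revision to scale down to zero pods when idle,
        # releasing the GPU until the next inference request arrives.
        autoscaling.knative.dev/min-scale: "0"
    spec:
      containers:
        - image: example.com/models/iris-predictor:v2   # hypothetical image
          resources:
            limits:
              nvidia.com/gpu: "1"
  traffic:
    # Canary rollout: keep 90% of traffic on the previous revision
    # and send 10% to the newly deployed one.
    - revisionName: iris-predictor-00001
      percent: 90
    - latestRevision: true
      percent: 10
```

Shifting the `percent` values gradually promotes the new model revision, and Knative's revision tracking makes rolling back a one-line change.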
Syllabus
How We Built an ML Inference Platform with Knative - Dan Sun, Bloomberg LP & Animesh Singh, IBM
Taught by
CNCF [Cloud Native Computing Foundation]