Overview
Explore how IBM and Red Hat are jointly building a cloud-native platform for AI in this 33-minute conference talk from Ray Summit 2024. Carlos Costa of IBM Research walks through the complex workflows involved in training and deploying foundation models and the challenges they pose in the current AI/ML landscape. Learn how IBM and Red Hat use KubeRay, Ray, and PyTorch to build a scalable, high-performance platform for the end-to-end life cycle of foundation models, one that scales from hundreds to thousands of GPUs on IBM's Vela supercomputer. See how this technology is integrated into Red Hat OpenShift AI and how it was applied in a partnership with NASA that produced a pioneering geospatial foundation model. Gain insights into building comprehensive, cloud-native AI infrastructure that meets the demands of modern AI/ML workflows.
Syllabus
IBM's Approach to Building a Cloud-Native AI Platform | Ray Summit 2024
Taught by
Anyscale