Overview
Explore the challenges of evaluating large language models in a hybrid cloud environment and how Ray helps address them. Learn how IBM's CodeFlare project uses Ray to build a unified pipeline for multi-task evaluation, with easier auto-scaling, better resource management, and workflow unification. Discover how this approach streamlines the evaluation of large-scale neural language models across diverse downstream tasks, reducing time and complexity. Follow the journey from problem identification to a demonstration of a large-scale language model evaluation pipeline running in a hybrid cloud with auto-scaling. Understand how Ray unifies workflow pipelines with minimal code modifications while improving dependency management and overall performance.
Syllabus
Introduction
Agenda
What are foundation models?
Scale-out middleware
Foundation Pipeline
Summary
Rework Flow
Model Pipeline
Finetuning
Model serving
Taught by
Anyscale