Overview
Explore a conference session from AWS re:Invent 2024 on deploying and optimizing generative AI workloads on Amazon Elastic Kubernetes Service (EKS). Discover practical solutions to common deployment challenges using open source tools on Kubernetes, with emphasis on making full use of GPU acceleration, scaling efficiently, and keeping response latency low. See real-world case studies of successful implementations, and learn best practices for cost management and performance optimization in generative AI deployments. Understand the infrastructure requirements and technical considerations for running high-performance AI models on Amazon EKS.
Syllabus
AWS re:Invent 2024 - High-performance generative AI on Amazon EKS (KUB314)
Taught by
AWS Events