Linux Foundation

Investigating Checkpoint and Restore for GPU-Accelerated Containers

Linux Foundation via YouTube

Overview

This 39-minute conference talk by Nan Lu (Microsoft) and Adrian Reber (Red Hat) examines checkpoint and restore for GPU-accelerated containers. The technology is still nascent, and the talk covers early investigations and proofs of concept aimed at making better use of costly GPUs and time-intensive model training runs. It reviews the functionality that exists today, identifies the gaps in the ecosystem that must be closed to enable the solution, and discusses the challenges and opportunities of applying checkpoint and restore to GPU-powered containers, including how the approach could change resource management in high-performance computing environments.

Syllabus

Investigating Checkpoint and Restore for GPU-Accelerated Containers - Nan Lu & Adrian Reber

Taught by

Linux Foundation
