Building a High Performance Network in the Public Cloud Using RDMA - First Principles

Oracle via YouTube

Overview

Explore how Oracle Cloud Infrastructure (OCI) architects use Remote Direct Memory Access (RDMA) to deliver high-performance, low-latency networking in this 40-minute video from Oracle's First Principles series. The video covers what RDMA is, its history at OCI, and why it is challenging. It explains the importance of RoCE (RDMA over Converged Ethernet), its pitfalls, and how OCI overcomes them, including limited use of PFC (Priority Flow Control). It also covers how OCI tailors QoS for multiple workloads, tunes ECN (Explicit Congestion Notification) for HPC, GPU, and database workloads, and why a separate RDMA network is needed. Performance optimizations discussed include flow-aware traffic distribution and traffic locality optimization. The video closes with what differentiates OCI's RDMA network and how it balances scale and latency for demanding workloads.

Syllabus

Introduction to OCI Cluster Networks
What is RDMA?
History of RDMA at OCI
Why is RDMA Challenging?
Importance of RoCE
Pitfalls of RoCE
Overcoming Pitfalls of RoCE
Limited use of PFC
Tailored QoS for multiple workloads
How to use ECN in RDMA networks
Tuning ECN to HPC workloads
Tuning ECN to GPU and DB workloads
Are OCI Cluster Networks in the same network?
Why do we need a separate RDMA network?
Performance optimizations for workloads
Flow-aware traffic distribution
Traffic locality optimization
Traffic topology information vending service
Why the OCI RDMA network is better and differentiated
Balancing scale and latency

Taught by

Oracle
