
The nanoPU - A Nanosecond Network Stack for Datacenters

USENIX via YouTube

Overview

Explore a groundbreaking NIC-CPU co-design called the nanoPU, aimed at accelerating datacenter applications that rely on large numbers of small Remote Procedure Calls (RPCs) with microsecond-scale processing times. Delve into the innovative fast path that bypasses the cache and memory hierarchy, placing incoming messages directly into the CPU register file.

Discover the programmable hardware support for low-latency transport, congestion control, and efficient RPC load balancing across cores. Learn about the hardware-accelerated thread scheduler that makes sub-nanosecond scheduling decisions, improving CPU utilization and reducing RPC tail response times.

Examine the FPGA prototype, built by modifying an open-source RISC-V CPU, and its evaluation through cycle-accurate simulations on AWS FPGAs. Compare the nanoPU's wire-to-wire RPC response time of just 69 ns to that of commercial NICs, and understand how it significantly improves both RPC tail response time and the load the system can sustain. Investigate the implementation and evaluation of applications including MICA, Raft, and set algebra for document retrieval, and learn how the nanoPU serves as a high-performance, programmable alternative to one-sided RDMA operations.
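To make the fast-path idea concrete, here is a minimal, purely illustrative sketch in Python. It is not the nanoPU API: it models, under stated assumptions, how an arriving RPC message could be delivered straight into a small register window (bypassing the memory hierarchy) and handled by a thread that reads and writes only those registers. The names `deliver_to_registers`, `rpc_echo_handler`, and the register budget are all hypothetical.

```python
# Hypothetical model (not the actual nanoPU interface) of the
# register-file fast path: an incoming message lands directly in a
# small register window, and the RPC handler works on registers only.

REGISTER_FILE_WORDS = 8  # assumed per-message register budget


def deliver_to_registers(message_words):
    """NIC side: place an incoming small RPC directly into registers.

    In the nanoPU this happens in hardware, bypassing DRAM and caches;
    here a plain list stands in for the receive register window.
    """
    if len(message_words) > REGISTER_FILE_WORDS:
        raise ValueError("message exceeds the register window")
    return list(message_words)


def rpc_echo_handler(registers):
    """CPU side: read the request from registers, produce the reply.

    Toy RPC body: increment each word. A real handler would run for
    on the order of a microsecond before writing its reply registers.
    """
    return [w + 1 for w in registers]


# One wire-to-wire round trip for a small RPC.
request = [10, 20, 30]
reply = rpc_echo_handler(deliver_to_registers(request))
```

The point of the sketch is the data path: the handler never touches a simulated memory buffer, which is the property that lets the hardware achieve nanosecond-scale wire-to-wire latencies.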

Syllabus

Introduction
Trends
The nanoPU
Prototype
Applications
One-sided RDMA
Conclusion

Taught by

USENIX

