Intro
maybe even the fastest in the world?
Who wants these machines?
OAK IBM POWER
Intel x86
Summit: Science research Astrophysics Materials Cancer Research Systems Biology
Titan: 2012, 27 petaflops
Metric household
Summit: 13 MegaWatts
Summit: USD $200 Million
550 households
1 Sydney house
Summit: 300 km of cables
Sierra: National Nuclear Security Administration's Stockpile Stewardship Mission
How do you build this thing?
IBM: 2 types of computer • Infrastructure • Compute
POWER8 based?
100Gbps Networking
Mellanox CX-5
Hybrid approach: CPUs + GPUs
Compute: Witherspoon AC922
How do we build them?
Timelines?
Sierra release: December 2017
Infrastructure nodes are first
Linux • Firmware • Systems • GPU interfaces
24 Core SMT4
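As a quick worked example of what "24 Core SMT4" means for software: each core runs four hardware threads, so an operating system would see cores times SMT ways logical CPUs per chip. The function name below is made up for this sketch.

```python
# Hardware threads exposed by one chip: cores multiplied by SMT ways.
CORES_PER_CHIP = 24  # from the slide
SMT_WAYS = 4         # SMT4: four hardware threads per core


def hardware_threads(cores: int, smt_ways: int) -> int:
    """Number of logical CPUs an OS would enumerate for one chip."""
    return cores * smt_ways


threads_per_chip = hardware_threads(CORES_PER_CHIP, SMT_WAYS)
print(threads_per_chip)
```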
8 Billion transistors
POWER9 is a major refresh of POWER
Major Architectural changes: • Radix/Linux Based MMU • New interrupt controller • Direct attach DDR4 DIMMs
New Slice Microarchitecture
First through 14nm fab
POWER9 chip development
Minor releases too
DD1: January 2017
Planning for Linux and Firmware
Design: Radix MMU
Radix MMU: • Simpler • Better performance • KVM allocations
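To make "Radix MMU" concrete, here is a minimal sketch of a radix-tree page-table walk: the virtual address is cut into one index per tree level plus a page offset, and translation is one lookup per level. The bit widths, the 64 KiB page size, and every name here are assumptions chosen for illustration; nested dictionaries stand in for the in-memory tables, and none of this is taken from the talk or from kernel code.

```python
# Toy radix page table. Bit widths per level (top to bottom) and the page
# size are illustrative assumptions, not values quoted in the talk.
LEVEL_BITS = (13, 9, 9, 5)
PAGE_SHIFT = 16  # 64 KiB pages


def split_address(vaddr):
    """Return (per-level indices, page offset) for a virtual address."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    rest = vaddr >> PAGE_SHIFT
    indices = []
    # Peel indices off from the lowest level upwards, then restore order.
    for bits in reversed(LEVEL_BITS):
        indices.append(rest & ((1 << bits) - 1))
        rest >>= bits
    indices.reverse()
    return tuple(indices), offset


def map_page(root, vaddr, frame):
    """Install a mapping from the page containing vaddr to a frame number."""
    indices, _ = split_address(vaddr)
    node = root
    for idx in indices[:-1]:
        node = node.setdefault(idx, {})  # allocate intermediate levels lazily
    node[indices[-1]] = frame


def translate(root, vaddr):
    """Walk the tree; return the physical address, or None if unmapped."""
    indices, offset = split_address(vaddr)
    node = root
    for idx in indices:
        node = node.get(idx)
        if node is None:
            return None  # a real MMU would raise a page fault here
    return (node << PAGE_SHIFT) | offset
```

For example, after `root = {}` and `map_page(root, 0x123456789ABC, 0x42)`, calling `translate(root, 0x123456789ABC)` walks four levels and returns the frame number shifted up with the original offset, while an address in an unmapped page returns `None`.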
Simulation: • Functional • Cycle Accurate
Teach Linux basic feature
Bringup: Everything is broken
Get Linux up
Bringup: • Identify issues • Work around • Get out of the way • Find real fix
Develop items that need real hardware
Testing • More systems • Systems getting more sophisticated • Devs - Machines further separated
Release: Yay!
Staged release
POWER9 not backwards compatible with POWER8
IBM - Red Hat strong relationship
IBM & Red Hat partnered on RHEL7 for POWER8
Deliver Linux to customers
End of Moore's Law
Drive accelerators
Binary Linux kernel driver
Helped prove out: • Link training • Firmware
Coherent memory
CUDA Unified memory
Design • IOMMU looks like PCIe ATS • IOMMU directly uses Radix MMU
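The design bullets can be pictured with a toy model: the device asks for translations out of the same page table the CPU uses, caches them (the ATS-like part), and drops a cached entry when the OS unmaps the page. This is only a conceptual sketch. The class names, the flat dictionary standing in for the page table, and the 64 KiB page size are all assumptions, not the real hardware or driver interface.

```python
PAGE_SHIFT = 16  # assumed 64 KiB pages, for illustration only
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class AddressSpace:
    """One process page table, shared by the CPU and attached devices."""

    def __init__(self):
        self.pages = {}    # virtual page number -> physical frame number
        self.devices = []  # devices that may hold cached translations

    def map(self, vpn, pfn):
        self.pages[vpn] = pfn

    def unmap(self, vpn):
        self.pages.pop(vpn, None)
        for dev in self.devices:
            dev.invalidate(vpn)  # keep device caches coherent with the table

    def translate(self, vaddr):
        """CPU-side translation straight from the page table."""
        pfn = self.pages.get(vaddr >> PAGE_SHIFT)
        if pfn is None:
            return None
        return (pfn << PAGE_SHIFT) | (vaddr & OFFSET_MASK)


class Device:
    """A device that requests translations and caches the answers."""

    def __init__(self, space):
        self.space = space
        self.cache = {}
        self.requests = 0  # how many times the shared table was consulted
        space.devices.append(self)

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        if vpn not in self.cache:
            self.requests += 1
            pfn = self.space.pages.get(vpn)
            if pfn is None:
                return None
            self.cache[vpn] = pfn
        return (self.cache[vpn] << PAGE_SHIFT) | (vaddr & OFFSET_MASK)

    def invalidate(self, vpn):
        self.cache.pop(vpn, None)
```

Because both sides read one table, a pointer valid on the CPU resolves to the same physical address on the device, which is the property a unified-memory programming model relies on.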
Simulation with P9
Bringup: March 2017
Testing: Data integrity
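The slide does not say how data integrity was tested. As a hypothetical sketch of the general idea, one can push a reproducible pattern through a transfer path and compare digests of what went in and what came out. The `good_link` and `flaky_link` functions are stand-ins invented here for a real CPU-to-GPU copy.

```python
import hashlib
import random


def make_pattern(size, seed):
    """Reproducible pseudo-random buffer, so a failure can be replayed."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


def digest(data):
    return hashlib.sha256(data).hexdigest()


def check_transfer(transfer, size=4096, seed=0):
    """Return True if the data survives the transfer path unchanged."""
    src = make_pattern(size, seed)
    dst = transfer(src)
    return digest(src) == digest(dst)


def good_link(data):
    """Stand-in for a healthy transfer: a plain copy."""
    return bytes(data)


def flaky_link(data):
    """Stand-in for a faulty transfer: flips one bit in byte 100."""
    corrupted = bytearray(data)
    corrupted[100] ^= 0x01
    return bytes(corrupted)


print(check_transfer(good_link), check_transfer(flaky_link))
```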
Baseboard Management Controller
Little computer that turns on your big computer
Firmware?
Infrastructure nodes: Supermicro based BMC
Compute nodes: OpenBMC
Compute nodes: first OpenBMC release
Like a distro
Features: • On/Off • Monitor
Solutions
Pervasive
So how did it end up?
Fastest computer in the world?
Presented at linux.conf.au