Latency Distributions and Micro-Benchmarking to Identify and Characterize Kernel Hotspots

USENIX via YouTube


Classroom Contents


  1. Intro
  2. Why Large Bare Metal Boxes? • Faster local communication (UNIX domain sockets, shared memory)
  3. The Scale in our Department • 100K processes across hundreds of physical machines
  4. SysV semaphore bottleneck (AIX)
  5. Observations and Findings • AIX CPU measurement with hyper-threading is very misleading; no "out of the box" metrics on SysV IPC operations; sporadic slowness (depending on concurrency/contention)
  6. SysV shared memory bottleneck (Linux) • Low-level application infrastructure code dropping messages; messaging leverages a form of "zero copy" IPC using SysV
  7. SysV shared memory bottleneck (Linux RHEL 6) • The micro-benchmark
  8. Case #2: Observations and Findings • No "out of the box" metrics on SysV IPC operations
  9. UNIX domain socket bottleneck (Solaris) • Critical software infrastructure experiencing timeouts on load; identity management with very strict SLOs; narrowing down the problem; a key SLI for the service…
  10. An Aside: Histograms and Distributions are Useful! • More representative of the data set
  11. An Aside: A Histogram Example
  12. Early Observations • No "out of the box" metrics on socket operations
  13. Case #3: UNIX domain socket bottleneck (Solaris) • The micro-benchmark; t-testing against size
  14. Case #3: Conclusions • Solaris 11.3 is limited to a max of 256K UDS sockets
  15. Task clone and exit bottleneck (Linux)
  16. More Summary (Plea to Kernel Folks) • The Prime Directive of Monitoring: Non-interference
  17. References
