Optimizing Cost and Performance with Arm64

USENIX via YouTube

Overview

Follow Honeycomb.io, a Series B startup in the observability space, as it evaluates and adopts the arm64 processor architecture to reduce the cost and improve the performance of its telemetry ingest and indexing workload. The talk covers how the evaluation was set up, how the full migration proceeded, and what improvements were made to the ecosystem over a year-long period. In that time, 92% of all compute workloads were migrated to arm64, resulting in a 40% drop in compute costs and modest improvements in end-user-visible latency. It also covers the roadblocks encountered along the way: incomplete software compatibility, hidden performance quirks, and added complexity. Topics include the history of processor architectures, the efficiency of ARM, and the importance of Service Level Objectives (SLOs) for user flows; the service architecture, including the Shepherd ingest API service and Retriever; the steps taken to migrate production environments; and the impact of AWS instance availability and Kafka on the migration. The talk closes with lessons learned: set measurable goals, acknowledge hidden risks, take care of your team, and optimize for safety in large-scale migrations.

Syllabus

Intro
WTF is architecture? Why multiarch?
History: 80s, 90s, 00s, 10s, and beyond
If it ain't broke...
ARM is more efficient.
Data storage engine and analytics tool
Service Level Objectives (SLO)
SLOs are user flows
Same reliability, lower costs with ARM64
Complexity stayed manageable
Prod: customers observe data
Kibble observes dogfood
Dogfood observes prod
Service Architecture
Shepherd: ingest API service
Is it feasible to migrate?
Producing artifacts for Arm64
Initial findings
A/B testing
Dogfood Shepherd cost reduction
Migrated prod Shepherd
Migrated prod Retriever
AWS ran out of m6gd spot instances
Kafka + the long tail
Graviton2 going strong
Have a measurable goal in mind
Acknowledge hidden risks
Take care of your people
Optimize for safety
Graviton2 blog posts

Taught by

USENIX
