Overview
This Strange Loop conference talk uses a case study, building a search engine from scratch, to show how to reason about performance before any code is written. Learn how back-of-the-envelope calculations that take hours can estimate the performance of applications that take months or years to implement. Examine a core search algorithm used in Bing, how its performance was initially estimated, and how that estimate compared to the actual results. Find out why certain algorithms, despite having reasonable performance characteristics, have been considered obsolete as core search engine technology for almost two decades. Topics include data volume management, query optimization, false positive probabilities, and hierarchical bloom filters. The performance reasoning covered here applies to a wide range of large-scale software projects.
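As a rough illustration of the kind of back-of-the-envelope estimate the talk describes, the sketch below computes the false positive probability of a plain Bloom filter using the standard textbook approximation p ≈ (1 - e^(-kn/m))^k. It is not taken from the talk and is not Bing's actual method; the function names and the sample sizes are assumptions chosen for illustration only.

```python
import math


def bloom_false_positive_rate(n_items: int, m_bits: int, k_hashes: int) -> float:
    """Approximate false positive probability of a standard Bloom filter.

    Uses the textbook approximation p ~= (1 - e^(-k*n/m))^k, where n is the
    number of inserted items, m the number of bits, and k the number of hashes.
    """
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def optimal_hash_count(n_items: int, m_bits: int) -> int:
    """Hash count that approximately minimizes the false positive rate: (m/n) * ln 2."""
    return max(1, round(m_bits / n_items * math.log(2)))


if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of terms stored in one filter
    for bits_per_item in (4, 8, 16, 32):
        m = n * bits_per_item
        k = optimal_hash_count(n, m)
        p = bloom_false_positive_rate(n, m, k)
        print(f"{bits_per_item:>2} bits/item, k={k:>2}: p ~= {p:.2e}")
```

Under this approximation, each doubling of the bits spent per item shrinks the false positive rate by orders of magnitude, which is one way to read the syllabus entry about linear cost and exponential benefit.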
Syllabus
Thinking about performance
Search: a case study
Coding feels like real work
What's the actual problem?
$2000 for 128GB server
Search Algorithms
What's the problem again?
10s of billions per shard!
Probability of false positive?
Linear cost, exponential benefit
How do we estimate perf?
Expected performance?
Ingestion
Hierarchical bloom filters
Conclusions?
Conclusions!
Taught by
Strange Loop Conference