Overview
Explore design tradeoffs for SSD reliability in this 24-minute conference talk from FAST '19. Delve into the challenges posed by increasingly unreliable flash memory and examine various in-device reliability enhancement techniques. Learn about the multi-dimensional requirements for SSDs, including performance, reliability, and lifetime. Analyze how the uncoordinated use of techniques such as data re-read, intra-SSD redundancy, and data scrubbing affects SSD performance. Discover a proposed holistic reliability management scheme that selectively employs redundancy, conditionally re-reads, and judiciously selects data to scrub. Gain insights into flash memory errors, error modeling, and error correction codes. Evaluate the effectiveness of the proposed scheme across various I/O workloads and SSD wear states.
Syllabus
Intro
High-level objectives
How bad is it?
SSD's reliability issue
Flash memory errors
Flash memory error modeling: RBER (cycles, time, reads)
Error model: 3x-nm MLC (2011)
Error model: 3D TLC (2018)
Error correction code
Evaluation: data re-read
Why is data re-read bad?
Evaluation: redundancy
Observations
Holistic reliability management: Cold data
The bright side of flash memory
The dark side of flash memory
Taught by
USENIX