Classroom Contents
Can Applications Recover from fsync Failures?
- 1 Intro
- 2 How does data reach the disk?
- 3 fsync is really important
- 4 It's hard to get durability correct Applications find it difficult
- 5 fsync can fail Durability gets harder to get right
- 6 Why care about fsync failures? "About a year ago the PostgreSQL community discovered that fsync (on Linux and some BSD systems) may not work the way we always thought it is [sic], with possibly disas…
- 7 Our work Systematically understand fsync failures
- 8 File System Results
- 9 Application Results
- 10 Outline
- 11 File System Methodology: Fault Injection
- 12 File System Methodology: Workloads Common write patterns in applications • Reduced to simplest form
- 13 File System Result #1: Clean Pages Dirty page is marked clean after fsync failure on all three file systems
- 14 File System Result #2: Page Content File systems do not handle fsync errors uniformly • Page content depends on file system
- 15 File System Result #3: In-memory state In-memory data structures are not entirely reverted
- 16 Applications Five widely used applications
- 17 Applications Results: Overview Ext4 Ordered Mode
- 18 Crash/Restart Simple strategies fail • Crash/restart is incorrect: recovers wrong data from page cache • Example: PostgreSQL
- 19 Applications Results #1: False Failures False Failures: Indicate failure but actually succeed
- 20 Late Error Reporting All applications susceptible to data loss on ext4 data mode
- 21 Btrfs winning?
- 22 Applications Results Summary Simple strategies fail • Applications have moved away from retries
- 23 Challenges and Directions
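
The chapter list above notes that dirty pages are marked clean after a failed fsync and that simple retry strategies fail. As a rough illustration of what that implies for application code, here is a minimal Python sketch, assuming a POSIX system. The function name `durable_replace` and its injectable `fsync` parameter are hypothetical and are not taken from the talk. It treats any fsync error as a failed write and never calls fsync a second time on the same data, because a later call may have nothing left to flush and could report success.

```python
import os
import tempfile


def durable_replace(path, data, fsync=os.fsync):
    """Write `data` to `path` via a temp file; return True only if every fsync succeeded.

    `fsync` is injectable only so that a failure can be simulated (hypothetical hook).
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        try:
            view = memoryview(data)
            while view:  # os.write may write fewer bytes than requested
                written = os.write(fd, view)
                view = view[written:]
            fsync(fd)  # raises OSError (for example EIO) on failure
        finally:
            os.close(fd)
        os.replace(tmp_path, path)  # atomically swap in the new file
    except OSError:
        # Do not retry fsync here: after a failure the dirty pages may
        # already be marked clean, so a second call could succeed
        # without the data ever reaching the disk.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        return False

    # Flush the directory entry as well so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        fsync(dir_fd)
    except OSError:
        return False  # rename happened, but its durability is unknown
    finally:
        os.close(dir_fd)
    return True
```

A caller would check the return value and, on `False`, regenerate the data from a trusted source (such as a write-ahead log) rather than rereading it from the page cache, which the outline flags as a source of wrong data after crash/restart.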