Explore Xiaomi's HDFS data governance practices and their evolution in this 25-minute conference talk by Wang Chengwei, a senior software development engineer at Xiaomi and an HDFS contributor. Gain insights into the background of HDFS data governance at Xiaomi, learn about the company's data governance practices based on hierarchical storage of hot and cold data, and discover its future plans. Delve into the definition and analysis of cold data, understand the Tiering v1 solution, which uses FUSE to mount S3 storage as an Archive disk, and examine the Tiering v2 solution, which modifies HDFS to access public cloud S3. Discover the architecture, principles, advantages, and challenges of each approach. The talk concludes with a summary of implementation results and future plans, including support for writing directly from HDFS to S3 and for appending to cold data stored in S3.