DeepAI AI Chat
Log In Sign Up

Investigating Power Outage Effects on Reliability of Solid-State Drives

04/29/2018
by   Saba Ahmadian, et al.
Sharif Accelerator
0

Solid-State Drives (SSDs) are recently employed in enterprise servers and high-end storage systems in order to enhance performance of storage subsystem. Although employing high speed SSDs in the storage subsystems can significantly improve system performance, it comes with significant reliability threat for write operations upon power failures. In this paper, we present a comprehensive analysis investigating the impact of workload dependent parameters on the reliability of SSDs under power failure for variety of SSDs (from top manufacturers). To this end, we first develop a platform to perform two important features required for study: a) a realistic fault injection into the SSD in the computing systems and b) data loss detection mechanism on the SSD upon power failure. In the proposed physical fault injection platform, SSDs experience a real discharge phase of Power Supply Unit (PSU) that occurs during power failure in data centers which was neglected in previous studies. The impact of workload dependent parameters such as workload Working Set Size (WSS), request size, request type, access pattern, and sequence of accesses on the failure of SSDs is carefully studied in the presence of realistic power failures. Experimental results over thousands number of fault injections show that data loss occurs even after completion of the request (up to 700ms) where the failure rate is influenced by the type, size, access pattern, and sequence of IO accesses while other parameters such as workload WSS has no impact on the failure of SSDs.

READ FULL TEXT

page 3

page 4

12/01/2019

Evaluating Reliability of SSD-Based I/O Caches in Enterprise Storage Systems

In this paper, we present a comprehensive analysis investigating the rel...
07/01/2019

Understanding Fault Scenarios and Impacts through Fault Injection Experiments in Cielo

We present a set of fault injection experiments performed on the ACES (L...
10/17/2022

Fault Injection based Failure Analysis of CentOS, Anolis OS and OpenEuler

The reliability of operating system (OS) has always been a major concern...
12/23/2021

A Modeling Framework for Reliability of Erasure Codes in SSD Arrays

To help reliability of SSD arrays, Redundant Array of Independent Disks ...
12/22/2020

The Life and Death of SSDs and HDDs: Similarities, Differences, and Prediction Models

Data center downtime typically centers around IT equipment failure. Stor...
09/30/2020

Fault Injection Analytics: A Novel Approach to Discover Failure Modes in Cloud-Computing Systems

Cloud computing systems fail in complex and unexpected ways due to unexp...
11/27/2019

A Utilization Model for Optimization of Checkpoint Intervals in Distributed Stream Processing Systems

State-of-the-art distributed stream processing systems such as Apache Fl...