Memory Vulnerability: A Case for Delaying Error Reporting

10/15/2018
by Luc Jaulmes, et al.

To face future reliability challenges, it is necessary to quantify the risk of error in every part of a computing system. To this end, the Architectural Vulnerability Factor (AVF) has long been used for chips. However, this metric is used for offline characterisation, which is inappropriate for memory. We survey the literature, formalise one of the metrics used, the Memory Vulnerability Factor (MVF), and extend it to take false errors into account. These are reported errors that would have no impact on the program if they were ignored. We measure the False Error Aware MVF (FEA) and related metrics precisely in a cycle-accurate simulator, and compare them with the effects of injecting faults into a program's data in native parallel runs. Our findings show that MVF and FEA are the only two metrics that are safe to use at runtime, as both consistently give an upper bound on the probability of an incorrect program outcome. FEA gives a tighter bound than MVF and, of all the metrics considered, correlates best with the probability of an incorrect outcome.


