Realistic Error Injection for System Calls

06/08/2020
by Long Zhang, et al.

In this paper, we present Phoebe, a novel fault injection framework for analyzing reliability with respect to system call invocation errors. First, Phoebe gives developers full observability of system call invocations. Second, Phoebe generates error models that are realistic in the sense that they resemble errors that naturally happen in production. Using the generated error models, Phoebe automatically conducts a series of experiments to systematically assess how reliably applications handle system call invocation errors in production. We evaluate the effectiveness and runtime overhead of Phoebe on two real-world applications in a production environment. The results show that Phoebe successfully generates realistic error models and detects important reliability weaknesses with respect to system call invocation errors. To our knowledge, this concept of "realistic error injection", which grounds fault injection in errors observed in production, has never been studied before.
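
The idea of deriving an error model from naturally occurring errors can be illustrated with a small sketch. This is not Phoebe's implementation: the function names, the (syscall, error code) observation format, and the `amplification` parameter are all hypothetical, chosen only to show how observed error rates might drive injection decisions.

```python
import random
from collections import Counter


def build_error_model(observations):
    """Estimate per-(syscall, error code) error rates from observed invocations.

    `observations` is an iterable of (syscall_name, error_code) pairs, where
    error_code is None for a successful call. This input format is an
    assumption made for illustration.
    """
    totals = Counter()
    errors = Counter()
    for syscall, error_code in observations:
        totals[syscall] += 1
        if error_code is not None:
            errors[(syscall, error_code)] += 1
    # Rate = failures with this error code / all invocations of that syscall.
    return {key: count / totals[key[0]] for key, count in errors.items()}


def should_inject(model, syscall, error_code, amplification=1.0, rng=random):
    """Decide whether to inject `error_code` into one invocation of `syscall`.

    The observed rate is scaled by a hypothetical `amplification` factor and
    capped at 1.0; pairs never observed in production are never injected.
    """
    rate = min(1.0, model.get((syscall, error_code), 0.0) * amplification)
    return rng.random() < rate


# Hypothetical monitoring data: 100 read calls (2 failed), 4 open calls (1 failed).
observed = (
    [("read", None)] * 98
    + [("read", "EAGAIN")] * 2
    + [("open", None)] * 3
    + [("open", "ENOENT")]
)
model = build_error_model(observed)
```

Because only error codes that actually occurred appear in `model`, an injector built this way would never raise an error the application has not already encountered in production, which is the sense of "realistic" used in the abstract.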


