Runtime reliability monitoring for complex fault-tolerance policies

08/25/2022
by Alessandro Fantechi, et al.

Reliability of complex Cyber-Physical Systems is necessary to guarantee the availability and/or safety of the services they provide. Diverse and complex fault-tolerance policies are adopted to enhance reliability. These include a varied mix of redundancy and dynamic reconfiguration to address hardware reliability, as well as specific software reliability techniques such as diversity or software rejuvenation. Such complex policies call for flexible runtime health checks of system executions that go beyond conventional runtime monitoring of pre-programmed health conditions, also in order to minimize maintenance costs. Defining a suitable monitoring model when applying this kind of health checking to complex systems is still a challenge. In this paper we propose a novel approach, Reliability Based Monitoring (RBM), for flexible runtime monitoring of reliability in complex systems. RBM exploits a hierarchical reliability model that is periodically applied to runtime diagnostics data, which makes it possible to dynamically plan maintenance activities aimed at preventing failures. As a proof of concept, we show how to apply RBM to a 2oo3 (two-out-of-three) software system implementing different fault-tolerance policies.
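To make the 2oo3 setting concrete, the following is a minimal sketch of a periodic reliability check, not the RBM model from the paper. It assumes three identical, independent replicas with a constant failure rate (exponential lifetime) and a perfect voter; the function names, the failure rate, and the threshold are all hypothetical choices made for illustration.

```python
import math


def component_reliability(failure_rate: float, hours: float) -> float:
    """Reliability of one replica after `hours`, assuming a constant failure rate."""
    return math.exp(-failure_rate * hours)


def two_out_of_three(r: float) -> float:
    """Reliability of a 2oo3 arrangement of identical, independent replicas.

    The system works if at least two of the three replicas work:
    3 * r^2 * (1 - r) + r^3 = 3r^2 - 2r^3. A perfect voter is assumed.
    """
    return 3 * r**2 - 2 * r**3


def needs_maintenance(failure_rate: float, hours: float, threshold: float) -> bool:
    """Hypothetical periodic check: flag maintenance when the estimated
    system reliability drops below `threshold`."""
    r = component_reliability(failure_rate, hours)
    return two_out_of_three(r) < threshold


if __name__ == "__main__":
    # Illustrative values only: failure rate per hour and elapsed operating hours.
    for hours in (10, 1000):
        print(hours, needs_maintenance(1e-4, hours, threshold=0.99))
```

In a monitoring loop, the failure rate would presumably be re-estimated from runtime diagnostics data at each period rather than fixed in advance; the fixed value here only keeps the sketch self-contained.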

research
10/08/2022

Reliability of fault-tolerant system architectures for automated driving systems

Automated driving functions at high levels of autonomy operate without d...
research
04/19/2022

STPA-driven Multilevel Runtime Monitoring for In-time Hazard Detection

Runtime verification or runtime monitoring equips safety-critical cyber-...
research
11/18/2020

Prognostic and Health Management (PHM) tool for Robot Operating System (ROS)

Nowadays, prognostics-aware systems are increasingly used in many system...
research
05/12/2014

Heterogeneity-aware Fault Tolerance using a Self-Organizing Runtime System

Due to the diversity and implicit redundancy in terms of processing unit...
research
01/12/2018

Efficient Probabilistic Model Checking of Smart Building Maintenance using Fault Maintenance Trees

Cyber-physical systems, like Smart Buildings and power plants, have to m...
research
09/30/2020

Computational framework for real-time diagnostics and prognostics of aircraft actuation systems

Prognostics and Health Management (PHM) are emerging approaches to produ...
research
10/13/2021

Detection Software Content Failures Using Dynamic Execution Information

Modern software systems become too complex to be tested and validated. D...
