BugDoc: Algorithms to Debug Computational Processes

04/12/2020
by Raoni Lourenco, et al.

Data analysis for scientific experiments and enterprises, large-scale simulations, and machine learning tasks all entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous outputs, the pipeline may fail to execute or may produce incorrect results. Inferring the root causes of such failures is challenging: it usually demands considerable time and human thought, and it remains error-prone. We propose a new approach that uses iteration and provenance to automatically infer the root causes and derive succinct explanations of failures. Through a detailed experimental evaluation, we assess the cost, precision, and recall of our approach compared to the state of the art. Our experimental data and processing software are available for use, reproducibility, and enhancement.

research
02/11/2020

Debugging Machine Learning Pipelines

Machine learning tasks entail the use of complex computational pipelines...
research
05/03/2022

Automatically Debugging AutoML Pipelines using Maro: ML Automated Remediation Oracle (Extended Version)

Machine learning in practice often involves complex pipelines for data c...
research
01/31/2023

DNN Explanation for Safety Analysis: an Empirical Evaluation of Clustering-based Approaches

The adoption of deep neural networks (DNNs) in safety-critical contexts ...
research
11/05/2021

CloudRCA: A Root Cause Analysis Framework for Cloud Computing Platforms

As business of Alibaba expands across the world among various industries...
research
03/23/2016

Debugging Machine Learning Tasks

Unlike traditional programs (such as operating systems or word processor...
research
04/21/2022

Mining Root Cause Knowledge from Cloud Service Incident Investigations for AIOps

Root Cause Analysis (RCA) of any service-disrupting incident is one of t...
research
05/05/2023

Flock: Accurate network fault localization at scale

Inferring the root cause of failures among thousands of components in a ...
