TEASMA: A Practical Approach for the Test Assessment of Deep Neural Networks using Mutation Analysis

08/02/2023
by Amin Abbasishahkoo, et al.

Successful deployment of Deep Neural Networks (DNNs), particularly in safety-critical systems, requires validating them with an adequate test set to ensure a sufficient degree of confidence in test outcomes. Mutation analysis, one of the main techniques for measuring test adequacy in traditional software, has been adapted to DNNs in recent years. This technique is based on generating mutants that aim to be representative of actual faults and can therefore be used for test adequacy assessment. In this paper, we investigate for the first time whether mutation operators that directly modify the trained DNN model (i.e., post-training operators) can be used to reliably assess the test inputs of DNNs. We propose and evaluate TEASMA, an approach based on post-training mutation for assessing the adequacy of a DNN's test set. In practice, TEASMA allows engineers to decide whether they can trust test results and thus validate the DNN before its deployment. Based on a DNN model's training set, TEASMA provides a methodology to build accurate prediction models of a test set's Fault Detection Rate (FDR) from its mutation score, thus enabling its assessment. Our large empirical evaluation, across multiple DNN models, shows that predicted FDR values have a strong linear correlation (R² ≥ 0.94) with actual values. Consequently, this empirical evidence suggests that TEASMA provides a reliable basis for confidently deciding whether to trust test results or to improve the test set.
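For illustration, the following is a minimal Python sketch of the general idea described in the abstract: compute the mutation score of an input set against a pool of post-training mutants, fit a regression model that maps mutation score to FDR using subsets sampled from the training set, and use that model to predict the FDR of a candidate test set. This is not the authors' implementation; the function names (mutation_score, fault_detection_rate, fit_fdr_predictor), the detects_fault predicate, the mutant and fault objects, and the choice of linear regression are illustrative assumptions motivated by the reported linear correlation.

```python
# Hedged sketch of a mutation-score-based FDR predictor (not the TEASMA code).
# Assumptions: `mutants` are post-training variants of `original_model` with a
# .predict(x) method returning a single class label; `faults` and
# `detects_fault(inputs, fault)` are placeholders for the paper's fault analysis.

import numpy as np
from sklearn.linear_model import LinearRegression

def mutation_score(inputs, original_model, mutants):
    """Fraction of mutants killed by `inputs`: a mutant is killed when its
    prediction differs from the original model's on at least one input."""
    killed = 0
    for mutant in mutants:
        if any(mutant.predict(x) != original_model.predict(x) for x in inputs):
            killed += 1
    return killed / len(mutants)

def fault_detection_rate(inputs, detects_fault, faults):
    """Fraction of known faults detected by `inputs`."""
    detected = sum(1 for fault in faults if detects_fault(inputs, fault))
    return detected / len(faults)

def fit_fdr_predictor(training_subsets, original_model, mutants, detects_fault, faults):
    """Fit a simple linear model mapping mutation score to FDR, using
    (mutation score, FDR) pairs computed on subsets of the training set."""
    ms = np.array([[mutation_score(s, original_model, mutants)] for s in training_subsets])
    fdr = np.array([fault_detection_rate(s, detects_fault, faults) for s in training_subsets])
    return LinearRegression().fit(ms, fdr)

# Usage (hypothetical): predict the FDR of a candidate test set from its mutation score.
# predictor = fit_fdr_predictor(training_subsets, model, mutants, detects_fault, faults)
# predicted_fdr = predictor.predict([[mutation_score(test_set, model, mutants)]])[0]
```

Under these assumptions, an engineer would accept a test set when its predicted FDR is high enough and otherwise augment it before relying on its results.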

