Black-Box Testing of Deep Neural Networks through Test Case Diversity

by Zohreh Aghababaeyan, et al.

Deep Neural Networks (DNNs) have been extensively used in many areas including image processing, medical diagnostics, and autonomous driving. However, DNNs can exhibit erroneous behaviours that may lead to critical errors, especially when used in safety-critical systems. Inspired by testing techniques for traditional software systems, researchers have proposed neuron coverage criteria, as an analogy to source code coverage, to guide the testing of DNN models. Despite very active research on DNN coverage, several recent studies have questioned the usefulness of such criteria in guiding DNN testing. Further, from a practical standpoint, these criteria are white-box as they require access to the internals or training data of DNN models, which is in many contexts not feasible or convenient. In this paper, we investigate black-box input diversity metrics as an alternative to white-box coverage criteria. To this end, we first select and adapt three diversity metrics and study, in a controlled manner, their capacity to measure actual diversity in input sets. We then analyse their statistical association with fault detection using two datasets and three DNN models. We further compare diversity with state-of-the-art white-box coverage criteria. Our experiments show that relying on the diversity of image features embedded in test input sets is a more reliable indicator than coverage criteria to effectively guide the testing of DNNs. Indeed, we found that one of our selected black-box diversity metrics far outperforms existing coverage criteria in terms of fault-revealing capability and computational time. Results also confirm the suspicions that state-of-the-art coverage metrics are not adequate to guide the construction of test input sets to detect as many faults as possible with natural inputs.
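To make the idea of a black-box input diversity score concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not name the three metrics it studies, so the standard-deviation-based score below is an assumed stand-in, and the function name `std_diversity` is hypothetical. It also assumes that each test input has already been mapped to a numeric feature vector by some feature extractor, a step not shown here. The point is only that such a score is computed from the inputs alone, without access to the internals of the model under test.

```python
import math
import statistics


def std_diversity(features):
    """Illustrative diversity score for a set of feature vectors.

    Computes the population standard deviation of each feature dimension
    across the input set and returns the Euclidean norm of those values.
    Identical inputs score 0; more widely spread inputs score higher.
    This is an assumed example metric, not one taken from the paper.
    """
    if not features:
        raise ValueError("need at least one feature vector")
    # Transpose so that each tuple holds one dimension across all inputs.
    columns = list(zip(*features))
    per_dim_std = [statistics.pstdev(col) for col in columns]
    return math.sqrt(sum(s * s for s in per_dim_std))


# Hypothetical feature vectors for two candidate test input sets.
clustered = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1]]
spread = [[0.0, 0.0], [5.0, 1.0], [1.0, 6.0]]

print(std_diversity(clustered) < std_diversity(spread))
```

Under this sketch, a tester would prefer the `spread` set over the `clustered` one when selecting inputs, on the assumption that a more diverse set is more likely to reveal distinct faults.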



DeepGD: A Multi-Objective Black-Box Test Selection Approach for Deep Neural Networks

Deep neural networks (DNNs) are widely used in various application domai...

There is Limited Correlation between Coverage and Robustness for Deep Neural Networks

Deep neural networks (DNN) are increasingly applied in safety-critical s...

You Can't See the Forest for Its Trees: Assessing Deep Neural Network Testing via NeuraL Coverage

This paper summarizes eight design requirements for DNN testing criteria...

White-box Testing of NLP models with Mask Neuron Coverage

Recent literature has seen growing interest in using black-box strategie...

Deep Neural Network Test Coverage: How Far Are We?

DNN testing is one of the most effective methods to guarantee the qualit...

Coverage-Guided Fuzzing for Deep Neural Networks

In company with the data explosion over the past decade, deep neural net...

DeepHunter: Hunting Deep Neural Network Defects via Coverage-Guided Fuzzing

In company with the data explosion over the past decade, deep neural net...