Discerning Legitimate Failures From False Alerts: A Study of Chromium's Continuous Integration

11/05/2021
by Guillaume Haben, et al.

Flakiness is a major concern in software testing. Flaky tests pass and fail for the same version of a program and mislead developers, who spend time and resources investigating test failures only to discover that they are false alerts. In practice, the de facto approach to address this concern is to rerun failing tests in the hope that they pass and thereby reveal themselves as false alerts. Nonetheless, completely filtering out false alerts may require a disproportionate number of reruns and can thus incur significant costs in both computation and time. As an alternative to reruns, we propose Fair, a novel, lightweight approach that classifies test failures into false alerts and legitimate failures. Fair relies on a classifier and a set of features from the failures and test artefacts. To build and evaluate our machine learning classifier, we use the continuous integration of the Chromium project. In particular, we collect the properties and artefacts of more than 1 million test failures from 2,000 builds. Our results show that Fair can accurately distinguish legitimate failures from false alerts, with an MCC of up to 95%. Moreover, by studying different test categories (GUI, integration, and unit tests), we show that Fair classifies failures accurately even when the number of failures is limited. Finally, we compare the costs of our approach to reruns and show that Fair could save up to 20 minutes of computation time per build.
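
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how the Matthews correlation coefficient (MCC) can be computed for a binary failure classifier. It is not taken from the paper: the function names `confusion_counts` and `mcc`, the label encoding (1 for a legitimate failure, 0 for a false alert), and the sample labels are all assumptions made for illustration only.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count (tp, tn, fp, fn) for binary labels.

    Assumed encoding (hypothetical, not from the paper):
    1 = legitimate failure, 0 = false alert.
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1].

    Returns 0.0 when the denominator is zero, a common convention
    for degenerate confusion matrices.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Made-up labels for illustration; not data from the study.
    actual = [1, 1, 0, 0, 0, 1, 0, 0]
    predicted = [1, 0, 0, 0, 0, 1, 0, 1]
    print(mcc(*confusion_counts(actual, predicted)))
```

MCC uses all four cells of the confusion matrix, which makes it a reasonable choice when one class (here, presumably legitimate failures) is much rarer than the other. A value of 1 indicates perfect agreement, 0 is no better than chance, and -1 indicates total disagreement.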


research
02/21/2023

The Importance of Discerning Flaky from Fault-triggering Test Failures: A Case Study on the Chromium CI

Flaky tests are tests that pass and fail on different executions of the ...
research
02/14/2022

Gamekins: Gamifying Software Testing in Jenkins

Developers have to write thorough tests for their software in order to f...
research
03/10/2019

Does Unit-Tested Code Crash? A Case Study of Eclipse

Context: Software development projects increasingly adopt unit testing a...
research
08/31/2022

Predicting Flaky Tests Categories using Few-Shot Learning

Flaky tests are tests that yield different outcomes when run on the same...
research
02/14/2020

Lightweight Lexical Test Prioritization for Immediate Feedback

The practice of unit testing enables programmers to obtain automated fee...
research
05/11/2018

Statically Verifying Continuous Integration Configurations

Continuous Integration (CI) testing is a popular software development te...
research
08/26/2021

On the use of test smells for prediction of flaky tests

Regression testing is an important phase to deliver software with qualit...
