FlaPy: Mining Flaky Python Tests at Scale

05/08/2023
by Martin Gruber, et al.

Flaky tests obstruct software development, and studying and proposing mitigations against them has therefore become an important focus of software engineering research. To conduct sound investigations on test flakiness, it is crucial to have large, diverse, and unbiased datasets of flaky tests. A common method to build such datasets is to rerun the test suites of selected projects multiple times and check for tests that produce different outcomes. While using this technique on a single project is mostly straightforward, applying it to a large and diverse set of projects raises several implementation challenges, such as (1) isolating the test executions, (2) supporting multiple build mechanisms, (3) achieving feasible run times on large datasets, and (4) analyzing and presenting the test outcomes. To address these challenges, we introduce FlaPy, a framework for researchers to mine flaky tests in a given or automatically sampled set of Python projects by rerunning their test suites. FlaPy isolates the test executions using containerization and fresh execution environments to simulate real-world CI conditions and to achieve accurate results. By supporting multiple dependency installation strategies, it promotes diversity among the studied projects. FlaPy supports parallelizing the test executions using SLURM, making it feasible to scan thousands of projects for test flakiness. Finally, FlaPy analyzes the test outcomes to determine which tests are flaky and depicts the results in a concise table. A demo video of FlaPy is available at https://youtu.be/ejy-be-FvDY
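
The core idea of rerun-based mining (run a test suite several times and flag tests whose outcomes differ) can be illustrated with a minimal sketch. This is not FlaPy's implementation or API: the function name `classify_tests`, the input shape (one dict per run, mapping a test identifier to an outcome string), and the label "flaky" are assumptions chosen for illustration only.

```python
from collections import defaultdict


def classify_tests(runs):
    """Label each test by comparing its outcomes across repeated runs.

    `runs` is a hypothetical input format: a list with one dict per
    test-suite execution, mapping a test id to an outcome string such
    as "passed" or "failed".
    """
    outcomes = defaultdict(set)
    for run in runs:
        for test_id, outcome in run.items():
            outcomes[test_id].add(outcome)

    verdicts = {}
    for test_id, seen in outcomes.items():
        if len(seen) > 1:
            # Different outcomes for the same code: the test is flaky.
            verdicts[test_id] = "flaky"
        else:
            # Only one distinct outcome was observed across all runs.
            verdicts[test_id] = next(iter(seen))
    return verdicts


# Three reruns of a hypothetical two-test suite.
example_runs = [
    {"test_a": "passed", "test_b": "passed"},
    {"test_a": "passed", "test_b": "failed"},
    {"test_a": "passed", "test_b": "passed"},
]
verdicts = classify_tests(example_runs)
```

In this example, `test_a` is labelled "passed" and `test_b` is labelled "flaky". A test that shows a single outcome in every rerun is only known to be stable for the observed runs; more reruns may still reveal flakiness.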

research
07/03/2022

An Empirical Study of Flaky Tests in JavaScript

Flaky tests (tests with non-deterministic outcomes) can be problematic f...
research
03/03/2021

An Empirical Analysis of UI-based Flaky Tests

Flaky tests have gained attention from the research community in recent ...
research
03/01/2022

A Survey on How Test Flakiness Affects Developers and What Support They Need To Address It

Non-deterministically passing and failing test cases, so-called flaky te...
research
01/22/2021

An Empirical Study of Flaky Tests in Python

Tests that cause spurious failures without any code changes, i.e., flaky...
research
11/03/2021

Smells in System User Interactive Tests

Test smells are known as bad development practices that reflect poor des...
research
08/07/2023

Simulating the Software Development Lifecycle: The Waterfall Model

(1) Background: This study employs a simulation-based approach, adapting...
research
03/23/2021

What is the Vocabulary of Flaky Tests? An Extended Replication

Software systems have been continuously evolved and delivered with high ...
