Checking Patch Behaviour against Test Specification

by   Haoye Tian, et al.

Towards predicting patch correctness in APR, we propose a simple, but novel hypothesis on how the link between the patch behaviour and failing test specifications can be drawn: similar failing test cases should require similar patches. We then propose BATS, an unsupervised learning-based system to predict patch correctness by checking patch Behaviour Against failing Test Specification. BATS exploits deep representation learning models for code and patches: for a given failing test case, the yielded embedding is used to compute similarity metrics in the search for historical similar test cases in order to identify the associated applied patches, which are then used as a proxy for assessing generated patch correctness. Experimentally, we first validate our hypothesis by assessing whether ground-truth developer patches cluster together in the same way that their associated failing test cases are clustered. Then, after collecting a large dataset of 1278 plausible patches (written by developers or generated by some 32 APR tools), we use BATS to predict correctness: BATS achieves an AUC between 0.557 to 0.718 and a recall between 0.562 and 0.854 in identifying correct patches. Compared against previous work, we demonstrate that our approach outperforms state-of-the-art performance in patch correctness prediction, without the need for large labeled patch datasets in contrast with prior machine learning-based approaches. While BATS is constrained by the availability of similar test cases, we show that it can still be complementary to existing approaches: used in conjunction with a recent approach implementing supervised learning, BATS improves the overall recall in detecting correct patches. We finally show that BATS can be complementary to the state-of-the-art PATCH-SIM dynamic approach of identifying the correct patches for APR tools.


Test-based Patch Clustering for Automatically-Generated Patches Assessment

Previous studies have shown that Automated Program Repair (APR) techniqu...

Attention: Not Just Another Dataset for Patch-Correctness Checking

Automated Program Repair (APR) techniques have drawn wide attention from...

Patch Quality and Diversity of Invariant-Guided Search-Based Program Repair

Most automatic program repair techniques rely on test cases to specify c...

Evaluating Representation Learning of Code Changes for Predicting Patch Correctness in Program Repair

A large body of the literature of automated program repair develops appr...

Is this Change the Answer to that Problem? Correlating Descriptions of Bug and Code Changes for Evaluating Patch Correctness

In this work, we propose a novel perspective to the problem of patch cor...

Invalidator: Automated Patch Correctness Assessment via Semantic and Syntactic Reasoning

In this paper, we propose a novel technique, namely INVALIDATOR, to auto...

On Reliability of Patch Correctness Assessment

Current state-of-the-art automatic software repair (ASR) techniques rely...

Please sign up or login with your details

Forgot password? Click here to reset