E-APR: Mapping the Effectiveness of Automated Program Repair
Automated Program Repair (APR) is a fast-growing area, with many new techniques being developed to tackle one of the most challenging software engineering problems. APR techniques have shown promising results, giving us hope that one day it will be possible for software to repair itself. Existing techniques, however, are only effective at repairing certain kinds of bugs. For example, prior studies have shown that the effectiveness of APR techniques is correlated with bug complexity, with most techniques producing patches for easy bugs. This is a useful explanation that can help researchers improve APR techniques. In this paper, we extend these explanations to a more granular level, with the aim of assessing the strengths and weaknesses of existing APR techniques. To this end, we introduce e-APR, a new framework for explaining the effectiveness of APR techniques. E-APR takes as input a set of buggy programs, their features, and a set of APR techniques, and generates the footprints of the APR techniques, i.e., the regions of the instance space of buggy programs in which good performance is expected from each technique. We consider features of the whole program, such as the number of methods and the depth of the inheritance tree, as well as more specific features of the buggy part of the program, such as the number of Boolean operators in an if statement. The e-APR framework applies machine learning and dimensionality reduction to the feature space to identify the features that most affect the effectiveness of APR. The footprints are presented both visually and numerically, which enables us to determine the strengths and weaknesses of the APR techniques and how much they differ from each other. Finally, e-APR could be integrated into repair infrastructures and repair bots to choose, given a buggy program, the most suitable APR tool.
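To make the notion of a footprint more concrete, the following is a minimal sketch and not the e-APR implementation. It assumes that each buggy program has already been projected to a point in a 2D instance space, and it approximates a footprint as the set of grid cells in which a technique repaired more than half of the bugs. The grid-cell rule, the function names, the tool names, and all data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def cell_of(point, cell=1.0):
    """Map a 2D point to the integer index of the grid cell containing it."""
    return (int(point[0] // cell), int(point[1] // cell))


def footprint(points, repaired, cell=1.0):
    """Return the grid cells where a technique repaired a strict majority of bugs.

    Illustrative stand-in for a footprint, not the paper's method.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for point, ok in zip(points, repaired):
        key = cell_of(point, cell)
        totals[key] += 1
        if ok:
            wins[key] += 1
    return {key for key in totals if wins[key] / totals[key] > 0.5}


def footprint_area(cells, cell=1.0):
    """Numerical summary of a footprint: total area of its grid cells."""
    return len(cells) * cell * cell


def pick_techniques(point, footprints, cell=1.0):
    """List the techniques whose footprint contains the given buggy program."""
    key = cell_of(point, cell)
    return sorted(name for name, cells in footprints.items() if key in cells)


# Hypothetical 2D coordinates of five buggy programs (assumed already projected).
points = [(0.2, 0.3), (0.8, 0.1), (1.5, 0.5), (1.2, 1.7), (2.4, 2.2)]

# Hypothetical repair outcomes: True means the technique produced a patch.
outcomes = {
    "ToolA": [True, True, False, False, False],
    "ToolB": [False, True, True, True, False],
}

footprints = {name: footprint(points, ok) for name, ok in outcomes.items()}
areas = {name: footprint_area(cells) for name, cells in footprints.items()}

print(areas)
print(pick_techniques((1.4, 0.2), footprints))
```

In this toy setting, a repair bot could call `pick_techniques` on a new buggy program's coordinates to shortlist tools, and an empty result would signal that no technique is expected to perform well in that region.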