DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletons

05/19/2021
by Dawn Drain, et al.

The joint task of bug localization and program repair is an integral part of the software development process. In this work we present DeepDebug, an approach to automated debugging using large, pretrained transformers. We begin by training a bug-creation model on reversed commit data for the purpose of generating synthetic bugs. We apply these synthetic bugs toward two ends. First, we directly train a backtranslation model on all functions from 200K repositories. Next, we focus on 10K repositories for which we can execute tests, and create buggy versions of all functions in those repositories that are covered by passing tests. This provides us with rich debugging information such as stack traces and print statements, which we use to finetune our model, which was pretrained on raw source code. Finally, we strengthen all our models by expanding the context window beyond the buggy function itself and adding a skeleton consisting of that function's parent class, imports, signatures, docstrings, and method bodies, in order of priority. On the QuixBugs benchmark, we increase the total number of fixes found by over 50%, while also decreasing the false positive rate from 35% to 5% and decreasing the timeout from six hours to one minute. On our own benchmark of executable tests, our model fixes 68% of all bugs on its first attempt without using traces, and after adding traces it fixes 75% on its first attempt. We will open-source our framework and validation set for evaluating on executable tests.
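
The abstract describes the code skeleton only in outline, so the following is a minimal sketch of how such a priority-ordered context might be assembled, not the authors' implementation. The function name `build_skeleton`, the character budget (standing in for a token limit), and the exact rendering of each piece are assumptions made for illustration.

```python
import ast


def build_skeleton(source: str, focal_name: str, budget: int) -> str:
    """Collect context around `focal_name`, highest-priority pieces first.

    Hypothetical helper: `budget` is a character limit standing in for a
    model's token limit, which is an assumption of this sketch.
    """
    tree = ast.parse(source)
    parent, imports, signatures, docstrings, bodies = [], [], [], [], []
    funcs = (ast.FunctionDef, ast.AsyncFunctionDef)

    def visit_function(fn):
        if fn.name == focal_name:
            return  # the buggy function itself is supplied separately
        signatures.append(f"def {fn.name}({ast.unparse(fn.args)}): ...")
        doc = ast.get_docstring(fn)
        if doc:
            # Keep only the first docstring line to stay compact.
            docstrings.append(f"# {fn.name}: {doc.splitlines()[0]}")
        bodies.append(ast.get_source_segment(source, fn))

    for node in tree.body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            imports.append(ast.get_source_segment(source, node))
        elif isinstance(node, ast.ClassDef):
            methods = [n for n in node.body if isinstance(n, funcs)]
            if any(m.name == focal_name for m in methods):
                parent.append(f"class {node.name}:")
            for method in methods:
                visit_function(method)
        elif isinstance(node, funcs):
            visit_function(node)

    # Fill the budget tier by tier; stop at the first piece that does not fit.
    pieces, remaining = [], budget
    for tier in (parent, imports, signatures, docstrings, bodies):
        for piece in tier:
            cost = len(piece) + 1  # +1 for the joining newline
            if cost > remaining:
                return "\n".join(pieces)
            pieces.append(piece)
            remaining -= cost
    return "\n".join(pieces)


EXAMPLE = '''import math

class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return math.pi * self.r * 2

    def perimeter(self):
        """Return the circumference."""
        return 2 * math.pi * self.r
'''

print(build_skeleton(EXAMPLE, "area", budget=200))
```

With a generous budget the skeleton includes every tier; with a tight one, lower-priority tiers such as method bodies are dropped first, which is the behavior the priority ordering is meant to give.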

research
04/16/2021

Generating Bug-Fixes Using Pretrained Transformers

Detecting and fixing bugs are two of the most important yet frustrating ...
research
05/26/2021

Self-Supervised Bug Detection and Repair

Machine learning-based program analyses have recently shown the promise ...
research
03/13/2023

InferFix: End-to-End Program Repair with LLMs

Software development life cycle is profoundly influenced by bugs: their ...
research
04/25/2023

TraceFixer: Execution Trace-Driven Program Repair

When debugging unintended program behavior, developers can often identif...
research
04/30/2022

Katana: Dual Slicing-Based Context for Learning Bug Fixes

Contextual information plays a vital role for software developers when u...
research
12/11/2020

WITCHER: Detecting Crash Consistency Bugs in Non-volatile Memory Programs

The advent of non-volatile main memory (NVM) enables the development of ...
research
08/04/2020

Anchor: Locating Android Framework-specific Crashing Faults

Android framework-specific app crashes are hard to debug. Indeed, the ca...
