Feng-Jay/GiantRepair

GiantRepair

Artifact for TOSEM Submission: GiantRepair

I. Introduction

Automated Program Repair (APR) has garnered significant attention due to its potential to streamline the bug repair process for human developers. Recently, LLM-based APR methods have shown promise in repairing real-world bugs. However, existing APR methods often utilize patches generated by LLMs without further optimization, resulting in reduced effectiveness due to the lack of program-specific knowledge. Furthermore, the evaluations of these APR methods have typically been conducted under the assumption of perfect fault localization, which may not accurately reflect their real-world effectiveness. To address these limitations, this paper introduces an innovative APR approach called GIANTREPAIR. Our approach leverages the insight that LLM-generated patches, although not necessarily correct, offer valuable guidance for the patch generation process. Based on this insight, GIANTREPAIR first constructs patch skeletons from LLM-generated patches to confine the patch space, and then generates high-quality patches tailored to specific programs through context-aware patch generation by instantiating the skeletons. To evaluate the performance of our approach, we conduct two large-scale experiments. The results demonstrate that GIANTREPAIR not only effectively repairs more bugs (an average of 27.78% on Defects4J v1.2 and 23.40% on Defects4J v2.0) than using LLM-generated patches directly, but also outperforms state-of-the-art APR methods by repairing at least 42 and 7 more bugs under perfect and automated fault localization scenarios, respectively.

Figure: The workflow of GiantRepair.

II. Project Structure

├── GiantRepair: GiantRepair's Java implementation
├── LLM_Inference: Code to apply LLMs to APR task
│   ├── Models
│   ├── run_apr.py
│   ├── script_runapr.sh
│   ├── test_llm.py
│   └── utils
├── README.md
├── doc
├── results: Specific results used in the paper
└── d4j-info: Analysis results of the Defects4J and GrowingBugs datasets
    ├── filelist.json
    ├── growing_bugs_filelist.json
    ├── growing_bugs_single_function.json
    ├── growing_bugs_single_function_expand.json
    ├── linelist.json
    └── single_function_repair.json

III. Environment

GiantRepair

  • OS: Linux (Tested on Ubuntu 20.04.6 LTS)
  • OpenJDK 1.8.0_382 and OpenJDK 11.0.20.1
  • Download and configure Defects4J and ExpressAPR.
  • More runtime configurations can be found in the config-file.

LLM

  • Python==3.9
  • transformers==4.33.3

IV. How to Run

Prepare

  1. Defects4J setting: check out the buggy program, for example:
defects4j checkout -p Chart -v 1b -w ${buggy_program_path}/chart/chart_1_buggy
  2. ExpressAPR setting, shown in Link

  3. Modify GiantRepair's settings in the configfile

Then run:

java -jar GiantRepair repair -d4j {bugid} -d4jhome {buggy_program_path} -modelname {modelName}

The bugid takes the form proj_idnum, all in lowercase (for example, chart_1 for the Chart 1 bug checked out above).
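
As a convenience, the two commands above can be assembled programmatically. The following is a minimal sketch, not part of the artifact: the helper names (`bug_id`, `checkout_command`, `repair_command`), the example path `/data/bugs`, and the model name `my_model` are hypothetical placeholders, and the directory layout simply mirrors the checkout example above. It only builds and prints the command lines; it does not run them.

```python
import shlex
from pathlib import PurePosixPath


def bug_id(project: str, number: int) -> str:
    """Format a bug id as proj_idnum, all lowercase: ("Chart", 1) -> "chart_1"."""
    return f"{project.lower()}_{number}"


def checkout_command(project: str, number: int, buggy_program_path: str) -> list:
    """Build the Defects4J checkout command for the buggy version of a bug."""
    # Layout assumed from the README example: <root>/<proj>/<proj>_<id>_buggy
    workdir = (
        PurePosixPath(buggy_program_path)
        / project.lower()
        / f"{bug_id(project, number)}_buggy"
    )
    return ["defects4j", "checkout", "-p", project,
            "-v", f"{number}b", "-w", str(workdir)]


def repair_command(project: str, number: int,
                   buggy_program_path: str, model_name: str) -> list:
    """Build the GiantRepair invocation shown in the README."""
    return ["java", "-jar", "GiantRepair", "repair",
            "-d4j", bug_id(project, number),
            "-d4jhome", buggy_program_path,
            "-modelname", model_name]


# "/data/bugs" and "my_model" are placeholders, not values from the artifact.
print(shlex.join(checkout_command("Chart", 1, "/data/bugs")))
print(shlex.join(repair_command("Chart", 1, "/data/bugs", "my_model")))
```

Returning argument lists rather than strings keeps the commands safe to pass to `subprocess.run` without shell quoting issues.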

V. Ablation Results

To study how much each component of GiantRepair contributes to its overall performance, we set up the following three variants:

  1. GiantRepair$_{selection}$ randomly selects code elements from the project to fill the code skeletons, rather than being constrained by syntactic rules.
  2. GiantRepair$_{context}$ tests the generated patches in the order of generation, rather than ranking them by similarity.
  3. GiantRepair$_{adaptive}$ randomly selects modifications from the LLM patches, rather than applying coarse-grained modifications.
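
To make the difference between the two validation orders in the second variant concrete, here is a toy sketch. It is an illustration only: the similarity measure (token-level `difflib` ratio against the buggy statement), the function names, and the example statements are all assumptions made for this sketch and are not GiantRepair's actual ranking metric or data.

```python
from difflib import SequenceMatcher


def similarity(buggy: str, candidate: str) -> float:
    """Token-level similarity in [0, 1]; a stand-in, not GiantRepair's metric."""
    return SequenceMatcher(None, buggy.split(), candidate.split()).ratio()


def generation_order(candidates: list) -> list:
    """Ablated behaviour: test patches in the order they were generated."""
    return list(candidates)


def similarity_order(buggy: str, candidates: list) -> list:
    """Ranked behaviour: test the most similar candidates first."""
    return sorted(candidates, key=lambda c: similarity(buggy, c), reverse=True)


# Hypothetical buggy statement and candidate patches.
buggy = "return a + b;"
candidates = [
    "return compute(a, b, c);",
    "return a - b;",
    "return a * b + 1;",
]

print(generation_order(candidates))
print(similarity_order(buggy, candidates))
```

Under this stand-in measure, the candidate that changes a single token moves to the front of the validation queue, whereas the generation order would test it second.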

We conducted the experiment on Defects4J v1.2 single-function bugs. The results are shown in the following table:

| Variant | #Plausible Fixes | #Correct Fixes | %Precision |
|---|---|---|---|
| GiantRepair$_{selection}$ | 123 | 46 | 37.40% |
| GiantRepair$_{context}$ | 129 | 51 | 39.53% |
| GiantRepair$_{adaptive}$ | 125 | 49 | 39.20% |
| GiantRepair$_{ori}$ | 135 | 55 | 40.74% |

This table shows the number of plausible fixes, the number of correct fixes, and the precision for each of the three variants and for the original GiantRepair. Randomly filling the code skeletons yields the fewest plausible fixes (123) and the lowest precision (37.40%). Disabling context similarity or adaptive application also reduces the number of both plausible and correct fixes relative to the original (135 plausible, 55 correct). All three components therefore contribute to the overall effectiveness of GiantRepair, which lets it produce more plausible and correct fixes from LLM-generated patches.
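
The precision column can be reproduced from the other two columns, assuming precision is defined as correct fixes divided by plausible fixes (the reported values are consistent with that definition). A short check, with the dictionary keys chosen here only as labels:

```python
# (plausible fixes, correct fixes) per variant, copied from the ablation table.
ablation = {
    "GiantRepair_selection": (123, 46),
    "GiantRepair_context": (129, 51),
    "GiantRepair_adaptive": (125, 49),
    "GiantRepair_ori": (135, 55),
}


def precision(plausible: int, correct: int) -> float:
    """Percentage of plausible fixes that are also correct."""
    return 100.0 * correct / plausible


for name, (plausible, correct) in ablation.items():
    print(f"{name}: {precision(plausible, correct):.2f}%")
```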

VI. Discussion Results

Experiment with GPT-4-1106-preview

To investigate whether GiantRepair remains effective at repairing unique bugs when compared with the most advanced LLMs, we conducted another experiment with GPT-4. Specifically, we randomly selected ten bugs that GiantRepair repaired correctly but the studied LLMs could not, and then invoked GPT-4 via API requests to generate 20 patches for each bug. The outcomes are shown in the following table:

| Bug ids | Closure-19 | Closure-36 | Closure-113 | Lang-57 | Math-27 | Math-85 | Cli-32 | Codec-4 | Compress-1 | Jsoup-33 |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-4-1106-preview | $\checkmark$ | $\times$ | $\times$ | $\times$ | $\times$ | $\times$ | $\times$ | $\times$ | $\times$ | $\times$ |
| GiantRepair | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ |

Data leakage

In the Data leakage part of the Discussion, we address data leakage concerns by examining the StarCoder training dataset. To further substantiate that conclusion, we ran an additional experiment on the GrowingBugs dataset, where GiantRepair successfully repaired 10 of the 51 bugs. The detailed results are presented in the table below:

| Project | #SF Bugs | GiantRepair |
|---|---|---|
| Canvas_api | 2 | 1 |
| Dosgi_common | 1 | 1 |
| Hono_client | 2 | 0 |
| Tika_app | 1 | 1 |
| HttpClient5 | 2 | 0 |
| JacksonDatatypeJsr310 | 1 | 0 |
| JacksonModuleAfterburner | 1 | 1 |
| Switchyard_admin | 1 | 1 |
| Qpidjms_client | 1 | 0 |
| Tiles_api | 1 | 0 |
| Tiles_core | 2 | 0 |
| Wicket_request | 5 | 0 |
| Wicket_util | 4 | 1 |
| Wicket_spring | 1 | 0 |
| Struts1_core | 2 | 0 |
| Wicket_core | 10 | 2 |
| Cargo_container | 3 | 0 |
| Jcodemodel | 1 | 1 |
| Vectorz | 2 | 0 |
| Restfixture | 2 | 0 |
| Xades4j | 1 | 0 |
| Render_app | 1 | 0 |
| Leshan_core | 4 | 1 |
| Total | 51 | 10 |
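
The totals in the last row can be checked from the per-project rows. The sketch below copies the rows from the table above into a dictionary (the variable names are chosen here for illustration) and also derives the overall repair rate, which the table does not state directly.

```python
# project -> (bugs, bugs repaired by GiantRepair), copied from the table above.
growing_bugs = {
    "Canvas_api": (2, 1), "Dosgi_common": (1, 1), "Hono_client": (2, 0),
    "Tika_app": (1, 1), "HttpClient5": (2, 0), "JacksonDatatypeJsr310": (1, 0),
    "JacksonModuleAfterburner": (1, 1), "Switchyard_admin": (1, 1),
    "Qpidjms_client": (1, 0), "Tiles_api": (1, 0), "Tiles_core": (2, 0),
    "Wicket_request": (5, 0), "Wicket_util": (4, 1), "Wicket_spring": (1, 0),
    "Struts1_core": (2, 0), "Wicket_core": (10, 2), "Cargo_container": (3, 0),
    "Jcodemodel": (1, 1), "Vectorz": (2, 0), "Restfixture": (2, 0),
    "Xades4j": (1, 0), "Render_app": (1, 0), "Leshan_core": (4, 1),
}

total_bugs = sum(bugs for bugs, _ in growing_bugs.values())
total_fixed = sum(fixed for _, fixed in growing_bugs.values())
repair_rate = 100.0 * total_fixed / total_bugs

# Projects in which at least one bug was repaired.
projects_with_fix = [p for p, (_, fixed) in growing_bugs.items() if fixed > 0]

print(f"{total_fixed}/{total_bugs} bugs repaired ({repair_rate:.1f}%)")
print(f"{len(projects_with_fix)} of {len(growing_bugs)} projects have a fix")
```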
