gurubhandari / APPT

[TSE 2024] APPT: Boosting Automated Patch Correctness Prediction via Fine-tuning Pre-trained Models

Boosting Automated Patch Correctness Prediction via Pre-Trained Language Model

We provide a trained model for testing, along with the scripts and data needed for training.

Dataset

  • The small dataset is in the folder "dataset/Small"

  • The large dataset is in the folder "dataset/large"

We have also uploaded the dataset to Google Drive. You can download it here

Environment

  • python 3.7
  • numpy==1.24.3
  • pandas==2.0.1
  • scikit_learn==1.2.2
  • torch==2.0.0+cu117
  • transformers==4.28.1

Model

  • You can download the trained model through this link for testing, or train and test your own model with the data given above.

Training & Testing

  • First, edit code/configs.py, which holds the parameters needed to train our model.

  • After setting the parameters in configs.py for the corresponding RQ, run train.py or test.py to reproduce the corresponding results.

    • Training
      python train.py
    
    • Prediction
      python test.py
    
  • Note that you must first set self.model_save_path to the storage path of your model.
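
As a rough orientation, the parameters mentioned in this README could be grouped as in the sketch below. This is a hypothetical stand-in, not the actual contents of code/configs.py: the attribute names are the ones this README refers to, while the class name `Configs`, the use of a dataclass, and every default value are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class Configs:
    """Hypothetical stand-in for the settings object in code/configs.py."""

    # Attribute names come from this README; all default values are assumptions.
    data_train_path: str = "dataset/Small"        # dataset used for training (RQ1)
    model_save_path: str = "saved_models/appt"    # where the trained model is stored
    model_path: str = "bert-base-uncased"         # pre-trained model to load (RQ2.3)
    splicingMethod: str = "cat"                   # cat, add, sub, mul, or mix (RQ2.2)
    no_pretrain: bool = False                     # RQ2.1 flag
    freeze_bert: bool = False                     # RQ2.1 flag
    no_lstm: bool = False                         # RQ2.1 flag
    run_rq3: bool = False                         # RQ3 switch


# Example: point the model path at a local directory before training or testing.
cfg = Configs(model_save_path="/tmp/appt/model")
print(asdict(cfg))
```

In the real file these values are attributes on `self`, so the equivalent change is editing the assignment in configs.py before running train.py or test.py.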

Experiments

  • All RQs are integrated in the training script; the experiments differ only in the parameters listed below.
  • We also provide the experimental results from our paper, which can be downloaded using the link. Because the model files are too large, we do not provide the trained models for all experiments, only the model trained on the first data split of each cross-validation.

RQ1

Set self.data_train_path to the path of the corresponding dataset.

RQ2

RQ2.1

  • For APPT_pre-training, set self.no_pretrain to True.
  • For APPT_fine-tuning, set self.freeze_bert to True.
  • For APPT_LSTM, set self.no_lstm to True.
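
The variant-to-flag mapping above can be written down as a small lookup, which may help when scripting several runs. This is a minimal sketch: the helper `flags_for` and the dictionary are hypothetical, and it assumes that the two flags not named for a variant stay at False, which this README does not state explicitly.

```python
# Flag to enable for each RQ2.1 variant, as listed in this README.
ABLATION_FLAGS = {
    "APPT_pre-training": "no_pretrain",
    "APPT_fine-tuning": "freeze_bert",
    "APPT_LSTM": "no_lstm",
}


def flags_for(variant: str) -> dict:
    """Return the three RQ2.1 flags for one variant (hypothetical helper)."""
    if variant not in ABLATION_FLAGS:
        raise ValueError(f"unknown variant: {variant!r}")
    # Assumption: only the flag named for the variant is True, the rest are False.
    flags = {name: False for name in ABLATION_FLAGS.values()}
    flags[ABLATION_FLAGS[variant]] = True
    return flags


print(flags_for("APPT_LSTM"))
```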

RQ2.2

Set self.splicingMethod to cat, add, sub, mul, or mix, depending on the splicing method being evaluated.
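
To make the five option names concrete, the sketch below combines two equal-length vectors (standing in for the representations of the buggy and patched code) in plain Python. It is an illustration under assumptions, not the repository's implementation: the function `splice` is hypothetical, the real code presumably operates on tensors, and the reading of mix as the concatenation of the other four results is an assumption.

```python
from typing import List

Vector = List[float]


def splice(buggy: Vector, patched: Vector, method: str) -> Vector:
    """Combine two equal-length vectors using one of the five method names."""
    if len(buggy) != len(patched):
        raise ValueError("both vectors must have the same length")
    if method == "cat":
        # Concatenation: output is twice as long as each input.
        return buggy + patched
    if method == "add":
        return [b + p for b, p in zip(buggy, patched)]
    if method == "sub":
        return [b - p for b, p in zip(buggy, patched)]
    if method == "mul":
        # Element-wise product.
        return [b * p for b, p in zip(buggy, patched)]
    if method == "mix":
        # Assumption: mix concatenates the results of the four methods above.
        parts = [splice(buggy, patched, m) for m in ("cat", "add", "sub", "mul")]
        return [x for part in parts for x in part]
    raise ValueError(f"unknown splicing method: {method!r}")


print(splice([1.0, 2.0], [3.0, 4.0], "sub"))
```

Under this reading, cat doubles the vector length and mix yields five times the input length, so the layer that consumes the spliced vector needs a matching input size.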

RQ2.3

Set self.model_path to 'bert-base-uncased', 'microsoft/codebert-base', or 'microsoft/graphcodebert-base', depending on the pre-trained model being evaluated.
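
The three names are Hugging Face model identifiers, so a value of self.model_path can be resolved roughly as sketched below. This is an assumption about usage, not the repository's loading code: the helpers `check_model_path` and `load_encoder` are hypothetical, and the actual scripts may use model-specific classes instead of `AutoModel` and `AutoTokenizer`.

```python
# Model identifiers listed for RQ2.3 in this README.
SUPPORTED_MODELS = (
    "bert-base-uncased",
    "microsoft/codebert-base",
    "microsoft/graphcodebert-base",
)


def check_model_path(model_path: str) -> str:
    """Return model_path if it is one of the RQ2.3 options, else raise."""
    if model_path not in SUPPORTED_MODELS:
        raise ValueError(f"model_path must be one of {SUPPORTED_MODELS}")
    return model_path


def load_encoder(model_path: str):
    """Load tokenizer and encoder for one of the RQ2.3 options."""
    # Imported lazily so the name check above works without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    name = check_model_path(model_path)
    tokenizer = AutoTokenizer.from_pretrained(name)  # downloads on first use
    model = AutoModel.from_pretrained(name)
    return tokenizer, model


print(check_model_path("microsoft/codebert-base"))
```

Calling `load_encoder` requires the transformers package from the Environment section and network access (or a local cache) for the first download.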

RQ3

Set self.run_rq3 to True, then follow the same steps as for RQ1.
