soarsmu / LLM4SA4SE

Replication package for the TOSEM submission: Revisiting Sentiment Analysis for Software Engineering in the Era of Large Language Models

Table of Contents

  • Installation
  • Data
  • Usage
  • Evaluation
  • Results
  • Additional Scripts

Installation

To install the necessary dependencies, run the following command:

conda env create -f environment.yml

Data

Data path: ./data/

Usage

Running sLLMs

args:

  • -v: variant, i.e., distilbert, BERT, RoBERTa, XLNet, ALBERT
  • -d: dataset name, i.e., app, code (Gerrit dataset), github, so (StackOverflow dataset), and jira.
python run_slm.py -v distilbert -d app
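To queue every variant/dataset combination, the single command above can be wrapped in a loop. This is a dry-run sketch that only prints each command (drop the echo to execute); the lowercase spellings of the -v values are an assumption, since the list above mixes cases:

```shell
# Print one run_slm.py command per variant/dataset pair (5 x 5 = 25 commands).
# Remove the leading "echo" to actually run them sequentially.
for v in distilbert bert roberta xlnet albert; do
  for d in app code github so jira; do
    echo python run_slm.py -v "$v" -d "$d"
  done
done
```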

Running LLMs

args:

  • -d: dataset name, i.e., app, code (Gerrit dataset), github, so (StackOverflow dataset), and jira.
  • -m: model name, i.e., llama2, wizardlm, vicuna
  • -p: prompt template
  • -s: shots, i.e., 0 for zero-shot, 1 for one-shot, 3 for three-shot, and 5 for five-shot.
python run_llm.py -d app -m llama2 -p llama2-1 -s 1
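All zero-shot prompt templates for a model can be run in one pass. This dry-run sketch prints a command per template (drop the echo to execute); the template names llama2-0 through llama2-2 are taken from the results tree shown below under Results:

```shell
# Print one run_llm.py command per zero-shot prompt template.
# Remove the leading "echo" to actually run them.
for p in llama2-0 llama2-1 llama2-2; do
  echo python run_llm.py -d app -m llama2 -p "$p" -s 0
done
```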

Evaluation

args:

  • -d: dataset name, i.e., app, code (Gerrit dataset), github, so (StackOverflow dataset), and jira.
  • -m: model name, i.e., llama2, wizardlm, vicuna
  • -p: prompt template
  • -s: shots, i.e., 0 for zero-shot, 1 for one-shot, 3 for three-shot, and 5 for five-shot.
python eval.py -d app -m vicuna -p vicuna-0 -s 0
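Evaluation can likewise be batched across datasets. A dry-run sketch that prints one eval command per dataset (drop the echo to execute), keeping the model, prompt, and shot settings from the example above:

```shell
# Print one eval.py command per dataset.
# Remove the leading "echo" to actually run them.
for d in app code github so jira; do
  echo python eval.py -d "$d" -m vicuna -p vicuna-0 -s 0
done
```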

Results

The results are saved in ./results/. For example, the directory structure for the LLaMA2 results on the APP dataset is as follows:

πŸ“¦results
 ┣ πŸ“‚app
 ┃ ┣ πŸ“‚llama2
 ┃ ┃ ┣ πŸ“‚few-shot
 ┃ ┃ ┃ β”— πŸ“‚llama2
 ┃ ┃ ┃ ┃ ┣ πŸ“‚1
 ┃ ┃ ┃ ┃ ┣ πŸ“‚3
 ┃ ┃ ┃ ┃ ┣ πŸ“‚5
 ┃ ┃ β”— πŸ“‚zero-shot
 ┃ ┃ ┃ ┣ πŸ“‚llama2-0
 ┃ ┃ ┃ ┣ πŸ“‚llama2-1
 ┃ ┃ ┃ ┣ πŸ“‚llama2-2

Additional Scripts

  • draw_figures.py: draw figures for the paper.
  • preprocess.py: preprocess the dataset.
  • analyze_errors.py: analyze the errors of the models.
  • sample_few_shot.py: sample few-shot examples from the dataset.
