NTDXYG / ComFormer

Code and data for the paper "ComFormer: Code Comment Generation via Transformer and Fusion Method-based Hybrid Code Representation", accepted at DSA 2021.


training performance issue

mlant opened this issue · comments

commented

Hi,
Thank you very much for your work.
I'm using your model, but when I train from scratch I can't reach results as good as yours.
Here are the best results I got:

  • BLEU: 0.20907722516675964
  • ROUGE-L: 0.19650848830796055
  • METEOR: 0.3384904110878787

Could you tell me what set of parameters you used?
Thanks,

That result looks strange. Generally speaking, METEOR should have the lowest score, while BLEU and ROUGE-L should be relatively high. I double-checked, and your METEOR is already similar to the one in my paper.
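To see why the metrics usually rank this way, note that ROUGE-L scores the longest common subsequence rather than exact n-gram overlap. Below is a minimal illustrative sketch of LCS-based ROUGE-L; it is not the nlg-eval implementation, and the beta weighting is an assumption:

```python
# Illustrative ROUGE-L: F-score over the longest common subsequence (LCS).
def lcs_len(a, b):
    # Classic dynamic-programming LCS length between token lists a and b.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l(ref, hyp, beta=1.2):
    # F-score combining LCS precision and recall; beta is an assumed weight.
    ref_t, hyp_t = ref.split(), hyp.split()
    lcs = lcs_len(ref_t, hyp_t)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(hyp_t), lcs / len(ref_t)
    return (1 + beta ** 2) * p * r / (r + beta ** 2 * p)

score = rouge_l("return the sum of two numbers", "return the sum")
```

Because the LCS tolerates gaps, a hypothesis that preserves word order can score reasonably on ROUGE-L even when its exact n-gram overlap (and hence BLEU) is low.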

commented

It was indeed strange for METEOR.
For example, in another training run I got these results:
BLEU: 0.16899427354084465
ROUGE-L: 0.15780111135381264
METEOR: 0.10913556588489155
I used early stopping = 6 and lr = 5e-4, and training stopped at the 17th step because of the early stopping.
I don't know why the model stops improving so early.

I would like to confirm: is your dataset consistent with the one used in my paper, and was it preprocessed the same way? The parameters I used are the ones in train.py.

And make sure you use nlg-eval for evaluation.

commented

I used the RQ1 DeepCom dataset from their Git repository.
I also used nlg-eval for evaluation.
What kind of preprocessing did you do?

code_seq, sbt = utils.transformer(code)
input_text = ' '.join(code_seq.split()[:256]) + ' ' + ' '.join(sbt.split()[:256])
and "max_length" in train.py is 512
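Put together, the hybrid input construction above can be sketched as follows. This is a toy illustration, assuming utils.transformer returns the code token sequence and the SBT sequence as whitespace-separated strings; the over-length sequences are made up:

```python
# Toy sketch: truncate the code token sequence and the SBT sequence to
# 256 tokens each, then concatenate, so the combined input fits the
# max_length = 512 used in train.py.
MAX_CODE, MAX_SBT = 256, 256

def build_input(code_seq, sbt):
    code_part = ' '.join(code_seq.split()[:MAX_CODE])
    sbt_part = ' '.join(sbt.split()[:MAX_SBT])
    return code_part + ' ' + sbt_part  # separating space between the two parts

# Hypothetical over-length sequences standing in for utils.transformer output.
code_seq = ' '.join(f'tok{i}' for i in range(300))
sbt = ' '.join(f'node{i}' for i in range(300))
text = build_input(code_seq, sbt)
```

With 256 + 256 tokens the concatenated input is exactly 512 tokens long, which is why max_length in train.py is set to 512.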

commented

Do you mean I need to do that before executing train_tokenizer.py and train.py?
In use.py, which generates the comment, I see where this preprocessing is done; however, for the training, I don't see where this kind of preprocessing happens.

Hi, how did you deal with the syntax errors in the original dataset?
I used code_seq, sbt = utils.transformer(code) to preprocess Hybrid-DeepCom RQ1, but got:

Traceback (most recent call last):
  File "preprocess.py", line 19, in <module>
    code_seq, sbt = transformer(row[0])
  File "/root/ComFormer/utils.py", line 180, in transformer
    ast = get_ast(processed_code)
  File "/root/ComFormer/utils.py", line 41, in get_ast
    for path, node in tree:
UnboundLocalError: local variable 'tree' referenced before assignment

By checking the data, I found this is caused by syntax errors in the code in the dataset.
So did you leave those entries empty, or did you fix them in the dataset?

thanks πŸ˜„
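One common way to handle such rows is simply to drop them during preprocessing. A minimal sketch, where transform stands in for utils.transformer (which, per the traceback above, raises when the Java snippet cannot be parsed); the function body and row layout here are assumptions for illustration:

```python
# Sketch: skip dataset rows whose code fails to parse instead of crashing.
def transform(code):
    # Stand-in for utils.transformer(code); assume it raises on bad syntax.
    if 'ERROR' in code:
        raise ValueError('unparseable code')
    return code.split(), '(SBT placeholder)'

def preprocess(rows):
    kept, dropped = [], 0
    for code, comment in rows:
        try:
            code_seq, sbt = transform(code)
        except Exception:
            dropped += 1  # drop the row with the syntax error
            continue
        kept.append((code_seq, sbt, comment))
    return kept, dropped

rows = [('int x = 1 ;', 'init x'), ('int x = ERROR ;', 'broken row')]
kept, dropped = preprocess(rows)
```

Catching the exception per row keeps one malformed sample from aborting the whole preprocessing run; counting the dropped rows makes it easy to check how much data is lost.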

it helps , thanks a lot ! ✨✨✨

commented

Yes, before training the BPE tokenizer, you need to use this method to preprocess the corpus, as I said in my paper.

Thanks, I forgot to do so.
I ran another training with the preprocessed data, and the results are worse than before. I got:
(screenshot of results)
training_progress_scores.csv

I'm still in the first epoch and the loss is increasing a lot. I removed the early stopping because the model was stopping at 4 steps.

Do you have an idea about what's wrong?

I don't know why yet; you can start by fine-tuning my pre-trained model on the target dataset.
I will retrain it from scratch in the next few weeks to see.
It may also have something to do with the dataset; the DeepCom dataset seems to have changed at some point.

Maybe I know what your problem is.
Two of the parameters in train.py were wrong and have been corrected.
In the meantime, I downloaded the latest version of the DeepCom dataset, and I will post the related preprocessing code later.
I am also training from scratch and continuing training on the 'NTDXYG/ComFormer' model in parallel.
Updates to the model will follow.

The training is done. I have uploaded the results as true.csv and predict.csv.
The new version of the model is being uploaded. The results are as follows:
Bleu_1: 0.564457
Bleu_2: 0.521086
Bleu_3: 0.488375
Bleu_4: 0.461608
METEOR: 0.411969
ROUGE_L: 0.595989
CIDEr: 4.095886

Hello.
Could you tell me the BLEU score?

  • nltk.translate.bleu

You can run BLEU with 'predict.csv' and 'test.token.nl' (it can be downloaded from the current DeepCom dataset).
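For reference, here is a self-contained corpus-level BLEU sketch (uniform weights over 1- to 4-grams, no smoothing). nlg-eval and nltk.translate.bleu differ in details such as smoothing, so treat this only as an approximation:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def corpus_bleu(references, hypotheses, max_n=4):
    # references/hypotheses: parallel lists of token lists (one reference each).
    clipped = [0] * max_n
    total = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            hyp_counts = Counter(ngrams(hyp, n))
            ref_counts = Counter(ngrams(ref, n))
            total[n - 1] += sum(hyp_counts.values())
            # clip each n-gram count by its count in the reference
            clipped[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
    if min(clipped) == 0:
        return 0.0  # no smoothing: any empty n-gram match zeroes the score
    log_prec = sum(math.log(c / t) for c, t in zip(clipped, total)) / max_n
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)  # brevity penalty
    return bp * math.exp(log_prec)

refs = [['open', 'the', 'file', 'for', 'reading']]
score = corpus_bleu(refs, refs)  # identical hypothesis
```

To use it on the files above, read test.token.nl and predict.csv line by line, split each line on whitespace, and pass the resulting token lists in.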

Hello :)
I ran the test code as you told me.
I wanted to know the overall BLEU score, not BLEU-1, 2, 3, and 4, so I checked test.token.nl & predict.csv and got a BLEU score of about 20.
Considering the BLEU-1/2/3/4 you posted, I expected the BLEU score to be around 50.
I want to compare my BLEU score with the one you got.
Could you tell me?
Thank you

Please give me the code you used to compute the BLEU score.

Oh, sorry @NTDXYG.
My code looked like this:
(screenshot of code)
There was a problem with the code.
After posting the question, I ran it again and found that the BLEU score is around 56.
Thank you for your quick answer.

Okay, that's fine. If you still have questions or difficulties with my code, please do let me know. That helps me improve the code of ComFormer.

By the way, if you want to reach me more quickly, you can send an email to 744621980@qq.com or contact me directly on WeChat at 744621980.

Thank you πŸ˜„