Various questions about the code, and how one can run it

Question

fadi16 opened this issue 3 years ago

Many thanks for making the code of the paper available.

I'm trying to understand how the code works, but I'm finding it very hard to get it to run (and generate something). Could you please clarify the following:

1. What does beam_size mean? I'm asked to provide a beam_size when I run generate.py, but I'm not sure what it should be.
2. What does the file "DubplicateCheckLib.txt" do? It's missing from the repository, so I'm using "DubplicateCheckLib_example.txt" instead, but I'm not sure what effect that has on the model.
3. I trained the model for about an hour with training/testing/validation pkl files that I created from the poems example corpus (poems.txt). I understand that this is not enough training time for a corpus of that size; I was just trying to see whether the generation process works. Then I tried to generate something by running generate.py. I tried beam_size values from 1 to 10 with the input sentence "戴花三朵镇长春", but I kept getting this error: "generation failed! line 2 generation failed!". Is this because the model wasn't trained enough, or am I doing something wrong?

Thanks a lot for your help!

yzq · Answer 1 · Wed Oct 27 2021 11:42:52 GMT+0800 (China Standard Time)

Hello, mate. Any progress?

yangcheng · Answer 2 · Wed Oct 27 2021 17:35:59 GMT+0800 (China Standard Time)

Sorry for the late response. It has been years since I wrote this code, so there's a chance that I've forgotten some details.

1. beam_size is the number of candidates kept in the beam search. You can simply set it to 20 or 30.
2. "DubplicateCheckLib.txt" is used to avoid generating sentences that are already in the training set. It's OK to leave it blank.
3. "generation failed" means that all candidate sentences were filtered out (e.g. because of rhythm issues) or were not good enough. I think training for 1000 epochs should produce some meaningful sentences. Increasing the beam size may also help.
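
To illustrate why a small beam can end in "generation failed", here is a minimal sketch of beam search followed by candidate filtering. It is not the code from this repository: `beam_search`, `toy_step`, `generate_line`, the `seen` set (standing in for a duplicate-check library), and the `is_valid` rule (standing in for a rhythm check) are all hypothetical names made up for this example.

```python
import math


def beam_search(step_fn, start, length, beam_size):
    """Keep the `beam_size` highest-scoring partial sequences at each step."""
    beam = [(0.0, [start])]  # (cumulative log-probability, tokens so far)
    for _ in range(length):
        candidates = []
        for score, seq in beam:
            # step_fn returns {next_token: probability} for a partial sequence
            for token, prob in step_fn(seq).items():
                candidates.append((score + math.log(prob), seq + [token]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    return beam


def toy_step(seq):
    # Hypothetical stand-in for a trained model: a fixed next-token distribution.
    return {"a": 0.5, "b": 0.3, "c": 0.2}


def generate_line(beam_size, seen, is_valid):
    """Return the best surviving candidate, or None if every candidate is rejected."""
    for _score, seq in beam_search(toy_step, "<s>", 2, beam_size):
        line = "".join(seq[1:])  # drop the start token
        if line in seen:         # duplicate check against known sentences
            continue
        if not is_valid(line):   # e.g. a rhythm or rhyme constraint
            continue
        return line
    return None  # corresponds to a "generation failed" outcome


# Toy constraint: the line must end with "b".
ends_with_b = lambda line: line.endswith("b")

print(generate_line(1, set(), ends_with_b))   # beam too narrow, nothing survives
print(generate_line(3, set(), ends_with_b))   # a wider beam finds a valid line
print(generate_line(3, {"ab"}, ends_with_b))  # the valid line is a known duplicate
```

With a beam of 1 the only candidate violates the constraint, so nothing is returned; a wider beam keeps lower-ranked candidates that can pass the filters. An empty `seen` set simply disables the duplicate check, which matches the advice that the file can be left blank.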