THUNLP-AIPoet / StylisticPoetry

Codes for Stylistic Chinese Poetry Generation via Unsupervised Style Disentanglement (EMNLP 2018)

Various questions about the code, and how one can run it

fadi16 opened this issue

Many thanks for making the code of the paper available.

I'm trying to understand how the code works, but I'm finding it very hard to get it to run (and generate something). Could you please clarify the following:

  1. What does beam_size mean? I'm asked to provide a beam_size when I run generate.py, but I'm not sure what it should be.
  2. What does the file "DubplicateCheckLib.txt" do? It's missing from the repository, so I'm using "DubplicateCheckLib_example.txt" instead, but I'm not sure what effect that has on the model.
  3. I trained the model for about an hour with training/testing/validation pkl files that I created from the example poem corpus (poems.txt). I understand that this is not enough training time for that corpus size; I was only trying to check whether the generation process works. I then tried to generate something by running generate.py. I tried beam_size values from 1 to 10 and gave the input sentence "戴花三朵镇长春", but I kept getting this error:
    "generation failed! line 2 generation failed!" Is this because the model wasn't trained enough, or am I doing something wrong?

Thanks a lot for your help!

commented

Hello, mate. Any progress?

Sorry for the late response. It has been years since I wrote this code, so there's a chance I have forgotten some details.

  1. beam_size is the number of candidates kept at each step of beam search. You can simply set it to 20 or 30.
  2. It is used to avoid generating sentences that already appear in the training set. It's OK to leave it blank.
  3. All candidate sentences were either filtered out (e.g. for rhythm issues) or judged not good enough. I think training for about 1000 epochs should produce some meaningful sentences. Increasing the beam size may also help.
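
To illustrate how the three answers fit together, here is a minimal sketch of beam search followed by a duplicate filter. It is not the code from this repository: `toy_step`, `beam_search`, `filter_candidates`, and the three-token vocabulary are all hypothetical names and data made up for the example. It only shows why a small beam can leave nothing after filtering, while a wider beam leaves some candidates.

```python
import math

def toy_step(seq):
    # Hypothetical stand-in for a trained model: returns the same
    # next-token distribution no matter what has been generated so far.
    return [("a", 0.6), ("b", 0.3), ("c", 0.1)]

def beam_search(step_fn, length, beam_size):
    """Keep the beam_size highest-scoring partial sequences at every step."""
    beam = [(0.0, [])]  # (log-probability, tokens so far)
    for _ in range(length):
        candidates = []
        for score, seq in beam:
            for token, prob in step_fn(seq):
                candidates.append((score + math.log(prob), seq + [token]))
        # Sort by score, best first, and prune to the beam width.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    return beam

def filter_candidates(beam, seen_lines):
    """Drop candidates that already appear in the duplicate-check set."""
    kept = []
    for score, seq in beam:
        line = "".join(seq)
        if line not in seen_lines:
            kept.append(line)
    return kept

# Assumption for the example: the most likely line is already in the training set.
seen = {"aa"}

narrow = filter_candidates(beam_search(toy_step, length=2, beam_size=1), seen)
wide = filter_candidates(beam_search(toy_step, length=2, beam_size=3), seen)

print(narrow)  # empty: the only candidate was a duplicate, so generation "fails"
print(wide)    # the remaining candidates survive the filter
```

With `beam_size=1` the single surviving candidate is removed by the filter, which resembles the "generation failed" situation described above. With a wider beam, lower-ranked candidates remain available after filtering. A real system would apply further checks (such as rhythm rules) in the same filtering step.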