GXimingLu / a_star_neurologic

Unreasonably long runtime for unsupervised constrained text generation

yui-ishihara opened this issue · comments

I am trying to run decode.sh in the commongen_unsupervised/ folder:

bash decode.sh 0 1 output.txt

where 0 selects GPU 0, FACTOR=1 (which, as far as I can tell, is not actually used in decode.sh), and the generation output is redirected to output.txt.

However, the runtime is unreasonably long:

 4%|█▉                                                | 37/971 [5:55:23<123:22:57, 475.56s/it]

Is this expected? Also, is the set of hyperparameters used in your experiments the same as the one in the decode.sh script (this detail is not included in the paper)?

CUDA_VISIBLE_DEVICES=${DEVICES} python decode_gpt2.py --model_name 'gpt2-large' \
  --output_file ${OUTPUT_FILE} \
  --constraint_file ${DATA_DIR}/constraint/${SPLIT}.constraint.json \
  --key_constraint_file ${DATA_DIR}/constraint/${SPLIT}_key.constraint.json \
  --batch_size 16 --beam_size 20 --max_tgt_length 32 --min_tgt_length 5 \
  --ngram_size 3 --length_penalty 0.2 \
  --prune_factor 500000 --sat_tolerance 2 \
  --look_ahead_step 5  --alpha 0.175 --look_ahead_width 1

Unfortunately, this is expected: A* search is quite slow, especially in the unsupervised case. When we ran this experiment, we split the input file into 8 portions and decoded them on 8 GPUs in parallel.
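
For reference, a minimal sketch of that parallel setup might look like the following. It assumes the constraint files are line-delimited (one example per line) and that GNU split is available; the shard and output file names are illustrative and not part of the repository.

# Sketch: split the constraint files into N shards, decode each shard on its own
# GPU in the background, then concatenate the per-shard outputs.
# Assumes line-delimited constraint files; adjust the splitting step if the format differs.
N=8
mkdir -p shards
split -n l/${N} -d ${DATA_DIR}/constraint/${SPLIT}.constraint.json shards/${SPLIT}.constraint.
split -n l/${N} -d ${DATA_DIR}/constraint/${SPLIT}_key.constraint.json shards/${SPLIT}_key.constraint.

for i in $(seq 0 $((N-1))); do
  IDX=$(printf "%02d" ${i})
  CUDA_VISIBLE_DEVICES=${i} python decode_gpt2.py --model_name 'gpt2-large' \
    --output_file output.part${IDX}.txt \
    --constraint_file shards/${SPLIT}.constraint.${IDX} \
    --key_constraint_file shards/${SPLIT}_key.constraint.${IDX} \
    --batch_size 16 --beam_size 20 --max_tgt_length 32 --min_tgt_length 5 \
    --ngram_size 3 --length_penalty 0.2 \
    --prune_factor 500000 --sat_tolerance 2 \
    --look_ahead_step 5 --alpha 0.175 --look_ahead_width 1 &
done
wait
cat output.part*.txt > output.txt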