Unreasonably long runtime for unsupervised constrained text generation
yui-ishihara opened this issue
Yui Ishihara commented
I am trying to run `decode.sh` in the `commongen_unsupervised/` folder:
bash decode.sh 0 1 output.txt
where 0 selects my GPU, 1 sets FACTOR (which I believe is not used in `decode.sh`), and the generation result is redirected to `output.txt`.
However, the runtime is unreasonably long:
4%|█▉ | 37/971 [5:55:23<123:22:57, 475.56s/it]
Is this expected? Are the hyperparameters used in your experiment the same as those in the `decode.sh` script (this detail is not included in your paper)?
CUDA_VISIBLE_DEVICES=${DEVICES} python decode_gpt2.py --model_name 'gpt2-large' \
--output_file ${OUTPUT_FILE} \
--constraint_file ${DATA_DIR}/constraint/${SPLIT}.constraint.json \
--key_constraint_file ${DATA_DIR}/constraint/${SPLIT}_key.constraint.json \
--batch_size 16 --beam_size 20 --max_tgt_length 32 --min_tgt_length 5 \
--ngram_size 3 --length_penalty 0.2 \
--prune_factor 500000 --sat_tolerance 2 \
--look_ahead_step 5 --alpha 0.175 --look_ahead_width 1
Ximing Lu commented
Unfortunately, this is expected: A* search is really slow, especially in the unsupervised case. When we ran this experiment, we split the input file into 8 portions and ran them on 8 GPUs in parallel.
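For anyone wanting to reproduce that setup, here is a minimal sketch of the split-and-parallelize pattern. It assumes the constraint file is line-delimited (one example per line); the `shard_` prefix and the 971-example count are illustrative (mirroring the progress bar above), and the real `decode_gpt2.py` invocation is replaced by a placeholder `wc -l` job so the sketch is runnable:

```shell
# Build a dummy line-delimited constraint file with 971 "examples"
# (stand-in for ${DATA_DIR}/constraint/${SPLIT}.constraint.json).
seq 971 | sed 's/^/{"id": /; s/$/}/' > all.constraint.json

# Split into 8 roughly equal shards without breaking lines
# (GNU split: -n l/8 = 8 line-based chunks, -d = numeric suffixes).
split -n l/8 -d all.constraint.json shard_

# Launch one background job per shard. In the real run this would be:
#   CUDA_VISIBLE_DEVICES=$gpu python decode_gpt2.py --constraint_file "$f" ... > "$f.out" &
for f in shard_??; do
  ( wc -l < "$f" > "$f.out" ) &   # placeholder job
done
wait

# Recombine per-shard outputs into a single result file.
cat shard_??.out > combined.out
```

With one shard pinned to each of 8 GPUs via `CUDA_VISIBLE_DEVICES`, the ~128-hour single-GPU estimate drops roughly eightfold.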