OpenNMT / OpenNMT

Open Source Neural Machine Translation in Torch (deprecated)

Home Page: https://opennmt.net/

How to speed up corpus preprocessing?

stevenyslins opened this issue

Dear all,

Following the issue, I am using this tutorial to train on WMT15.

When I preprocess the corpus with preprocess_pthreads set to 10, it doesn't seem to speed up as much as I expected.
How can I speed up this step?
Thank you for your help.

th preprocess.lua -train_src ../wmt15-de-en/wmt15-all-de-en.en.tok -train_tgt ../wmt15-de-en/wmt15-all-de-en.de.tok -valid_src ../wmt15-de-en/newstest2013.en.tok -valid_tgt ../wmt15-de-en/newstest2013.de.tok -save_data ../wmt15-de-en/wmt15-all-en-de -preprocess_pthreads 10

P.S. My CPU has 40 threads.
[Screenshot: system_monitor]

Hi all,
You can follow the discussion here on the OpenNMT Forum.
Thank you.