meganndare / cantonese-nlp

cantonese-mandarin unsupervised neural translation for sw project

How to generate vocab

Hello, when I use generate_vocabulary.sh to generate the vocabulary for your dataset, the number of words I get is larger than the number you report in your paper. How can I get the same vocabulary as the one described in the paper?