Language model implemented in PyTorch
- PyTorch
- training data
- vocabulary file
See this repo
Format: Tokenized by nltk.tokenize.sent_tokenize
Example:
Tokenized Text .
Second Text .
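A minimal sketch of how raw text could be turned into this layout, assuming one sentence per line with space-separated tokens (which is what the example above suggests). The helper name `to_training_lines` is hypothetical and not part of this repo. The default path relies on NLTK's `sent_tokenize` and `word_tokenize`, which need the Punkt tokenizer data installed; using `word_tokenize` for the word split is an assumption.

```python
def to_training_lines(text, sent_split=None, word_split=None):
    """Return one space-joined, tokenized sentence per list entry.

    Hypothetical helper: sent_split and word_split default to NLTK's
    sent_tokenize and word_tokenize when not supplied.
    """
    if sent_split is None or word_split is None:
        # Imported lazily so the helper also works with custom splitters.
        from nltk.tokenize import sent_tokenize, word_tokenize
        sent_split = sent_split or sent_tokenize
        word_split = word_split or word_tokenize
    # One sentence per line, tokens separated by single spaces.
    return [" ".join(word_split(sentence)) for sentence in sent_split(text)]
```

Writing the returned list to a file with `"\n".join(...)` would give a file shaped like the example.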
Format: {word} {frequency}
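A vocabulary file in this format could be produced from the tokenized training data roughly as follows. This is a sketch, not the repo's own tooling: `build_vocab_lines` is a hypothetical name, and sorting by descending frequency is an assumption, since the ordering `train.py` expects is not stated here.

```python
from collections import Counter


def build_vocab_lines(sentences):
    """Count whitespace-separated tokens; return '{word} {frequency}' lines."""
    counts = Counter(tok for line in sentences for tok in line.split())
    # Assumed ordering: most frequent first, ties broken alphabetically
    # so the output is stable between runs.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{word} {freq}" for word, freq in ordered]
```

For example, `build_vocab_lines(["Tokenized Text .", "Second Text ."])` counts `Text` and `.` twice each and the other two words once each.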
mkdir ckpt runs
python train.py --num_iters 100 --store_summary --data_path 'path/to/data*' --vocab_file 'path/to/vocab'