dnddnjs / pytorch-nlp

PyTorch Implementation for Natural Language Processing

pytorch-nlp

PyTorch Implementation for Natural Language Processing. The repository contains paper implementations for the following tasks:

  • Classification
  • Language Model
  • Named Entity Recognition
  • Machine Translation
  • Question & Answering

Classification

  • charcnn: Character-level Convolutional Networks for Text Classification blog
  • deepcnn: Very Deep Convolutional Networks for Text Classification blog
  • lstmcnn: Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers blog
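
All three models above classify text directly from raw characters. As a rough orientation only (a minimal sketch, not the repository's actual architecture or hyperparameters), a character-level CNN classifier in PyTorch looks something like this:

import torch
import torch.nn as nn

class CharCNNClassifier(nn.Module):
    # Embed characters, apply 1-D convolutions over the character sequence,
    # max-pool over time, and classify with a small MLP. The alphabet size,
    # channel counts and kernel sizes here are illustrative placeholders.
    def __init__(self, num_chars=70, embed_dim=16, num_classes=5):
        super().__init__()
        self.embed = nn.Embedding(num_chars, embed_dim)
        self.conv = nn.Sequential(
            nn.Conv1d(embed_dim, 256, kernel_size=7), nn.ReLU(),
            nn.MaxPool1d(3),
            nn.Conv1d(256, 256, kernel_size=3), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, char_ids):            # char_ids: (batch, seq_len) of character indices
        x = self.embed(char_ids)            # (batch, seq_len, embed_dim)
        x = x.transpose(1, 2)               # (batch, embed_dim, seq_len) for Conv1d
        x = self.conv(x)
        x = x.max(dim=2).values             # global max pooling over time
        return self.classifier(x)           # (batch, num_classes) logits

logits = CharCNNClassifier()(torch.randint(0, 70, (8, 256)))   # dummy batch: 8 texts, 256 chars each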

How to train model

Running the command below downloads the Amazon Review dataset and trains the model you choose. Training each of these models took about one day.

cd classification
python train.py --name 'name of logs' --model 'which model to run' --gpu 'which gpu to use'
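
For example, to train the deepcnn model, an invocation might look like the following (the run name and GPU index are placeholders, not values from the repository):

cd classification
python train.py --name deepcnn_amazon --model deepcnn --gpu 0   # placeholder run name and GPU index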

Training result

  • charcnn
    • number of parameters: 11,339,013
    • batch time: 0.251s (512 batch)
    • accuracy: 60.30 %
  • deepcnn
    • number of parameters: 16,444,005
    • batch time: 0.138s (128 batch)
    • accuracy: 62.85 %
  • lstmcnn
    • number of parameters: 501,381
    • batch time: 0.353s (512 batch)
    • accuracy: 59.61 %


Language Model

How to train model

Running the command below downloads the Penn Treebank dataset and trains the model. Training took about 30 minutes.

cd language_model
python train.py --name 'name of logs' --gpu 'which gpu to use'
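
For example (the run name and GPU index are again placeholders):

cd language_model
python train.py --name char_lm_run --gpu 0   # placeholder run name and GPU index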

How to test model

cd language_model
python test.py --model_path 'path to trained model' --gpu 'which gpu to use'

Training result

  • character-aware neural language model
    • number of parameters: 5,312,485
    • batch time: 0.031s (20 batch)
    • perplexity on test dataset: 89.850 (paper: 92.3)
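
For reference, perplexity is the exponential of the average per-token cross-entropy (natural log, as in PyTorch's CrossEntropyLoss), so it can be read off the test loss directly. A minimal sketch with an illustrative loss value:

import math

avg_test_cross_entropy = 4.498                 # illustrative per-token test loss (nats)
perplexity = math.exp(avg_test_cross_entropy)  # exp(4.498) ≈ 89.8, the scale reported above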

About

PyTorch Implementation for Natural Language Processing

License: MIT License


Languages

Language: Python 100.0%