This repository contains the code used for the word-level language modeling and unsupervised parsing experiments in the paper Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks, originally forked from the LSTM and QRNN Language Model Toolkit for PyTorch. If you use this code or our results in your research, we'd appreciate it if you cite our paper as follows:
@article{shen2018ordered,
title={Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks},
author={Shen, Yikang and Tan, Shawn and Sordoni, Alessandro and Courville, Aaron},
journal={arXiv preprint arXiv:1810.09536},
year={2018}
}
Python 3.6, NLTK and PyTorch 0.4 are required for the current codebase.
- Install PyTorch 0.4 and NLTK
- Download PTB data. Note that the two tasks, i.e., language modeling and unsupervised parsing, share the same model structure but require different formats of the PTB data. For language modeling we need the standard 10,000-word Penn Treebank corpus data, and for parsing we need the Penn Treebank parsed data.
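For the language-modeling format, the toolkit builds a shared word-to-id vocabulary over the PTB text files and flattens each file into one long id sequence, appending an end-of-sentence marker per line. The sketch below illustrates that preprocessing step with inline sample lines standing in for the real `data/penn/{train,valid,test}.txt` files; it is a simplified illustration, not the repo's exact `data.py`.

```python
# Minimal sketch of word-level PTB preprocessing: every token, plus one
# <eos> marker per line, is mapped through a shared vocabulary to ids.
# The sample lines below are placeholders for the real PTB text files.

def build_vocab(lines):
    """Assign each distinct token (including <eos>) a contiguous integer id."""
    word2id = {}
    for line in lines:
        for word in line.split() + ["<eos>"]:
            word2id.setdefault(word, len(word2id))
    return word2id

def tokenize(lines, word2id):
    """Flatten the corpus into one long id list, one <eos> per line."""
    return [word2id[w] for line in lines for w in line.split() + ["<eos>"]]

train_lines = ["the cat sat", "the dog sat"]  # placeholder corpus
vocab = build_vocab(train_lines)
ids = tokenize(train_lines, vocab)
print(len(vocab), ids)  # vocabulary size and flattened id sequence
```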
Scripts and commands
- Train Language Modeling
python main.py --batch_size 20 --dropout 0.45 --dropouth 0.3 --dropouti 0.5 --wdrop 0.45 --chunk_size 10 --seed 141 --epoch 1000 --data /path/to/your/data
- Test Unsupervised Parsing
python test_phrase_grammar.py --cuda
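The parsing script reports unlabeled F1, i.e., F1 over the sets of constituent spans in the gold and induced trees, ignoring labels. The following is a hedged sketch of that metric on toy span sets, not the script's actual evaluation code:

```python
# Unlabeled F1 over constituent spans, where each span is a (start, end)
# index pair. Illustration of the metric only, not the repo's implementation.

def unlabeled_f1(gold_spans, pred_spans):
    gold, pred = set(gold_spans), set(pred_spans)
    overlap = len(gold & pred)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy example: gold-tree brackets vs. an induced tree's brackets.
gold = [(0, 4), (0, 2), (2, 4)]
pred = [(0, 4), (1, 3), (2, 4)]
print(unlabeled_f1(gold, pred))  # 2 of 3 spans match on each side
```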
The default setting in main.py achieves a perplexity of approximately 56.17 on the PTB test set and an unlabeled F1 of approximately 47.7 on the WSJ test set.
This project reproduces the experiments of the paper Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks; running the code in this project reproduces the results of Experiment 1 in the paper.
- A host with Docker 24.0.x installed
- The Docker host needs an Nvidia GPU (video memory > 8 GiB)
- The Docker host must have the Nvidia Driver and the Nvidia Runtime for Docker configured
If this is the first time the Docker host runs this project, first build the Docker image used to run the experiment code with the following command (this only needs to be run once):
make docker
Next, start the Docker container with the following command; this container runs the process that executes the Experiment 1 code:
make experiment
The command above starts a container named on-lstm-train for running Experiment 1. You can check whether the container is running with the following command; if the output is non-empty, the container started successfully:
docker ps | grep on-lstm-train
Wait for the container to finish running (exit code 0), then view the Experiment 1 run log with the following command:
docker logs on-lstm-train
The log contains the Validation perplexity and Test perplexity for each training epoch, and reports the best Test perplexity at the end. The log shows that the Validation perplexity decreases as the number of epochs increases. An example log file from my own reproduction run is provided at log-example/model_result.txt.
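To pull the best Test perplexity out of such a log programmatically, a regex over the per-epoch lines suffices. The log excerpt below is a hypothetical example of that kind of output; the real format in log-example/model_result.txt may differ, so the regex would need adjusting to match it:

```python
import re

# Hypothetical per-epoch log lines; the actual format printed by main.py
# may differ, so treat this as a template to adapt.
log = """\
| end of epoch   1 | valid ppl    90.12 | test ppl    85.30
| end of epoch   2 | valid ppl    70.45 | test ppl    66.10
| end of epoch   3 | valid ppl    60.02 | test ppl    57.90
"""

# Collect every reported test perplexity and keep the best (lowest) one.
test_ppls = [float(m) for m in re.findall(r"test ppl\s+([\d.]+)", log)]
best = min(test_ppls)
print(best)  # lowest test perplexity in the excerpt
```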