This is the repository for the Neural Attention Memory paper experiments. Clone with `--recurse-submodules` to load the SCAN dataset.
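For example, a fresh clone with submodules might look like the following (the repository URL is a placeholder; substitute the actual one):

```
# Clone the repository together with its submodules (e.g., the SCAN dataset).
# <repository-url> is a placeholder for this repository's actual URL.
git clone --recurse-submodules <repository-url>

# If the repository was already cloned without submodules:
git submodule update --init --recursive
```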
- Python 3.8
- CUDA-capable GPU (tested on an RTX 4090 24GB; reduce `--batch_size` if GPU memory is limited)
- PyTorch >= 1.7
- CUDA >= 10 (installed with PyTorch)
- Python libraries listed in `requirements.txt`
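A typical setup step, assuming a standard `pip`-based environment (PyTorch with CUDA support may need to be installed separately, following the instructions at https://pytorch.org/):

```
# Install the Python dependencies listed in requirements.txt.
pip install -r requirements.txt
```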
`AutoEncode.py` is the script that runs the experiments, as shown below. `--log` creates a log file of the experiment.

```
python AutoEncode.py --net namtm --seq_type add --digits 10 --log
```
For 4-DYCK, run `python DYCK.py` first to generate the data points; an end-to-end example is sketched below.
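A possible end-to-end 4-DYCK run, assuming `DYCK.py` needs no arguments and using only the `AutoEncode.py` options documented in the table below:

```
# Generate the 4-DYCK data points first (assumed to take no arguments).
python DYCK.py

# Run the 4-DYCK experiment with logging enabled.
python AutoEncode.py --net namtm --seq_type dyck --log
```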
`AutoEncode.py` supports a number of command-line options. The table below lists the major ones, which can be appended to the command above.
Option | Default | Description |
---|---|---|
`--net` | `namtm` | Model to run: `tf` (Transformer), `ut` (Universal Transformer), `dnc` (Differentiable Neural Computer), `lstm` (LSTM with attention), `stm` (SAM Two-memory Model), `namtm` (NAM-TM), `stack` (Stack-RNN) |
`--seq_type` | `add` | Prediction task: `add` (addition, NSP), `reverse` (reverse, NSP), `reduce` (sequence reduction), `dyck` (4-DYCK) |
`--digits` | `10` | Maximum number of training digits |
`--log` | `false` | Log training/validation results |
`--exp` | `0` | Log file identifier used when `--log` is set |
See `Options.py` or run `python AutoEncode.py --help` for more options.
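As an illustration, a sweep over the available models on the addition task could tag each run's log with a distinct `--exp` identifier (a sketch only; it assumes `--exp` accepts an arbitrary integer):

```
# Hypothetical sweep: run every model on the addition task with logging enabled.
exp=0
for net in tf ut dnc lstm stm namtm stack; do
    python AutoEncode.py --net "$net" --seq_type add --digits 10 --log --exp "$exp"
    exp=$((exp + 1))
done
```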
Some parts of this repository are from the following open-source projects. This repository follows the open-source policies of all of them.

- DNC (`dnc/`): https://github.com/RobertCsordas/dnc
- Universal Transformer (`transformer_generalization/`): https://github.com/RobertCsordas/transformer_generalization
- LSTM seq2seq (`Models.py`): https://github.com/pytorch/fairseq
- Number Sequence Prediction dataset (`NSPDataset.py`, `AutoEncode.py`): https://github.com/hwnam831/numbersequenceprediction
- XLNet (`XLNet.py`): https://github.com/huggingface/transformers