Here we present the implementation of a novel RNN model, RUM (Rotational Unit of Memory), which significantly outperforms the current frontier of models on a variety of sequential tasks. The Rotational Unit combines the memorization advantage of unitary/orthogonal matrices with the dynamic structure of associative memory.
If you find this work useful, please cite [arXiv:1710.09537](https://arxiv.org/pdf/1710.09537.pdf). The model and tasks are described in the same paper.
- Here we implement the operation `Rotation` and the `RUM` model. If you want to use the efficient O(N_b * N_h) implementation of `Rotation`, then import `rotate`. If you want to produce a rotation matrix between two vectors, import `rotation_operator`. A simple script to test the `Rotation` functionality is in `rotation_test.py`.
- If you want to use the `lambda = 0` RUM, then import `RUMCell`. Likewise, import `ARUMCell` for the `lambda = 1` RUM model.
- On top of the model we use regularization techniques via `auxiliary.py` (modified from [1]).
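To illustrate what a rotation between two vectors looks like, here is a minimal NumPy sketch (not the repository's TensorFlow code, whose exact API may differ) of an orthogonal matrix that rotates the direction of `a` onto the direction of `b` inside the plane they span, acting as the identity elsewhere:

```python
import numpy as np

def rotation_between(a, b, eps=1e-8):
    """Orthogonal matrix R with R @ (a/|a|) = b/|b|.

    Illustrative NumPy sketch of a rotation between two vectors;
    the repository's `rotation_operator` is the batched TensorFlow
    analogue. Assumes a and b are nonzero and not (anti)parallel.
    """
    u = a / np.linalg.norm(a)
    # Component of b orthogonal to u (Gram-Schmidt step).
    w = b - (u @ b) * u
    v = w / (np.linalg.norm(w) + eps)
    cos_t = (u @ b) / np.linalg.norm(b)
    sin_t = np.sqrt(max(0.0, 1.0 - cos_t ** 2))
    # Rotate by theta inside span{u, v}; identity on the complement.
    n = a.shape[0]
    R = (np.eye(n)
         + sin_t * (np.outer(v, u) - np.outer(u, v))
         + (cos_t - 1.0) * (np.outer(u, u) + np.outer(v, v)))
    return R
```

By construction `R` is orthogonal, so composing such rotations preserves the norm of the hidden state, which is the memorization advantage the model exploits.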
Please inspect the arguments in the code in order to test your own hyper-parameters. For example, if you want to run the `lambda = 1` RUM with time normalization `eta = 2.0` on the copying task, enter `python copying_task ARUM -norm 2.0`.
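The `-norm` flag controls time normalization: rescaling the hidden state to a fixed norm `eta` after each recurrent update. A minimal NumPy sketch of that rescaling step (an illustration of the idea, not the repository's code):

```python
import numpy as np

def time_normalize(h, eta):
    """Rescale each hidden state in a batch to norm eta.

    Sketch of the time-normalization idea behind the `-norm` flag:
    after a recurrent update, the hidden state is projected back
    onto the sphere of radius eta.
    """
    norms = np.linalg.norm(h, axis=-1, keepdims=True)
    return eta * h / (norms + 1e-8)  # eps guards against zero states
```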
Please inspect the arguments in the code in order to test your own hyper-parameters. For example, if you want to run the `lambda = 1` RUM with time normalization `eta = 0.1` on the recall task, enter `python recall_task ARUM -norm 0.1`.
Please inspect the arguments in the code in order to test your own hyper-parameters. For example, if you want to run the `lambda = 0` RUM without time normalization on subtask number 5, enter `python copying_task RUM 5`.
Please inspect the arguments in the code and the models in `ptb_configs.py` in order to conduct your own grid search. For example, if you want to run the model `FS-RUM-2`, which achieves 1.189 BPC, enter `python ptb_task --model ptb_fs_rum`. The code is adapted from [1], from which we also use a layer-normalized LSTM (`LNLSTM.py`) and the FS-RNN higher-level model (`FSRNN.py`).