RUM: Rotational Unit of Memory

Official repository for the Rotational Unit of Memory (RUM). Implementations and tests for RUM in both TensorFlow and PyTorch.

Official website: http://super-ms.mit.edu/rum.html

If this work is useful to you, please cite: https://www.mitpressjournals.org/doi/full/10.1162/tacl_a_00258

Previous work: https://github.com/jingli9111/RUM-Tensorflow

The paper will be presented at NAACL-HLT 2019: https://naacl2019.org

Usage

RUM is a drop-in replacement for standard RNN cells (GRU, LSTM, etc.). To use it in your research you need two lines of code:

from RUM import RUMCell
rnn_cell = RUMCell(hidden_size)
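
A RUM cell can then be wired into a graph like any other cell. The sketch below assumes RUMCell implements the standard tf.nn.rnn_cell.RNNCell interface (consistent with its drop-in design, but verify against RUM.py):

import tensorflow as tf
from RUM import RUMCell

# toy sequence batch: [batch, time, features]
inputs = tf.placeholder(tf.float32, [None, 50, 128])
rnn_cell = RUMCell(256)
# unroll the cell over time, as with any RNNCell
outputs, final_state = tf.nn.dynamic_rnn(rnn_cell, inputs, dtype=tf.float32)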

For the PyTorch implementation, the equivalent two lines of code are (note that the convention is batch_first=True):

from pytorch.rum_model import RUM
rnn_cell = RUM(input_size, hidden_size)
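
A forward pass might then look like the following sketch; it assumes RUM returns (output, hidden) in the style of torch.nn.RNN, which you should confirm in pytorch/rum_model.py:

import torch
from pytorch.rum_model import RUM

rnn = RUM(input_size=128, hidden_size=256)
x = torch.randn(8, 50, 128)  # batch_first convention: [batch, time, features]
output, hidden = rnn(x)      # return style assumed to mirror torch.nn.RNN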

You can further inspect the arguments of RUMCell, which account for its tunability (an illustrative configuration follows the list):

hidden_size: number of neurons in the hidden state
lambda_: lambda parameter for the associative memory
eta_: eta parameter (target norm) for the time normalization
activation: activation function for the temporary new state
reuse: reuse setting
kernel_initializer: initializer for the kernel
bias_initializer: initializer for the bias
eps: cutoff for the normalizations
use_zoneout: zoneout, True or False
use_layer_norm: layer normalization, True or False
is_training: training-mode flag for zoneout
update_gate: use an update gate, True or False
trainable_rot: use a trainable rotation, True or False
track_angle: keep track of the angle, True or False
visualization: whether to visualize the energy landscape
temp_target: a placeholder to feed in for visualization
temp_target_bias: a placeholder to feed in for visualization
temp_embed: a placeholder to feed in for visualization
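
For example, a configuration exercising a few of these knobs might look like the sketch below (the values are illustrative, not recommendations; defaults and exact keyword behavior live in RUM.py):

from RUM import RUMCell

rnn_cell = RUMCell(
    hidden_size=256,
    lambda_=1,            # lambda parameter for the associative memory
    eta_=1.0,             # target norm for the time normalization
    use_zoneout=True,     # regularize with zoneout
    use_layer_norm=True,  # normalize the hidden state
    is_training=True,     # zoneout behaves differently at train time
)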

Note that the Rotation operation can be used in contexts outside of RNNs. For example, to rotate a vector v by a rotation R(v1,v2) encoded by the vectors v1 and v2 you need two lines of code (the lines are analogous for the PyTorch implementation):

from RUM import rotate 
new_v = rotate(v1, v2, v)

You can also play with the rotation_operator and rotation_components functions in RUM.py and their equivalents in pytorch/rum_model.py for an implementation in PyTorch.
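
As a concrete sketch, the following rotates a batch of vectors; the shapes here are an assumption (batched vectors of equal dimension), so check the exact contract in RUM.py:

import tensorflow as tf
from RUM import rotate

v1 = tf.random_normal([8, 64])  # encodes the source direction of R(v1, v2)
v2 = tf.random_normal([8, 64])  # encodes the target direction of R(v1, v2)
v = tf.random_normal([8, 64])   # batch of vectors to rotate
new_v = rotate(v1, v2, v)       # v rotated by R(v1, v2)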

Last tested with: TensorFlow 1.13.1, PyTorch 1.0.1.post2 and Python 3.7.3.

Tasks

Please inspect the contents of tasks/ for training and inference options. A simple example of training on the copying task is:

python tasks/copying/copying_task.py RUM

Additionally, pytorch/ contains PyTorch implementations of the tasks copying_task.py and recall_task.py. To reproduce results from the paper you can run, for example, python pytorch/copying_task.py and python pytorch/recall_task.py --lambda_ 1.

The code in tasks/LM/ is based on [1] and tasks/summarization/ is based on [2].

License

This project is licensed under the terms of the MIT license.

References

[1] https://github.com/amujika/Fast-Slow-LSTM

[2] https://github.com/abisee/pointer-generator
