google-deepmind / dnc

A TensorFlow implementation of the Differentiable Neural Computer.

write to memory with multiple write heads

jingweiz opened this issue

Hey,
so when there are multiple write heads, memory is written to with these variables:

write_weights: [batch_size x num_write_heads x memory_size]
erase_vectors: [batch_size x num_write_heads x word_size]
write_vectors: [batch_size x num_write_heads x word_size]
memory: [batch_size x memory_size x word_size]

the erase operation is computed as:

erase_gate =
write_weights {reshaped to: [batch_size x num_write_heads x memory_size x 1]}
x
erase_vectors {reshaped to: [batch_size x num_write_heads x 1 x word_size]}
= shape: [batch_size x num_write_heads x memory_size x word_size]

then the 2nd dimension (the head dimension) is reduced by taking a product over it.
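For concreteness, here is a minimal TF2-style sketch of that erase reduction (the tensor names, shapes, and toy sizes follow the issue; this is an illustration of the idea, not necessarily the repo's exact code):

```python
import tensorflow as tf

batch_size, num_write_heads, memory_size, word_size = 2, 3, 8, 4

# Hypothetical inputs with the shapes described above.
write_weights = tf.random.uniform([batch_size, num_write_heads, memory_size])
erase_vectors = tf.random.uniform([batch_size, num_write_heads, word_size])
memory = tf.random.uniform([batch_size, memory_size, word_size])

# Per-head outer product: [batch_size, num_write_heads, memory_size, word_size]
weighted_resets = (tf.expand_dims(write_weights, 3) *
                   tf.expand_dims(erase_vectors, 2))

# Each head applies a multiplicative gate (1 - w * e), so combining the
# heads means taking the product over the head dimension (axis 1).
reset_gate = tf.reduce_prod(1 - weighted_resets, axis=1)

memory = memory * reset_gate  # erased memory: [batch_size, memory_size, word_size]
```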
For the write operation that follows this erase, the head dimension is instead reduced directly by the matmul:

add_matrix =
write_weights {transposed to: [batch_size x memory_size x num_write_heads]}
x
write_vectors {shape: [batch_size x num_write_heads x word_size]}
= shape: [batch_size x memory_size x word_size]
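Continuing the sketch above, the matching additive write would look roughly like this (again an illustration under the same assumed shapes):

```python
# Hypothetical write vectors with the shape described above.
write_vectors = tf.random.uniform([batch_size, num_write_heads, word_size])

# adjoint_a=True transposes write_weights to [batch_size, memory_size,
# num_write_heads]; the matmul then contracts (sums over) the head
# dimension, so head contributions are combined additively.
add_matrix = tf.matmul(write_weights, write_vectors, adjoint_a=True)

memory = memory + add_matrix  # written memory: [batch_size, memory_size, word_size]
```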

Is this correct? I couldn't find this part in the paper and want to make sure I get it right. Thanks in advance!

This is intended: the reduction is a product for the multiplicative erase and a summation for the additive write. In the paper only one write head was used, but this implementation is more general, to make it easy for people to experiment with more write heads in applications where that might be crucial.
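In equation form, a multi-head version of the paper's memory update would read roughly as follows, with $w^i_t$, $e^i_t$, $v^i_t$ the write weighting, erase vector, and write vector of head $i$, $\circ$ elementwise multiplication, and $E$ a matrix of ones:

$$M_t = M_{t-1} \circ \prod_i \left(E - w^i_t (e^i_t)^\top\right) + \sum_i w^i_t (v^i_t)^\top$$

With a single write head this reduces to the update given in the paper, $M_t = M_{t-1} \circ (E - w_t e_t^\top) + w_t v_t^\top$.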

Thanks a lot! That's really helpful!