google-deepmind / dnc

A TensorFlow implementation of the Differentiable Neural Computer.

write to memory with multiple write heads

jingweiz opened this issue

Hey,
so when there are multiple write heads, memory is written to with these variables:

write_weights: [batch_size x num_write_heads x memory_size]
erase_vectors: [batch_size x num_write_heads x word_size]
write_vectors: [batch_size x num_write_heads x word_size]
memory: [batch_size x memory_size x word_size]

the erase operation is computed as:

erase_gate =
write_weights {reshaped to: [batch_size x num_write_heads x memory_size x 1]}
x
erase_vectors {reshaped to: [batch_size x num_write_heads x 1 x word_size]}
= shape: [batch_size x num_write_heads x memory_size x word_size]

then the 2nd dimension (the head dimension) is reduced by taking a product over it.
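For concreteness, here is a minimal TF2-style sketch of that erase reduction (the tensor names, shapes, and toy sizes follow the issue; this is an illustration of the idea, not necessarily the repo's exact code):

```python
import tensorflow as tf

batch_size, num_write_heads, memory_size, word_size = 2, 3, 8, 4

# Hypothetical inputs with the shapes described above.
write_weights = tf.random.uniform([batch_size, num_write_heads, memory_size])
erase_vectors = tf.random.uniform([batch_size, num_write_heads, word_size])
memory = tf.random.uniform([batch_size, memory_size, word_size])

# Per-head outer product: [batch_size, num_write_heads, memory_size, word_size]
weighted_resets = (tf.expand_dims(write_weights, 3) *
                   tf.expand_dims(erase_vectors, 2))

# Each head applies a multiplicative gate (1 - w * e), so combining the
# heads means taking the product over the head dimension (axis 1).
reset_gate = tf.reduce_prod(1 - weighted_resets, axis=1)

memory = memory * reset_gate  # erased memory: [batch_size, memory_size, word_size]
```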
For the write operation that follows this erase, the head dimension is instead reduced directly by the matmul:

add_matrix =
write_weights {transposed to: [batch_size x memory_size x num_write_heads]}
x
write_vectors {shape: [batch_size x num_write_heads x word_size]}
= shape: [batch_size x memory_size x word_size]
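Continuing the sketch above, the matching additive write would look roughly like this (again an illustration under the same assumed shapes):

```python
# Hypothetical write vectors with the shape described above.
write_vectors = tf.random.uniform([batch_size, num_write_heads, word_size])

# adjoint_a=True transposes write_weights to [batch_size, memory_size,
# num_write_heads]; the matmul then contracts (sums over) the head
# dimension, so head contributions are combined additively.
add_matrix = tf.matmul(write_weights, write_vectors, adjoint_a=True)

memory = memory + add_matrix  # written memory: [batch_size, memory_size, word_size]
```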

Is this correct? I couldn't find this part in the paper and want to make sure I get it right. Thanks in advance!

This is intended: the reduction is a product for the multiplicative erase and a summation for the additive write. In the paper only one write head was used, but this implementation is more general, to make it easy for people to experiment with more write heads in applications where that might be crucial.
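In equation form, a multi-head version of the paper's memory update would read roughly as follows, with $w^i_t$, $e^i_t$, $v^i_t$ the write weighting, erase vector, and write vector of head $i$, $\circ$ elementwise multiplication, and $E$ a matrix of ones:

$$M_t = M_{t-1} \circ \prod_i \left(E - w^i_t (e^i_t)^\top\right) + \sum_i w^i_t (v^i_t)^\top$$

With a single write head this reduces to the update given in the paper, $M_t = M_{t-1} \circ (E - w_t e_t^\top) + w_t v_t^\top$.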

Thanks a lot! That's really helpful!