question about capsule softmax weights
michael0905 opened this issue
Yiyuan Liu commented
```python
capsule_weight = tf.stop_gradient(tf.zeros([get_shape(item_his_emb)[0], self.num_interest, self.seq_len]))
capsule_softmax_weight = tf.nn.softmax(capsule_weight, axis=1)
capsule_softmax_weight = tf.where(tf.equal(atten_mask, 0), paddings, capsule_softmax_weight)
capsule_softmax_weight = tf.expand_dims(capsule_softmax_weight, 2)
```
Why is the softmax performed on axis 1, which indexes the different user interests? According to the paper, the item embeddings of the user sequence can be viewed as primary capsules, so the weights feeding each capsule in the next layer should sum to 1. I therefore think the softmax should be performed over the user sequence, i.e. the code should be `capsule_softmax_weight = tf.nn.softmax(capsule_weight, axis=2)`.
Yiyuan Liu commented
Sorry, I misunderstood. The weights over the different interest capsules should sum to 1, not the weights over the primary capsules. I mixed it up with the self-attentive method.
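For concreteness, here is a minimal sketch (not from the repo; it assumes TensorFlow 2.x and toy dimensions) showing what each axis choice normalizes over for a `[batch, num_interest, seq_len]` logit tensor: `axis=1` makes each behavior item's weights sum to 1 across the interest capsules, which is the capsule-routing convention used in the code above, while `axis=2` would give the attention-style normalization over the sequence.

```python
# Illustrative only: toy shapes, not the repo's actual routing code.
import tensorflow as tf

batch_size, num_interest, seq_len = 1, 2, 3

# Routing logits b_ij with shape [batch, num_interest, seq_len].
logits = tf.random.normal([batch_size, num_interest, seq_len])

# axis=1: for each behavior item j, the weights over the interest
# capsules sum to 1 (the convention in the quoted code).
w_interest = tf.nn.softmax(logits, axis=1)
print(tf.reduce_sum(w_interest, axis=1))  # all ones, shape [batch, seq_len]

# axis=2: for each interest capsule i, the weights over the sequence
# sum to 1 (the attention-style normalization the question proposed).
w_seq = tf.nn.softmax(logits, axis=2)
print(tf.reduce_sum(w_seq, axis=2))  # all ones, shape [batch, num_interest]
```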