question about capsule softmax weights
michael0905 opened this issue
Yiyuan Liu commented
```python
capsule_weight = tf.stop_gradient(tf.zeros([get_shape(item_his_emb)[0], self.num_interest, self.seq_len]))
capsule_softmax_weight = tf.nn.softmax(capsule_weight, axis=1)
capsule_softmax_weight = tf.where(tf.equal(atten_mask, 0), paddings, capsule_softmax_weight)
capsule_softmax_weight = tf.expand_dims(capsule_softmax_weight, 2)
```
Why is the softmax performed on axis 1, which indexes the different user interests? According to the paper, the item embeddings of the user sequence can be viewed as primary capsules, so the weights feeding each capsule in the next layer should sum to 1. I therefore think the softmax should be performed over the user sequence, i.e. the code should be `capsule_softmax_weight = tf.nn.softmax(capsule_weight, axis=2)`.
Yiyuan Liu commented
Sorry, I misunderstood. The weights over the different interest capsules should sum to 1, not the weights over the primary capsules. I mixed it up with the self-attentive method.
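For concreteness, here is a minimal sketch (not from the repo; it assumes TensorFlow 2.x and toy dimensions) showing what each axis choice normalizes over for a `[batch, num_interest, seq_len]` logit tensor: `axis=1` makes each behavior item's weights sum to 1 across the interest capsules, which is the capsule-routing convention used in the code above, while `axis=2` would give the attention-style normalization over the sequence.

```python
# Illustrative only: toy shapes, not the repo's actual routing code.
import tensorflow as tf

batch_size, num_interest, seq_len = 1, 2, 3

# Routing logits b_ij with shape [batch, num_interest, seq_len].
logits = tf.random.normal([batch_size, num_interest, seq_len])

# axis=1: for each behavior item j, the weights over the interest
# capsules sum to 1 (the convention in the quoted code).
w_interest = tf.nn.softmax(logits, axis=1)
print(tf.reduce_sum(w_interest, axis=1))  # all ones, shape [batch, seq_len]

# axis=2: for each interest capsule i, the weights over the sequence
# sum to 1 (the attention-style normalization the question proposed).
w_seq = tf.nn.softmax(logits, axis=2)
print(tf.reduce_sum(w_seq, axis=2))  # all ones, shape [batch, num_interest]
```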