Diego999 / MTM

Question about multiply word embedding with target distribution

deweihu96 opened this issue · comments

Hi Diego,

In Section 3, you mention that the word embeddings are multiplied by each target distribution, which is a little ambiguous. Do you mean a matrix-matrix product?

For example, if there are L words in the input and the embedding size is D, the input shape is [L, D], and say the masking matrix's shape is [L, T]. Are you going to multiply each row of the input with each column of the masking matrix? Then the output shape would be [D, T]. Thanks in advance :)

-Best,
Dewei

Hi,

It is basically a scalar times a vector. For example, if you have an input of shape [L, D] and a mask of shape [L, T], you do T-1 multiplications of the form [L, D] * [L, 1]. You then have T-1 matrices of shape [L, D] that you use in the separate classifiers.
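Under that reading, each column of the mask scales the embedding rows element-wise rather than entering a matrix product. A minimal NumPy sketch (the sizes L, D, T and the use of random data are hypothetical, not from the paper's code):

```python
import numpy as np

L, D, T = 5, 8, 4            # hypothetical sizes: words, embedding dim, targets
emb = np.random.rand(L, D)   # word embeddings, one row per word
mask = np.random.rand(L, T)  # per-word target distributions

# For each of the T-1 targets, scale embedding row i by the scalar
# mask[i, t]; the [L, 1] column broadcasts across the D dimensions.
masked = [emb * mask[:, t:t + 1] for t in range(T - 1)]

assert len(masked) == T - 1
assert all(m.shape == (L, D) for m in masked)
```

Each of the T-1 results keeps the input shape [L, D], which is what lets them be fed to the separate per-target classifiers.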

Thanks for your reply, Diego, that makes sense!