question about time embedding

shixishi opened this issue 2 years ago · comments

def get_timestep_embedding(timesteps, embedding_dim: int):
    """
    From Fairseq.
    Build sinusoidal embeddings.
    This matches the implementation in tensor2tensor, but differs slightly
    from the description in Section 3.5 of "Attention Is All You Need".
    """
    assert len(timesteps.shape) == 1  # and timesteps.dtype == tf.int32

    half_dim = embedding_dim // 2
    emb = math.log(10000) / (half_dim - 1)

I don't understand why (half_dim - 1) is used here. According to the Transformer's positional-encoding formula, this should be "emb = math.log(10000) / half_dim". I don't think 1 should be subtracted from half_dim here.
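
To make the difference concrete, here is a minimal sketch comparing the two denominators. It is not code from the repository: the helper name `frequencies` and the value `half_dim = 4` are made up for illustration, and it assumes the scale is later used as `exp(-emb * i)` for `i` in `0..half_dim-1`, which the quoted snippet does not show.

```python
import math


def frequencies(half_dim, denominator):
    """Return exp(-log(10000) * i / denominator) for i in 0..half_dim-1.

    Hypothetical helper for illustration only; not part of the quoted code.
    """
    scale = math.log(10000) / denominator
    return [math.exp(-scale * i) for i in range(half_dim)]


half_dim = 4  # illustrative value, not taken from the issue

# Variant quoted in the issue: divide by (half_dim - 1).
# The last frequency lands on 1/10000 (up to floating-point error).
inclusive = frequencies(half_dim, half_dim - 1)

# Variant from Section 3.5 of the paper: divide by half_dim.
# The last frequency is 10000 ** (-(half_dim - 1) / half_dim), which stays above 1/10000.
paper = frequencies(half_dim, half_dim)

print("divide by half_dim - 1:", inclusive)
print("divide by half_dim:    ", paper)
```

Under that assumption, both variants start at a frequency of 1 and differ only in where the geometric sequence ends, which fits the docstring's note that the code "differs slightly" from the paper. One practical side effect of the `(half_dim - 1)` form is that it divides by zero when `half_dim` is 1.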