Relative distance computation

Question

imj2185 opened this issue 4 years ago · comments

Chonghan_Lee commented 4 years ago

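Hello.

I have a question that came up while reading the code alongside the Music Transformer paper.

Section 3.4 of the paper computes relative distances and takes a dot product with them, but in the code I see

self.E = torch.randn([self.max_seq, int(self.dh)], requires_grad=False)

so E is sampled from a random distribution.

Did you intentionally do this differently from the paper?

Thank you.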

Serkan Sulun · Answer 1 · Mon Jun 14 2021 17:42:08 GMT+0800 (China Standard Time)

Bump. Can someone explain the usage of
self.E = torch.randn([self.max_seq, int(self.dh)], requires_grad=False)
while calculating relative attention? Also, this tensor isn't registered as a parameter or buffer, so it prevents reproducibility when the model is reloaded.
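
To illustrate the registration point, here is a minimal sketch, not the repository's actual code. The class names PlainE and RegisteredE are hypothetical, and making E an nn.Parameter is my assumption about what a learned relative embedding would look like. It only shows that a plain tensor attribute is left out of state_dict(), while a registered parameter is saved and restored.

```python
import torch
import torch.nn as nn


class PlainE(nn.Module):
    """Mirrors the quoted line: E is a plain tensor attribute."""

    def __init__(self, max_seq: int, dh: int):
        super().__init__()
        # Not registered: fixed random values, re-sampled on every construction.
        self.E = torch.randn([max_seq, dh], requires_grad=False)


class RegisteredE(nn.Module):
    """Hypothetical variant: E is a registered, learnable parameter."""

    def __init__(self, max_seq: int, dh: int):
        super().__init__()
        # Assumption: E should be trained, so wrap it in nn.Parameter.
        self.E = nn.Parameter(torch.randn(max_seq, dh))


plain = PlainE(max_seq=8, dh=4)
registered = RegisteredE(max_seq=8, dh=4)

# Only registered parameters and buffers appear in the state dict.
plain_keys = list(plain.state_dict().keys())
registered_keys = list(registered.state_dict().keys())

# Loading the saved state into a fresh module restores E exactly.
reloaded = RegisteredE(max_seq=8, dh=4)
reloaded.load_state_dict(registered.state_dict())
restored = torch.equal(reloaded.E, registered.E)
```

If E is meant to stay fixed rather than be trained, register_buffer("E", ...) would likewise put it in the state dict without making it trainable.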