question about freq mask code
iszihan opened this issue
Looking at the frequency mask computation function, I'm a bit confused by this line. Wouldn't it always constrain ptr to be < 2? How would we be able to mask more frequencies this way?
Line 300 in 2dc3f6d
Pos_enc_length = (L*2+1)*3, so ptr would not always be less than 2.
Maybe you can print the length of the positional encoding feature to check. :)
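To make the length formula above concrete, here is a minimal sketch that just evaluates it for a few values of L. The helper name `pos_enc_length` and the sample values of L are assumptions chosen for illustration and are not taken from the repository. Reading the factor of 3 as the three input coordinates is also an assumption.

```python
def pos_enc_length(L: int) -> int:
    """Length of the positional-encoding feature for L frequency bands.

    Uses the formula quoted in the reply above: (L*2+1)*3.
    Assumption: the factor 3 corresponds to the 3 input coordinates.
    """
    return (L * 2 + 1) * 3


# Sample values of L are arbitrary, picked only to show how the length grows.
for L in (1, 4, 10):
    n = pos_enc_length(L)
    print(f"L={L}: pos_enc_length={n}, pos_enc_length/3={n // 3}")
```

For any L >= 1 the length is at least 9, so a quantity that scales with it is not limited to values below 2. Printing the actual length in your own run, as suggested above, is the reliable way to check.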