questions related to the comparison with other common data augmentation in Table 1

Question

Shuwen27 opened this issue 9 months ago

First and foremost, thank you for your outstanding work and the fantastic code repository!

I have a question regarding Table 1. When evaluating the other common data augmentation techniques (cropping, word deletion, and replacement), did you also use different dropout masks for the two sentences in each pair? In other words, do the reported results correspond to a model that combines, for instance, word deletion with different dropout masks? If not, how do you ensure that both sentences in a pair use the same dropout mask when these augmentation techniques are applied?

Thank you!

Tianyu Gao · Answer 1 · Tue Oct 17 2023 01:26:03 GMT+0800 (China Standard Time)

Hi,

In all the other augmentation experiments, dropout is also applied unless otherwise specified, since it is part of standard Transformer training.
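To make the point above concrete, here is a minimal, stdlib-only sketch. It is not the repository's code: `dropout`, `delete_words`, and `toy_encode` are hypothetical helpers, and the bag-of-words "encoder" is only a stand-in for a Transformer. It shows that each training-mode forward pass samples its own dropout mask, so a pair built with word deletion carries both sources of noise. Setting `p=0.0` is used here only to show what a pass without dropout noise looks like; it is an assumption for illustration, not a description of how the experiments were configured.

```python
import random


def dropout(vec, p, rng):
    """Inverted dropout: zero each element with probability p, rescale survivors."""
    keep = 1.0 - p  # assumes 0 <= p < 1
    return [x / keep if rng.random() >= p else 0.0 for x in vec]


def delete_words(tokens, rate, rng):
    """Toy word-deletion augmentation: drop each token with probability `rate`."""
    kept = [t for t in tokens if rng.random() >= rate]
    return kept or tokens[:1]  # never return an empty sentence


def toy_encode(tokens, p, rng, dim=8):
    """Hypothetical encoder: deterministic bag-of-words vector, then dropout."""
    vec = [0.0] * dim
    for t in tokens:
        vec[sum(map(ord, t)) % dim] += 1.0  # deterministic bucket per token
    return dropout(vec, p, rng)


rng = random.Random(0)
sentence = "two dogs are running in the park".split()

# Dropout-only pair: same input, two passes, two independently sampled masks.
view_a = toy_encode(sentence, p=0.1, rng=rng)
view_b = toy_encode(sentence, p=0.1, rng=rng)

# Word-deletion pair: the augmented view still passes through dropout,
# so the pair differs by both the deleted words and the dropout masks.
view_c = toy_encode(delete_words(sentence, rate=0.2, rng=rng), p=0.1, rng=rng)

# With p=0.0 the encoder is deterministic, so only the augmentation adds noise.
clean_a = toy_encode(sentence, p=0.0, rng=rng)
clean_b = toy_encode(sentence, p=0.0, rng=rng)
print(clean_a == clean_b)
```

In this sketch the only way to give both views "the same mask" is to remove the randomness altogether (`p=0.0`); with any `p > 0`, the two passes draw their masks independently.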

Shuwen27 · Answer 2 · Tue Oct 17 2023 01:50:56 GMT+0800 (China Standard Time)

Thank you for your quick response! :)