Subtle issue in generating batches

Question

alanramponi opened this issue 3 years ago

I accidentally found that in the current MaChAmp version some data instances are not actually processed during training. This could lead to an underestimation of performance across all tasks. Specifically, I found that num_batches examples are excluded from training, so the smaller the batch size, the larger the number of ignored examples. A bugfix is therefore likely to improve performance in places.

[This issue has been created to alert users of the recommended new version]

alan · Answer 1 · Fri Oct 15 2021 05:05:49 GMT+0800 (China Standard Time)

The issue has been solved in commit 0de3f33. In a nutshell, the first example in each batch was accidentally ignored. We suggest updating MaChAmp to this commit, since we expect improved scores for all tasks :-) Closing!
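
To make the effect concrete, here is a minimal sketch of why dropping the first example of every batch loses exactly num_batches examples, and why smaller batches lose more. This is not MaChAmp's actual batching code: `make_batches` and `count_skipped` are hypothetical helpers, and the dataset size of 1000 is an arbitrary assumption for illustration.

```python
import math


def make_batches(examples, batch_size):
    """Split examples into consecutive batches (the last one may be smaller)."""
    return [examples[i:i + batch_size] for i in range(0, len(examples), batch_size)]


def count_skipped(num_examples, batch_size):
    """Count examples lost when the first item of every batch is dropped.

    Hypothetical simulation of the bug described above, not MaChAmp code.
    """
    batches = make_batches(list(range(num_examples)), batch_size)
    # Simulated bug: every batch silently skips its first element.
    seen = [example for batch in batches for example in batch[1:]]
    return num_examples - len(seen)


if __name__ == "__main__":
    num_examples = 1000  # assumed dataset size, for illustration only
    for batch_size in (8, 32, 128):
        num_batches = math.ceil(num_examples / batch_size)
        skipped = count_skipped(num_examples, batch_size)
        print(f"batch_size={batch_size}: num_batches={num_batches}, skipped={skipped}")
```

Under these assumptions the number of skipped examples always equals the number of batches, so halving the batch size roughly doubles the amount of training data that is never seen.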