GEM-benchmark / NL-Augmenter

NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Tests do not Check that Expected and Generated Outputs have Same Number of Sentences

boyleconnor opened this issue · comments

This issue concerns the following line in the main test script:

for pred_output, output in zip(perturbs, outputs):

The zip() builtin (which is used in the above-mentioned line to pair up expected sentences with generated sentences) clips the longer of its two inputted iterables to the length of the shorter iterable. E.g.:

>>> list(zip([1,2,3], [6,7,8,9,10]))
[(1, 6), (2, 7), (3, 8)]

This means that even if a transformation generates fewer sentences (e.g. 0) than the expected number of sentences, it will still pass and the later expected sentences will not get evaluated. This also makes it impossible to test affirmatively that a transformation does not generate any outputs for a given input.

I would recommend either asserting that the two iterables are of equal length, or replacing zip() with zip_longest().