rafelafrance / traiter

Extract information from natural history annotations

Standard length followed by unrelated number

tucotuco opened this issue · comments

Given the inputs

dynamicProperties: Size=218-224mm SL
occurrenceRemarks: 1 spec. removed to SU 69628.

The output is lengthinmm:1 is_inferred:1 standard length type "standard length"

The expected output is lengthinmm: None is_inferred:0 standard length type "standard length"

Clearly, concatenating the fields with three spaces in between creates a pattern in which the SL is associated with the number that follows it. Is there a distinctive separator that will avoid this association? I tried each of `.`, `;`, and `|`, to no avail.
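To make the suspected mechanism concrete, here is a minimal sketch. It assumes a key-first pattern (the key "SL" followed by a number); `KEY_THEN_NUMBER` is a hypothetical regex written for this illustration, not traiter's actual pattern.

```python
import re

# Hypothetical pattern, NOT traiter's actual regex: the key "SL",
# optional whitespace, then a number.
KEY_THEN_NUMBER = re.compile(r"\bSL\s*(?P<value>\d+(?:\.\d+)?)")

dynamic_properties = "Size=218-224mm SL"
occurrence_remarks = "1 spec. removed to SU 69628."

# Join the two fields with three spaces, as described in this issue.
joined = "   ".join([dynamic_properties, occurrence_remarks])

# The whitespace between the fields lets "SL" pair with the leading "1"
# of occurrenceRemarks.
match = KEY_THEN_NUMBER.search(joined)
spurious_value = match.group("value") if match else None

# Matching each field separately never puts "SL" next to that "1".
per_field = [
    KEY_THEN_NUMBER.search(field)
    for field in (dynamic_properties, occurrence_remarks)
]
```

With this hypothetical pattern, the joined string yields a match on the unrelated "1", while neither field matches on its own. That suggests parsing fields separately, or refusing to match across the field boundary, rather than searching for a separator character.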

Another similar example:

Input string: "76 cm S.L., 4.7 kg"

The output is lengthinmm: 4.7 is_inferred:1 standard length type "standard length"

The expected output is lengthinmm: 760 is_inferred:0 standard length type "standard length"

This input string (one field) is probably another example of the same phenomenon: "Size=105 mm TL, 87.1 mm PCL"
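One possible way to get the expected result for "76 cm S.L., 4.7 kg" is to require the value and unit to come before the key and then convert to millimeters. The sketch below assumes that approach; `VALUE_UNIT_KEY`, `TO_MM`, and `standard_length_in_mm` are hypothetical names for this illustration and are not part of traiter.

```python
import re

# Hypothetical conversion table for illustration only.
TO_MM = {"mm": 1.0, "cm": 10.0}

# Hypothetical pattern: number, unit, then the key "SL" or "S.L.".
# The lookbehind refuses to start inside a number or a range such as
# "218-224"; the lookahead keeps the key from being a prefix of a word.
VALUE_UNIT_KEY = re.compile(
    r"(?<![\d.\-])(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>mm|cm)\s*"
    r"(?P<key>S\.?L\.?)(?![A-Za-z])",
    re.IGNORECASE,
)


def standard_length_in_mm(text):
    """Return the standard length in mm, or None if nothing matches."""
    match = VALUE_UNIT_KEY.search(text)
    if match is None:
        return None
    return float(match.group("value")) * TO_MM[match.group("unit").lower()]
```

Under these assumptions, `standard_length_in_mm("76 cm S.L., 4.7 kg")` returns `760.0`, and the "4.7" is never considered because it follows the key instead of preceding it. The TL/PCL string contains no standard length key, so the function returns `None` for it.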

commented

Fixed, but we should look out for other cases where this occurs.