nielstron / quantulum3

Library for unit extraction - fork of quantulum for python3

Computes the average of a pair of quantities instead of detecting two distinct quantities when only the second quantity is followed by a unit name

iflament opened this issue

Describe the bug
When parsing a sentence containing a range of the form "N to M unit", the parser returns the average of the two quantities rather than two distinct quantities. This appears to happen because the first quantity is not detected on its own when it has no unit.

(Is this a bug or intended behavior?)

To Reproduce
Steps to reproduce the behavior:

>>> from quantulum3 import parser
>>> text = 'Relative humidity will drop to 18 to 25 percent this afternoon and early evening over areas south of a Hamilton to Macon to Louisville line.'
>>> quants = parser.parse(text)
>>> quants
[Quantity(21.5, "Unit(name="percentage", entity=Entity("dimensionless"), uri=Percentage)")]

Expected behavior

[
    Quantity(18, "Unit(name="percentage", entity=Entity("dimensionless"), uri=Percentage)"),
    Quantity(25, "Unit(name="percentage", entity=Entity("dimensionless"), uri=Percentage)")
]

The two distinct quantities above should be detected.

The expected behavior is observed when each quantity is followed by its unit:

Relative humidity will drop 18 percent to 25 percent this afternoon and early evening over areas south of a Hamilton to Macon to Louisville line.

Outputs: 
[
    Quantity(18, "Unit(name="percentage", entity=Entity("dimensionless"), uri=Percentage)"),
    Quantity(25, "Unit(name="percentage", entity=Entity("dimensionless"), uri=Percentage)")
]
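
Since the parser returns two quantities when the unit follows each number, one possible workaround is to repeat the unit in the text before parsing. The sketch below is only an illustration: `expand_range_units` and its `units` parameter are hypothetical names, not part of quantulum3, and the regular expression only covers the plain "N to M unit" form from this report.

```python
import re


def expand_range_units(text, units=("percent",)):
    """Rewrite 'N to M unit' as 'N unit to M unit'.

    Hypothetical helper, not part of quantulum3. Only the unit words
    listed in `units` are expanded; everything else is left untouched.
    """
    unit_alt = "|".join(re.escape(u) for u in units)
    number = r"\d+(?:\.\d+)?"
    pattern = re.compile(rf"({number})\s+to\s+({number})\s+({unit_alt})\b")
    # \1 = first number, \2 = second number, \3 = unit word
    return pattern.sub(r"\1 \3 to \2 \3", text)


text = "Relative humidity will drop to 18 to 25 percent this afternoon."
print(expand_range_units(text))
# Relative humidity will drop to 18 percent to 25 percent this afternoon.
```

The expanded string can then be passed to parser.parse in place of the original text. Phrases such as "Hamilton to Macon" are not affected, because the pattern requires a number on both sides of "to".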

Additional information:

  • Python version: 3.7.4
  • Classifier activated / sklearn installed: no

This is intended behaviour. The "uncertainty" value of the returned quantity is set accordingly :)

EDIT: The fact that this does not happen when both quantities carry a unit is somewhat undesired. However, an average cannot in general be computed across different units, so the attempt to compute it is aborted when units are given explicitly on both parts.
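
If the averaged result is kept, the two endpoints can in principle be recovered from the value and its uncertainty. The sketch below assumes that the uncertainty is the half-width of the range (so 3.5 for "18 to 25"); that is an interpretation of the reply above, not something confirmed in this thread, and `range_bounds` is a hypothetical helper rather than a quantulum3 function.

```python
from typing import Optional, Tuple


def range_bounds(value: float, uncertainty: Optional[float]) -> Tuple[float, float]:
    """Return (low, high) for a quantity reported as value +/- uncertainty.

    Assumption: `value` is the midpoint of the range and `uncertainty`
    is its half-width. With no uncertainty, both bounds equal `value`.
    """
    if uncertainty is None:
        return (value, value)
    return (value - uncertainty, value + uncertainty)


# 21.5 is the value from the report; 3.5 is the assumed half-width of 18..25.
low, high = range_bounds(21.5, 3.5)
print(low, high)  # 18.0 25.0
```

With quantulum3, the two arguments would presumably come from the value and uncertainty attributes of the parsed Quantity.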