nielstron / quantulum3

Library for unit extraction - fork of quantulum for python3

Per cent parsing

kkatsano opened this issue

I tried to parse a string that contains a percentage:
from quantulum3 import parser
quants = parser.parse("N2H4 hydrate, (80 per cent)")

The parser identifies the quantity but assigns its unit the entity "unknown":
[Quantity(80, "Unit(name="per cent", entity=Entity("unknown"), uri=None)")]

When I access the result (or quants[0].unit), for example with print(quants[0]), it raises an error:
  File "C:\Users...\AppData\Local\pypoetry\Cache\virtualenvs\more-h3_iVl1U-py3.9\lib\site-packages\quantulum3\_lang\en_US\load.py", line 24, in pluralize
    return PLURALS.plural(singular, count)
  File "pydantic\decorator.py", line 40, in pydantic.decorator.validate_arguments.validate.wrapper_function
  File "pydantic\decorator.py", line 133, in pydantic.decorator.ValidatedFunction.call
  File "pydantic\decorator.py", line 130, in pydantic.decorator.ValidatedFunction.init_model_instance
  File "pydantic\main.py", line 342, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for Plural
text
  ensure this value has at least 1 characters (type=value_error.any_str.min_length; limit_value=1)
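
As a possible stopgap while this spelling is mishandled, the two-word form can be rewritten to "percent" before the text reaches the parser. This is a minimal sketch using only the standard library. `normalize_percent` is a hypothetical helper, not part of quantulum3, and the sketch assumes that the one-word surface "percent" is parsed correctly.

```python
import re

# Matches "per cent" as two whole words separated by whitespace, in any letter case.
# The trailing \b keeps longer words such as "centimetre" from matching.
_PER_CENT = re.compile(r"\bper\s+cent\b", flags=re.IGNORECASE)


def normalize_percent(text: str) -> str:
    """Rewrite the two-word spelling "per cent" as "percent" (hypothetical helper)."""
    return _PER_CENT.sub("percent", text)


print(normalize_percent("N2H4 hydrate, (80 per cent)"))  # N2H4 hydrate, (80 percent)
```

The normalized string would then be passed to parser.parse in place of the original text.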

I looked at units.json and tried manually adding "per cent" to the "symbols" list, but that did not resolve the problem:
"percentage": {
    "surfaces": [
        "percentage",
        "percent",
        "per cent"
    ],
    "entity": "dimensionless",
    "URI": "Percentage",
    "dimensions": [],
    "symbols": [
        "%",
        "pct",
        "pct.",
        "per cent"
    ]
}
Would you be able to help resolve this issue?

Here is the list of my dependencies:
[tool.poetry.dependencies]
python = "^3.9"
pandas = "^1.5.1"
Pint = "^0.20.1"
quantulum3 = "^0.7.11"
stemming = "^1.0.1"
numpy = "^1.23.4"
scipy = "^1.9.3"
sklearn = "^0.0"

Thanks for raising this issue. I assume there is a conflict between two readings: "per" interpreted as a division operator, as in "mm per inch", and "cent" interpreted as the currency unit, which together result in an unknown unit. Investigating in https://github.com/nielstron/quantulum3/tree/fix/percent
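
The suspected conflict can be illustrated with a toy matcher. This is not quantulum3's actual parsing code: the lookup tables and the `word_by_word` and `longest_match` helpers are hypothetical, chosen only to show how resolving "per" on its own could turn "per cent" into a division by the currency unit, while checking multi-word surfaces first would not.

```python
# Hypothetical lookup tables, loosely modelled on the units.json entry quoted above.
SURFACES = {
    "per cent": "percentage",
    "percent": "percentage",
    "cent": "cent",
    "mm": "millimetre",
    "inch": "inch",
}
OPERATORS = {"per": "/"}


def word_by_word(text: str) -> list[str]:
    """Resolve each word on its own, so "per" is always read as division."""
    out = []
    for word in text.split():
        if word in OPERATORS:
            out.append(OPERATORS[word])
        else:
            out.append(SURFACES.get(word, "unknown"))
    return out


def longest_match(text: str) -> list[str]:
    """Try the longest multi-word surface first, then fall back to single words."""
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        for span in range(len(words) - i, 0, -1):
            candidate = " ".join(words[i:i + span])
            if candidate in SURFACES:
                out.append(SURFACES[candidate])
                i += span
                break
        else:
            # No surface starts at this word: treat it as an operator or unknown.
            out.append(OPERATORS.get(words[i], "unknown"))
            i += 1
    return out


print(word_by_word("per cent"))      # ['/', 'cent']
print(longest_match("per cent"))     # ['percentage']
print(longest_match("mm per inch"))  # ['millimetre', '/', 'inch']
```

In this toy model, preferring the longest surface keeps "mm per inch" working as a ratio while still recognizing "per cent" as a single unit.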