nielstron / quantulum3

Library for unit extraction - fork of quantulum for python3

Preceding Unit Parsing

fschlatt opened this issue

I'm using quantulum to extract time quantities (5 days, 1 month, etc.), but in my particular case the unit often comes before the number, e.g. "something happened on day 5". I've therefore locally added a toggle to the parser that allows inverse unit matching.

When the toggle is set to True, the parser also checks for matches where the unit precedes the quantity. However, when a quantity has a unit both in front of it and behind it, the parser always prefers the default ordering of quantity followed by unit.
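To make the described behaviour concrete, here is a minimal, self-contained sketch of the matching rule. It does not use quantulum3 and is not the code from my branch: the `extract` function, the `allow_preceding_unit` flag, and the tiny `UNITS` list are hypothetical names chosen for illustration only.

```python
import re

# Hypothetical, deliberately tiny unit vocabulary (illustration only).
UNITS = ("day", "week", "month", "year")
UNIT = r"(?P<unit>" + "|".join(UNITS) + r")s?"
NUMBER = r"(?P<value>\d+(?:\.\d+)?)"

# Default ordering: quantity followed by unit, e.g. "5 days".
FORWARD = re.compile(rf"\b{NUMBER}\s+{UNIT}\b", re.IGNORECASE)
# Inverse ordering: unit followed by quantity, e.g. "day 5".
INVERSE = re.compile(rf"\b{UNIT}\s+{NUMBER}\b", re.IGNORECASE)


def extract(text, allow_preceding_unit=False):
    """Return (value, unit) pairs in order of appearance.

    With allow_preceding_unit=True, "unit number" matches are accepted too,
    but a number already claimed by a "number unit" match is never reused.
    """
    found = []    # (start offset, value, unit)
    claimed = []  # character spans of numbers used by forward matches

    for m in FORWARD.finditer(text):
        found.append((m.start(), float(m.group("value")), m.group("unit").lower()))
        claimed.append(m.span("value"))

    if allow_preceding_unit:
        for m in INVERSE.finditer(text):
            if m.span("value") in claimed:
                continue  # default ordering wins for this number
            found.append((m.start(), float(m.group("value")), m.group("unit").lower()))

    found.sort(key=lambda item: item[0])
    return [(value, unit) for _, value, unit in found]
```

With the flag off, `extract("something happened on day 5")` finds nothing. With `allow_preceding_unit=True` it yields a single pair for day 5, while in "day 5 weeks" the number stays attached to "weeks" because the forward match claims it first.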

I would be more than happy to prepare a PR, if this functionality is desired.

Sounds like a cool improvement. However, I don't see it connecting to the main theme of the package.

"On day 5 they went out to scout." does not really imply that a quantity of days is being referred to. It is a reference to a point in time (5 days after a fixed date). This is related to the more general issue #4.

Do you know of any other possible cases where units would be placed in front of the number?

I am happy to leave this open and reference your branch. If more people consider this feature helpful it may be merged.

"However, I don't see it connecting to the main theme of the package."

That was my thinking exactly. I have a specific use case where preceding unit parsing is helpful, but I assume it will hurt precision in the general case. Just thought it might help other people with the same exact problem. I've found a couple of bugs and will need to clean up some stuff, but will make the fork public as soon as I'm finished. I'll let you know ;)

There are multiple edge cases that I can't seem to fix and that I cannot afford to spend more time on. In case anyone runs across this thread, I'll be glad to share the fork, but in its current state I don't feel comfortable making it public. This may change in the future when I find more time to work on it.

Finally did get around to fixing stuff. Here is the fork and branch in case anyone runs into a similar use case: https://github.com/fschlatt/quantulum3/tree/inverse.