nielstron / quantulum3

Library for unit extraction - fork of quantulum for python3

Returned span is nondeterministic in some cases

jelmerdus opened this issue · comments

Describe the bug
In some cases, the returned span changes when the program is started multiple times. Within one run, the results are always the same.

To Reproduce
Run this program 10 times. The result will sometimes be [24,25] and at other times [24,27].

from quantulum3 import parser
matches = parser.parse("CAVO TRECCIA MARRONE MM 3 x")
print(matches[0].span)

Expected behavior
When quantulum is used in a larger piece of software, nondeterministic behavior makes it almost impossible to debug. It is much better to be consistently wrong in some cases than to be nondeterministic.

Additional information:

  • Python Version: Python 3.10, sklearn 1.2.2
  • Classifier activated/ sklearn installed: no/yes
  • OS: Windows 10
  • Version 0.7.9 and 0.9.0

Thanks for reporting this! Nondeterministic behaviour is definitely not desirable. Based on what you reported, it's likely due to hash-based dictionary ordering not being fully deterministic.
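One way to test this hypothesis: Python randomizes string hashes per process unless the PYTHONHASHSEED environment variable is set, so the iteration order of a set of strings can differ between runs while staying stable within one run. The sketch below only illustrates that general mechanism. The unit strings and the helper name `run_with_seed` are made up for illustration, and it is an assumption, not something confirmed in this thread, that quantulum3's span difference comes from this effect.

```python
import os
import subprocess
import sys

# Illustrative snippet: prints the iteration order of a small set of strings.
# The strings are hypothetical and not taken from quantulum3's unit data.
SNIPPET = "print(list({'mm', 'm', 'x', 'metre', 'millimetre'}))"


def run_with_seed(seed):
    """Run SNIPPET in a fresh interpreter with a fixed PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Different seeds may yield different orders; the same seed always
    # yields the same order on a given Python build.
    for seed in range(5):
        print(seed, run_with_seed(seed))
```

If running the reproduction script with a fixed seed (for example `PYTHONHASHSEED=0`) makes the span stable, that would support the hash-ordering explanation. Iterating over `sorted(...)` instead of a raw set is the usual way to make such code order-independent.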