scoder / acora

Fast multi-keyword search engine for text strings

Home Page: http://pypi.python.org/pypi/acora

Python 3.11 support?

rawouter opened this issue

Hello, is this project dead or is there a plan to support 3.11?

We've been using acora extensively in our projects and are now migrating some important user scripts to a Python 3.11 environment. We decided to build acora manually, and this worked fine with Cython 0.29.36. However, we recently noticed a significant slowdown when we started building with Cython 3.0.0, and I'm unable to build at all with 3.0.1 or 3.0.2. Parsing one specific file takes about 0.5s with the fast build versus 18s with the slow one.

The slowdown is only observed on Python 3.11 with Cython >= 3.0.0; I cannot reproduce it on Python 3.9.

Here is the test I made, with pytest-benchmark:

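import string

from acora import AcoraBuilder


def acora_find_all():
    # Build a searcher for the single keyword "ab"; the build is part of the timed call.
    builder = AcoraBuilder("ab")
    ac = builder.build()
    # Scan a 2.6-million-character string (the lowercase alphabet repeated 100000 times).
    _ = ac.findall(string.ascii_lowercase * 100000)


def test_acora_findall(benchmark):
    # pytest-benchmark calls acora_find_all repeatedly and reports the timings.
    benchmark(acora_find_all)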

Results:

  • Python 3.9, Cython==3.0.2:
---------------------------------------------- benchmark: 1 tests ---------------------------------------------
Name (time in ms)         Min      Max     Mean  StdDev   Median     IQR  Outliers      OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------
test_acora_findall    24.7092  31.3866  26.0660  1.3501  25.6888  0.7616       3;2  38.3642      23           1
---------------------------------------------------------------------------------------------------------------
  • Python 3.11, Cython==0.29.36:
---------------------------------------------- benchmark: 1 tests ---------------------------------------------
Name (time in ms)         Min      Max     Mean  StdDev   Median     IQR  Outliers      OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------
test_acora_findall    18.3448  22.0383  19.7224  0.7655  19.5438  0.5929       8;5  50.7038      46           1
---------------------------------------------------------------------------------------------------------------
  • Python 3.11, Cython==3.0.0:
----------------------------------------------- benchmark: 1 tests -----------------------------------------------
Name (time in ms)          Min       Max      Mean  StdDev    Median     IQR  Outliers     OPS  Rounds  Iterations
------------------------------------------------------------------------------------------------------------------
test_acora_findall    173.6256  183.2379  178.6345  3.6809  178.3472  5.8426       2;0  5.5980       6           1
------------------------------------------------------------------------------------------------------------------
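For reference, the slowdown factors implied by the mean times in the three tables can be worked out directly. This is only a small arithmetic sketch: the dictionary layout and the helper name `slowdown` are labels chosen here for illustration, and the numbers are the reported means.

```python
# Mean times in milliseconds, copied from the pytest-benchmark tables above.
# Keys are (Python version, Cython version); this layout is just for illustration.
mean_ms = {
    ("3.9", "3.0.2"): 26.0660,
    ("3.11", "0.29.36"): 19.7224,
    ("3.11", "3.0.0"): 178.6345,
}


def slowdown(slow, fast):
    """Return how many times slower the `slow` configuration is than `fast`."""
    return mean_ms[slow] / mean_ms[fast]


vs_old_cython = slowdown(("3.11", "3.0.0"), ("3.11", "0.29.36"))
vs_py39 = slowdown(("3.11", "3.0.0"), ("3.9", "3.0.2"))

print(f"Py3.11: Cython 3.0.0 vs 0.29.36: {vs_old_cython:.1f}x slower")
print(f"Cython 3.x: Py3.11 (3.0.0) vs Py3.9 (3.0.2): {vs_py39:.1f}x slower")
```

On this synthetic benchmark the Python 3.11 / Cython 3.0.0 build comes out roughly 9 times slower than the Cython 0.29.36 build on the same interpreter, and roughly 7 times slower than the Python 3.9 build.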
commented

is this project dead or is there a plan to support 3.11?

I haven't worked on it in years, and there is no working release script since Travis terminated their free public offering. So, dead until manual resurrection. I'd be happy to accept a PR that adds a new GitHub Actions wheel building workflow.

I'll take a look at the Cython-related problems, though. If there really is an issue with Cython 3.0 here, we might have fixed it already for Cython 3.0.3 (there were some heavy performance problems); if not, it's worth investigating before the release.

In the Cython project, we also removed some performance features from 0.29.x in Py3.11 and 3.12 in order to make the code work there. I'd be hugely surprised if the slowdown was a factor of 10. I'd rather have expected a couple of percent, certainly not factors. But it's difficult to predict what a C compiler makes of a given piece of code in different variations.

I'll take a look. Thanks for providing the benchmark comparisons. I'll try to reproduce them on my side.
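One way to reproduce the comparison without pytest-benchmark is a small standalone script run once per interpreter/Cython combination. The following is a minimal sketch, not the method used in this thread: the helper name `best_time_ms` is hypothetical, and it assumes that timing the build step and the `findall` call separately is useful for telling where the time goes. It only uses the acora calls shown in the original test (`AcoraBuilder`, `build`, `findall`).

```python
import string
import sys
import timeit


def best_time_ms(func, repeat=5, number=1):
    """Return the best per-call wall time of `func` in milliseconds."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number * 1000.0


if __name__ == "__main__":
    try:
        from acora import AcoraBuilder
    except ImportError:
        print("acora is not installed; nothing to time")
    else:
        text = string.ascii_lowercase * 100000

        # Time building the searcher on its own.
        build_ms = best_time_ms(lambda: AcoraBuilder("ab").build())

        # Time only the search, reusing one prebuilt searcher.
        ac = AcoraBuilder("ab").build()
        search_ms = best_time_ms(lambda: ac.findall(text))

        version = f"{sys.version_info.major}.{sys.version_info.minor}"
        print(f"Python {version}: build {build_ms:.2f} ms, findall {search_ms:.2f} ms")
```

Taking the minimum over a few repeats reduces noise from other processes; the absolute numbers will differ from the pytest-benchmark means, but the ratio between two builds should be comparable.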

commented

I'd be happy to accept a PR that adds a new GitHub Actions wheel building workflow.

I can probably just copy one from another project.

commented

I've uploaded a new release, acora 2.4. Please try it out.

commented

I also tried some benchmarks and could not see a major difference between Py3.9/10/11, BTW.

Awesome, thanks for the quick turnaround Stefan!
I tried both the benchmark and the problematic snippet of code we isolated (using filefind); we are down from 25s to 1s for the parsing!