kno10 / python-kmedoids

Fast K-Medoids clustering in Python with FasterPAM

Wheel not pre-built for manylinux1 (AWS Lambda)

davnn opened this issue · comments

Interestingly, with the newly released version 0.3.2, the install fails on the default Linux GitHub Actions runner, because it attempts a source install (no appropriate wheel found?).

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Cargo, the Rust package manager, is not installed or is not on PATH.
      This package requires Rust and Cargo to compile extensions. Install it through
      the system's package manager or via https://rustup.rs/
      Checking for Rust toolchain....
      [end of output]

I cannot reproduce this locally on Linux or Windows. I might find time to track down the issue further, but maybe you have an idea?

Which pip version is installed? Does updating pip first help? I have seen pip update calls in many GitHub Actions scripts.
Maybe the binary wheel tags we use now are only supported in pip >= 20.3, which added PEP 600 support.
I don't think we should add pip >= 20.3 to the requirements, so all we can do is document this in the install instructions. (Supposedly this is fine on ALT Linux 10+, RHEL 9+, Debian 11+, Fedora 34+, Mageia 8+, and Ubuntu 21.04+, but possibly not yet on Ubuntu 20.04.4 LTS; we could then try to also build manylinux2014 tags.)
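
To make the pip version question concrete, here is a minimal sketch of the check, assuming the 20.3 threshold mentioned above is the right cut-off. The helper name `pip_supports_manylinux_x_y` is made up for illustration and is not part of pip or kmedoids.

```python
import re


def pip_supports_manylinux_x_y(pip_version: str) -> bool:
    """Return True if the given pip version string is at least 20.3.

    Hypothetical helper: it only compares the major and minor numbers
    and assumes that 20.3 is the first pip release that accepts
    PEP 600 style manylinux_x_y tags.
    """
    # Pull out the numeric components, e.g. "20.3.4" -> ["20", "3", "4"].
    parts = re.findall(r"\d+", pip_version)
    # Pad with zeros so that a bare "22" is treated as 22.0.
    major, minor = (int(p) for p in (parts + ["0", "0"])[:2])
    return (major, minor) >= (20, 3)


if __name__ == "__main__":
    for version in ("19.3.1", "20.2.4", "20.3", "22.0.4"):
        print(version, pip_supports_manylinux_x_y(version))
```

The installed version can be read from the output of `pip --version` and passed to the helper.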

I just found out that this is not the fault of the GitHub runner: there appears to be no suitable 0.3.2 wheel for the AWS Lambda Docker image, although there is one for 0.3.1. The following Docker image fails to build:

FROM public.ecr.aws/lambda/python:3.9
RUN pip3 install --upgrade pip
RUN pip3 install kmedoids==0.3.2

The next example builds just fine.

FROM public.ecr.aws/lambda/python:3.9
RUN pip3 install --upgrade pip
RUN pip3 install kmedoids==0.3.1

I couldn't find out yet what the AWS Lambda docker image is based on, unfortunately.

Edit: The corresponding Dockerfile for the AWS image does not provide much insight either.

The 0.3.1 release seems to have generated kmedoids-0.3.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl wheels,
while 0.3.2 only has wheels with a newer compatibility level, kmedoids-0.3.2-cp39-cp39-manylinux_2_28_x86_64.whl.
I do not know why the older compatibility level manylinux1 was not generated.
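
To illustrate why one wheel installs and the other does not, here is a rough sketch that reads the platform tags from the two wheel filenames above and compares the glibc version they require against a system glibc version. The function names are made up for illustration, the check ignores the Python, ABI, and architecture parts of the tag, and the glibc 2.26 used in the example is an assumption, not a documented property of the AWS image.

```python
import re
from typing import List, Optional, Tuple

# Legacy manylinux aliases and the glibc versions they map to (PEP 600).
LEGACY_ALIASES = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}


def platform_tags(wheel_filename: str) -> List[str]:
    """Return the platform tags, i.e. the last dash-separated field."""
    stem = wheel_filename[: -len(".whl")]
    # A compressed tag set joins several platform tags with dots.
    return stem.split("-")[-1].split(".")


def min_glibc(tag: str) -> Optional[Tuple[int, int]]:
    """Return the glibc version a manylinux tag requires, or None."""
    match = re.match(r"manylinux_(\d+)_(\d+)_", tag)
    if match:
        return int(match.group(1)), int(match.group(2))
    for alias, version in LEGACY_ALIASES.items():
        if tag.startswith(alias + "_"):
            return version
    return None  # not a manylinux tag


def glibc_compatible(wheel_filename: str, system_glibc: Tuple[int, int]) -> bool:
    """True if at least one platform tag needs a glibc no newer than the system's."""
    required = [min_glibc(tag) for tag in platform_tags(wheel_filename)]
    return any(v is not None and v <= system_glibc for v in required)


if __name__ == "__main__":
    assumed_glibc = (2, 26)  # assumption for illustration only
    for name in (
        "kmedoids-0.3.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl",
        "kmedoids-0.3.2-cp39-cp39-manylinux_2_28_x86_64.whl",
    ):
        print(name, glibc_compatible(name, assumed_glibc))
```

On a real system, the glibc version could be read with `platform.libc_ver()` from the standard library instead of being hard-coded.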
The AWS images appear to be pretty much undocumented, just some tarred artifacts stored in Git LFS, unfortunately.

Indeed, the manylinux documentation mentions that manylinux1 is compatible with Amazon Linux 1: https://github.com/pypa/manylinux
and that "Support for manylinux1 will end on January 1st 2022". So maybe the manylinux1 wheels were disabled by some upstream change that affected the 0.3.2 build.

So the question is how to release manylinux1, manylinux2010, manylinux2014 and manylinux_x_y packages simultaneously.

The maturin docs mention that "If you want to publish widely usable wheels for linux pypi, you need to use a manylinux docker image." As a real-world example, maturin refers to the fast-ctc-decode package. Their build workflow looks really clean and, judging by the built wheels, they seem to be able to support all manylinux variants (except manylinux2014?).

We should probably align our Github action with the one fast-ctc-decode uses.

We have been using the same container as for 0.3.1, and manylinux1 was no longer built. So the change that caused manylinux1 to be dropped may have been in any of the dependencies. The question is whether this is still worth it, given that manylinux1 dates from ca. 2016 and its support has already been discontinued.

I tried building on the manylinux1 containers, but that just causes dependency hell, as various GitHub Actions no longer work on such an old OS. scipy and scikit-learn also appear to have stopped supporting manylinux1. For example, the checkout action fails because node has linking issues, or the Python install action fails to install Python 3.8:
/__t/Python/3.8.13/x64/bin/pip: /opt/hostedtoolcache/Python/3.8.13/x64/bin/python: bad interpreter: No such file or directory

It appears that scikit-learn dropped manylinux2010 support with 1.1.0 and numpy dropped manylinux2010 support with 1.22.0. I would propose to align with the numpy manylinux support for future releases, i.e. we should try to provide manylinux2014 and manylinux2 support for now. I will shortly submit a pull request to add support.

Pull request looks good. Do these packages install on AWS Lambda images?

Yes, it's not really documented, but apparently they support manylinux2014 and newer.