Lightning-Universe / lightning-flash

Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains

Home Page: https://lightning-flash.readthedocs.io

ERROR: Could not build wheels for pandas, scikit-learn, which is required to install pyproject.toml-based projects

davidgilbertson opened this issue

🐛 Bug

To Reproduce

pip install lightning-flash[tabular] fails with a lot of errors. There are thousands of lines of output, so I'm not sure which parts to share.

A sample:

building 'pandas._libs.algos' extension
      creating build/temp.linux-x86_64-cpython-310
      creating build/temp.linux-x86_64-cpython-310/pandas
      creating build/temp.linux-x86_64-cpython-310/pandas/_libs
      x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -DNPY_NO_DEPRECATED_API=0 -I./pandas/_libs -Ipandas/_libs/src/klib -I/tmp/pip-build-env-0xcqjj9j/overlay/lib/python3.10/site-packages/numpy/core/include -I/home/davidg/.virtualenvs/learning/include -I/usr/include/python3.10 -c pandas/_libs/algos.c -o build/temp.linux-x86_64-cpython-310/pandas/_libs/algos.o
      pandas/_libs/algos.c:42:10: fatal error: Python.h: No such file or directory
         42 | #include "Python.h"
            |          ^~~~~~~~~~
      compilation terminated.
      error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pandas
  Building wheel for scikit-learn (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for scikit-learn (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [1003 lines of output]
      Partial import of sklearn during the build process.
      <string>:116: DeprecationWarning:

        `numpy.distutils` is deprecated since NumPy 1.23.0, as a result
        of the deprecation of `distutils` itself. It will be removed for
        Python >= 3.12. For older Python versions it will remain present.
        It is recommended to use `setuptools < 60.0` for those Python versions.
        For more details, see:
          https://numpy.org/devdocs/reference/distutils_status_migration.html


      INFO: C compiler: x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC

And, an interesting part...?

In file included from sklearn/svm/src/libsvm/libsvm_template.cpp:6:
      sklearn/svm/src/libsvm/svm.cpp: In function ‘const char* svm_check_parameter(const svm_problem*, const svm_parameter*)’:
      sklearn/svm/src/libsvm/svm.cpp:3130:11: warning: ISO C++ forbids converting a string constant to ‘char*’ [-Wwrite-strings]
       3130 |    msg =  "Invalid input - all samples have zero or negative weights.";

And the final lines of the error:

INFO:
      ########### CLIB COMPILER OPTIMIZATION ###########
      INFO: Platform      :
        Architecture: x64
        Compiler    : gcc

      CPU baseline  :
        Requested   : 'min'
        Enabled     : SSE SSE2 SSE3
        Flags       : -msse -msse2 -msse3
        Extra checks: none

      CPU dispatch  :
        Requested   : 'max -xop -fma4'
        Enabled     : SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2 AVX512F AVX512CD AVX512_KNL AVX512_KNM AVX512_SKX AVX512_CLX AVX512_CNL AVX512_ICL
        Generated   : none
      INFO: CCompilerOpt.cache_flush[857] : write cache to path -> /tmp/pip-install-2tylos6b/scikit-learn_9f92637d15f141dfa069c8954878de3b/build/temp.linux-x86_64-cpython-310/ccompiler_opt_cache_clib.py
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for scikit-learn
Failed to build pandas scikit-learn
ERROR: Could not build wheels for pandas, scikit-learn, which is required to install pyproject.toml-based projects

I already have Pandas 1.4.4 and scikit-learn 1.1.2 installed.

I notice it's trying to install quite an old version of scikit-learn (0.24). Can that be avoided?

Environment

  • OS (e.g., Linux): Windows 11 host, running in WSL2 (Ubuntu).
  • Python version: 3.10.7
  • PyTorch/Lightning/Flash Version (e.g., 1.10/1.5/0.7):
    • torch: 1.12.1+cu116
    • pytorch-lightning: 1.7.7
    • lightning-flash: 0.8.0
  • GPU models and configuration: RTX3090
  • Any other relevant information: On Stack Overflow, I see advice about needing to run sudo apt-get install python3-dev to avoid the error about Python.h. Surely I shouldn't need to do that, though, just to get an old version of a package I already have. And if I do, should that be in the docs?
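
For anyone hitting the same Python.h error, a quick way to tell whether the C headers are present before pip tries to compile anything is sketched below. This is only a minimal check using the standard-library sysconfig module; the helper names python_header_path and has_python_headers are made up for illustration, and inside some virtualenv setups the reported include directory may differ from the one the compiler actually searches.

```python
import os
import sysconfig


def python_header_path():
    """Return the path where this interpreter expects Python.h to live."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def has_python_headers():
    """True if Python.h exists, which source builds of C extensions need."""
    return os.path.isfile(python_header_path())


# Report the expected header location and whether it is present.
status = "found" if has_python_headers() else "missing"
print(f"{python_header_path()}: {status}")
```

If this reports "missing" on Ubuntu, the header package suggested on Stack Overflow (python3-dev) is the usual candidate, though that is an assumption about the system rather than something this check can confirm.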

Additional context

pip install lightning-flash works fine.

The above errors are all in WSL/Ubuntu. I have the same packages installed on my Windows machine (although the minor versions will vary) and there I get a different error. First, there was an error about not having C++ tools installed, so I installed those. Then, running pip install lightning-flash[tabular], I get another wall of errors. The last part is:

 building 'pandas._libs.parsers' extension
      "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.33.31629\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -DNPY_NO_DEPRECATED_API=0 -I.\pandas\_libs -Ipandas/_libs/src/klib -Ipandas/_libs/src -IC:\Users\david\AppData\Local\Temp\pip-build-env-mp9k3z9t\overlay\Lib\site-packages\numpy\core\include "-IC:\Program Files\Python310\include" "-IC:\Program Files\Python310\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.33.31629\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\cppwinrt" /Tcpandas/_libs/src/parser/io.c /Fobuild\temp.win-amd64-cpython-310\Release\pandas/_libs/src/parser/io.obj
      io.c
      pandas/_libs/src/parser/io.c(139): error C2065: 'ssize_t': undeclared identifier
      pandas/_libs/src/parser/io.c(139): error C2146: syntax error: missing ';' before identifier 'rv'
      pandas/_libs/src/parser/io.c(139): error C2065: 'rv': undeclared identifier
      pandas/_libs/src/parser/io.c(145): error C2065: 'rv': undeclared identifier
      pandas/_libs/src/parser/io.c(145): warning C4267: 'function': conversion from 'size_t' to 'unsigned int', possible loss of data
      pandas/_libs/src/parser/io.c(146): error C2065: 'rv': undeclared identifier
      pandas/_libs/src/parser/io.c(157): error C2065: 'rv': undeclared identifier
      pandas/_libs/src/parser/io.c(158): error C2065: 'rv': undeclared identifier
      error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.33.31629\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pandas
  Building wheel for antlr4-python3-runtime (setup.py) ... done
  Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.8-py3-none-any.whl size=141210 sha256=a2f7093bde5cb7466fd2d761d5726bd87d75f82bd0bb277b2f9ee7d8d2357232
  Stored in directory: c:\users\david\appdata\local\pip\cache\wheels\a7\20\bd\e1477d664f22d99989fd28ee1a43d6633dddb5cb9e801350d5
  Building wheel for scikit-learn (pyproject.toml) ... done
  Created wheel for scikit-learn: filename=scikit_learn-0.24.2-cp310-cp310-win_amd64.whl size=6269569 sha256=b0d4460d7c5b65d8daf43e24fb43b2bb641eeed2c55a66d61297c2469797aaf5
  Stored in directory: c:\users\david\appdata\local\pip\cache\wheels\13\a4\68\4e78865652fa14db4a162b491e5138565f97646f9e1f2ab8cc
  Building wheel for PyYAML (pyproject.toml) ... done
  Created wheel for PyYAML: filename=PyYAML-5.4.1-cp310-cp310-win_amd64.whl size=45655 sha256=e80ae26593bbca066131a7044857c731fa526bed252d2a87ea8e96aee806bd97
  Stored in directory: c:\users\david\appdata\local\pip\cache\wheels\c7\0d\22\696ee92245ad710f506eee79bb05c740d8abccd3ecdb778683
  Building wheel for pyperclip (setup.py) ... done
  Created wheel for pyperclip: filename=pyperclip-1.8.2-py3-none-any.whl size=11123 sha256=202846dfde61c94edf0b1755f7e41a5f5aafa99d701896228eadcee6dff07581
  Stored in directory: c:\users\david\appdata\local\pip\cache\wheels\04\24\fe\140a94a7f1036003ede94579e6b4227fe96c840c6f4dcbe307
Successfully built pytorch-tabular antlr4-python3-runtime scikit-learn PyYAML pyperclip
Failed to build pandas
ERROR: Could not build wheels for pandas, which is required to install pyproject.toml-based projects

And to reiterate: I already have Pandas installed; this is blowing up while trying to install a very old version of Pandas.

This is probably not related, but I notice that it downloads an older version of torch.

I already have things working fine with torch and pytorch-lightning; I'm just interested in trying out the "easy" lightning-flash. But so far I can't get it installed on Windows OR Ubuntu. And I certainly don't want to go uninstalling packages that I have working so that they can be replaced with the old versions requested by flash.

Any ideas?

Hey, @davidgilbertson - sorry that you are facing this issue. Since the error says Python.h was not found, it looks like the required header file is missing. If you are on Ubuntu (and so presumably using the apt package manager), can you please try: sudo apt install libpython3.x-dev, where x is your Python 3 minor version? For me it was 3.10, so I had to run: sudo apt install libpython3.10-dev.

Please let me know if this solves or doesn't solve your issue.

Thanks @krshrimali. I tried as you suggested and got more errors. I tried with --fix-missing and got the same errors. Then I ran sudo apt update, and that resolved those errors.

Anyway, now I get new errors. It seems like this uninstalled my pytorch-lightning@1.7.7 and installed pytorch-lightning@1.3.6, and then I get an error:

lightning-bolts 0.5.0 requires pytorch-lightning>=1.4.0, but you have pytorch-lightning 1.3.6 which is incompatible.

And also I get the below error, not surprising since lightning-flash brings in such an old scikit-learn.

yellowbrick 1.5 requires scikit-learn>=1.0.0, but you have scikit-learn 0.24.2 which is incompatible.
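
The two conflict messages above come down to comparing an installed version against a requirement specifier. A rough sketch of that comparison follows; it is a deliberately simplified stand-in, not pip's actual resolver, and it only understands plain numeric versions such as 1.3.6 (no pre-releases, wildcards, or the ~= operator). The function names parse_version and satisfies are hypothetical.

```python
import operator
import re

# Comparison operators supported by this simplified checker.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}

CLAUSE = re.compile(r"\s*(>=|<=|==|!=|<|>)\s*(\d+(?:\.\d+)*)\s*")


def parse_version(text):
    """Turn '0.24.2' into (0, 24, 2); only plain numeric releases are handled."""
    return tuple(int(part) for part in text.split("."))


def _pad(a, b):
    """Pad the shorter tuple with zeros so that 1.5 compares equal to 1.5.0."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies(installed, specifier):
    """Check an installed version against a comma-separated specifier string."""
    have = parse_version(installed)
    for clause in specifier.split(","):
        match = CLAUSE.fullmatch(clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, wanted = match.groups()
        left, right = _pad(have, parse_version(wanted))
        if not OPS[op](left, right):
            return False
    return True


# The two conflicts reported in this thread:
print(satisfies("1.3.6", ">=1.4.0"))   # False: lightning-bolts vs pytorch-lightning 1.3.6
print(satisfies("0.24.2", ">=1.0.0"))  # False: yellowbrick vs scikit-learn 0.24.2
```

For real projects, the packaging library implements the full version and specifier rules and would be the better choice; the hand-rolled version here only keeps the example free of third-party imports.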

If I may offer a suggestion: if this is to be a package aimed at being 'easy' to use, it really needs to work without having to install libpython-dev and really needs to avoid installing very old dependencies.

Specifically, you could:

  • use the current version of pytorch-forecasting, which doesn't require scikit-learn>=0.23,<0.25
  • pytorch-tabular has a hard requirement of pandas==1.1.5. They've relaxed this in their source code but haven't released a new version; you could push them to do so and then use that version. Or, since that package seems pretty inactive, consider whether you need it at all.

Then installs would be nice and smooth, probably.
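
To find which installed packages declare exact pins or upper bounds like the ones described above, the declared requirements can be read from package metadata. The sketch below uses the standard-library importlib.metadata module; the function name capped_requirements is hypothetical, and the check for "==" or "<" is a crude string test on the specifier part only, not a full parse of the requirement syntax.

```python
from importlib import metadata


def capped_requirements(dist_name):
    """List declared requirements of an installed distribution that contain
    an exact pin (==) or an upper bound (< or <=).

    Returns an empty list if the distribution is not installed or declares
    no requirements.
    """
    try:
        declared = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []

    capped = []
    for line in declared:
        # Drop environment markers such as '; extra == "tabular"' so that the
        # '==' inside a marker is not mistaken for a version pin.
        spec = line.split(";", 1)[0]
        if "==" in spec or "<" in spec:
            capped.append(line)
    return capped


# Example: inspect the packages discussed in this issue, if they are installed.
for name in ("pytorch-tabular", "pytorch-forecasting", "lightning-flash"):
    print(name, capped_requirements(name))
```

This only reports what an already installed version declares, so it cannot show what a newer release would require.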