seomoz / reppy

Modern robots.txt Parser for Python

GCC fails when trying to install reppy through pip

dcfreire opened this issue:

I cannot install reppy in either python 3.8 or 3.10.

# pip3.8 install reppy                                                                                                                                                                  
Collecting reppy
  Using cached reppy-0.4.14.tar.gz (93 kB)
  Preparing metadata (setup.py) ... done
Requirement already satisfied: cachetools in /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages (from reppy) (5.0.0)
Requirement already satisfied: python-dateutil!=2.0,>=1.5 in /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages (from reppy) (2.8.2)
Requirement already satisfied: requests in /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages (from reppy) (2.27.1)
Requirement already satisfied: six in /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages (from reppy) (1.16.0)
Requirement already satisfied: certifi>=2017.4.17 in /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages (from requests->reppy) (2021.10.8)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages (from requests->reppy) (1.26.9)
Requirement already satisfied: idna<4,>=2.5 in /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages (from requests->reppy) (3.3)
Requirement already satisfied: charset-normalizer~=2.0.0 in /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages (from requests->reppy) (2.0.12)
Building wheels for collected packages: reppy
  Building wheel for reppy (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /home/shi/.virtualenvs/ir-1/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-b0oy_bsz/reppy_b67f12bbf68043dab8f5014f214e8614/setup.py'"'"'; __file__='"'"'/tmp/pip-install-b0oy_bsz/reppy_b67f12bbf68043dab8f5014f214e8614/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-40ydxajn
       cwd: /tmp/pip-install-b0oy_bsz/reppy_b67f12bbf68043dab8f5014f214e8614/
  Complete output (42 lines):
  Building from C++
  /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages/setuptools/dist.py:723: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
    warnings.warn(
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.8
  creating build/lib.linux-x86_64-3.8/reppy
  copying reppy/ttl.py -> build/lib.linux-x86_64-3.8/reppy
  copying reppy/exceptions.py -> build/lib.linux-x86_64-3.8/reppy
  copying reppy/util.py -> build/lib.linux-x86_64-3.8/reppy
  copying reppy/__init__.py -> build/lib.linux-x86_64-3.8/reppy
  creating build/lib.linux-x86_64-3.8/reppy/cache
  copying reppy/cache/policy.py -> build/lib.linux-x86_64-3.8/reppy/cache
  copying reppy/cache/__init__.py -> build/lib.linux-x86_64-3.8/reppy/cache
  running build_ext
  creating build/temp.linux-x86_64-3.8
  creating build/temp.linux-x86_64-3.8/reppy
  creating build/temp.linux-x86_64-3.8/reppy/rep-cpp
  creating build/temp.linux-x86_64-3.8/reppy/rep-cpp/deps
  creating build/temp.linux-x86_64-3.8/reppy/rep-cpp/deps/url-cpp
  creating build/temp.linux-x86_64-3.8/reppy/rep-cpp/deps/url-cpp/src
  creating build/temp.linux-x86_64-3.8/reppy/rep-cpp/src
  gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -DOPENSSL_NO_SSL2 -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -fPIC -Ireppy/rep-cpp/include -Ireppy/rep-cpp/deps/url-cpp/include -I/home/shi/.virtualenvs/ir-1/include -I/usr/include/python3.8 -c reppy/rep-cpp/deps/url-cpp/src/psl.cpp -o build/temp.linux-x86_64-3.8/reppy/rep-cpp/deps/url-cpp/src/psl.o -std=c++11
  In file included from reppy/rep-cpp/deps/url-cpp/src/psl.cpp:7:
  reppy/rep-cpp/deps/url-cpp/include/punycode.h:56:54: error: ‘numeric_limits’ is not a member of ‘std’
     56 |         const punycode_uint MAX_PUNYCODE_UINT = std::numeric_limits<punycode_uint>::max();
        |                                                      ^~~~~~~~~~~~~~
  reppy/rep-cpp/deps/url-cpp/include/punycode.h:56:82: error: expected primary-expression before ‘>’ token
     56 |         const punycode_uint MAX_PUNYCODE_UINT = std::numeric_limits<punycode_uint>::max();
        |                                                                                  ^
  reppy/rep-cpp/deps/url-cpp/include/punycode.h:56:85: error: ‘::max’ has not been declared; did you mean ‘std::max’?
     56 |         const punycode_uint MAX_PUNYCODE_UINT = std::numeric_limits<punycode_uint>::max();
        |                                                                                     ^~~
        |                                                                                     std::max
  In file included from /usr/include/c++/11.2.0/algorithm:62,
                   from reppy/rep-cpp/deps/url-cpp/src/psl.cpp:1:
  /usr/include/c++/11.2.0/bits/stl_algo.h:3467:5: note: ‘std::max’ declared here
   3467 |     max(initializer_list<_Tp> __l, _Compare __comp)
        |     ^~~
  error: command '/usr/bin/gcc' failed with exit code 1
  ----------------------------------------
  ERROR: Failed building wheel for reppy
  Running setup.py clean for reppy
Failed to build reppy
Installing collected packages: reppy
    Running setup.py install for reppy ... error
    ERROR: Command errored out with exit status 1:
     command: /home/shi/.virtualenvs/ir-1/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-b0oy_bsz/reppy_b67f12bbf68043dab8f5014f214e8614/setup.py'"'"'; __file__='"'"'/tmp/pip-install-b0oy_bsz/reppy_b67f12bbf68043dab8f5014f214e8614/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-rsopie2y/install-record.txt --single-version-externally-managed --compile --install-headers /home/shi/.virtualenvs/ir-1/include/site/python3.8/reppy
         cwd: /tmp/pip-install-b0oy_bsz/reppy_b67f12bbf68043dab8f5014f214e8614/
    Complete output (44 lines):
    Building from C++
    /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages/setuptools/dist.py:723: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
      warnings.warn(
    running install
    /home/shi/.virtualenvs/ir-1/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
      warnings.warn(
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.8
    creating build/lib.linux-x86_64-3.8/reppy
    copying reppy/ttl.py -> build/lib.linux-x86_64-3.8/reppy
    copying reppy/exceptions.py -> build/lib.linux-x86_64-3.8/reppy
    copying reppy/util.py -> build/lib.linux-x86_64-3.8/reppy
    copying reppy/__init__.py -> build/lib.linux-x86_64-3.8/reppy
    creating build/lib.linux-x86_64-3.8/reppy/cache
    copying reppy/cache/policy.py -> build/lib.linux-x86_64-3.8/reppy/cache
    copying reppy/cache/__init__.py -> build/lib.linux-x86_64-3.8/reppy/cache
    running build_ext
    creating build/temp.linux-x86_64-3.8
    creating build/temp.linux-x86_64-3.8/reppy
    creating build/temp.linux-x86_64-3.8/reppy/rep-cpp
    creating build/temp.linux-x86_64-3.8/reppy/rep-cpp/deps
    creating build/temp.linux-x86_64-3.8/reppy/rep-cpp/deps/url-cpp
    creating build/temp.linux-x86_64-3.8/reppy/rep-cpp/deps/url-cpp/src
    creating build/temp.linux-x86_64-3.8/reppy/rep-cpp/src
    gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -DOPENSSL_NO_SSL2 -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -fPIC -Ireppy/rep-cpp/include -Ireppy/rep-cpp/deps/url-cpp/include -I/home/shi/.virtualenvs/ir-1/include -I/usr/include/python3.8 -c reppy/rep-cpp/deps/url-cpp/src/psl.cpp -o build/temp.linux-x86_64-3.8/reppy/rep-cpp/deps/url-cpp/src/psl.o -std=c++11
    In file included from reppy/rep-cpp/deps/url-cpp/src/psl.cpp:7:
    reppy/rep-cpp/deps/url-cpp/include/punycode.h:56:54: error: ‘numeric_limits’ is not a member of ‘std’
       56 |         const punycode_uint MAX_PUNYCODE_UINT = std::numeric_limits<punycode_uint>::max();
          |                                                      ^~~~~~~~~~~~~~
    reppy/rep-cpp/deps/url-cpp/include/punycode.h:56:82: error: expected primary-expression before ‘>’ token
       56 |         const punycode_uint MAX_PUNYCODE_UINT = std::numeric_limits<punycode_uint>::max();
          |                                                                                  ^
    reppy/rep-cpp/deps/url-cpp/include/punycode.h:56:85: error: ‘::max’ has not been declared; did you mean ‘std::max’?
       56 |         const punycode_uint MAX_PUNYCODE_UINT = std::numeric_limits<punycode_uint>::max();
          |                                                                                     ^~~
          |                                                                                     std::max
    In file included from /usr/include/c++/11.2.0/algorithm:62,
                     from reppy/rep-cpp/deps/url-cpp/src/psl.cpp:1:
    /usr/include/c++/11.2.0/bits/stl_algo.h:3467:5: note: ‘std::max’ declared here
     3467 |     max(initializer_list<_Tp> __l, _Compare __comp)
          |     ^~~
    error: command '/usr/bin/gcc' failed with exit code 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /home/shi/.virtualenvs/ir-1/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-b0oy_bsz/reppy_b67f12bbf68043dab8f5014f214e8614/setup.py'"'"'; __file__='"'"'/tmp/pip-install-b0oy_bsz/reppy_b67f12bbf68043dab8f5014f214e8614/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-rsopie2y/install-record.txt --single-version-externally-managed --compile --install-headers /home/shi/.virtualenvs/ir-1/include/site/python3.8/reppy Check the logs for full command output.

I built reppy from source for now. As the error message indicates, the problem is that punycode.h, a header in the url-cpp dependency, uses std::numeric_limits without including <limits>.

Workaround: use Clang:

CC=clang pip install reppy

The usual Python solution is to publish a wheel, so that people installing the package don't have to think about which compiler to use.

Interestingly, using Clang does not work for me with Python 2.7, but it does work with Python 3.8.

I found a solution:
add the following lines at the top of the file reppy/rep-cpp/deps/url-cpp/include/punycode.h:

#include <stdexcept>
#include <limits>
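For anyone applying this by hand to an unpacked copy of the reppy source, the edit can be scripted. The following is only a minimal sketch: the helper names `ensure_includes` and `patch_header` are hypothetical and not part of reppy, and it assumes the header sits at the path given above relative to the source tree.

```python
import sys
from pathlib import Path

# The two include lines suggested in the comment above.
REQUIRED_INCLUDES = ("#include <stdexcept>", "#include <limits>")


def ensure_includes(header_text, includes=REQUIRED_INCLUDES):
    """Return header_text with any missing include lines prepended.

    Prepending ahead of the header's own include guard is harmless,
    because standard library headers carry their own guards.
    """
    existing = {line.strip() for line in header_text.splitlines()}
    missing = [inc for inc in includes if inc not in existing]
    if not missing:
        return header_text
    return "\n".join(missing) + "\n" + header_text


def patch_header(path):
    """Rewrite the header at `path` in place; return True if it changed."""
    header = Path(path)
    original = header.read_text(encoding="utf-8")
    patched = ensure_includes(original)
    if patched != original:
        header.write_text(patched, encoding="utf-8")
        return True
    return False


if __name__ == "__main__":
    # Usage (hypothetical script name):
    #   python patch_punycode.py reppy/rep-cpp/deps/url-cpp/include/punycode.h
    changed = patch_header(sys.argv[1])
    print("patched" if changed else "already up to date")
```

After patching, running `pip install .` from the root of the source tree should build against the modified header. The function is idempotent, so running it twice leaves the file unchanged.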

@diyanqi
Thanks - IMO this would be better fixed in https://github.com/seomoz/url-cpp, i.e. here

https://github.com/seomoz/url-cpp/blob/master/include/punycode.h

Then, as this is a Git submodule in here, the frozen version in Reppy would need to be updated.

But I think the better way is retiring Reppy entirely and using the official Python module instead:

https://docs.python.org/3.10/library/urllib.robotparser.html#module-urllib.robotparser
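For reference, here is a minimal sketch of what that module looks like in use. The robots.txt content, the user agent `my-bot`, and the `example.com` URLs are made up for illustration. Note that `urllib.robotparser` is not a drop-in replacement for reppy's API, so calling code would need to be adapted.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content; a real crawler would fetch this
# from the target site (for example with set_url() followed by read()).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# can_fetch(useragent, url) reports whether the rules allow the request.
print(parser.can_fetch("my-bot", "https://example.com/index.html"))
print(parser.can_fetch("my-bot", "https://example.com/private/data"))

# crawl_delay(useragent) returns the Crawl-delay value, or None if unset.
print(parser.crawl_delay("my-bot"))
```

Being pure Python and part of the standard library, this needs no compiler at install time, which sidesteps the build failure discussed here.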

@Gallaecio Thanks for the good pointer!

However, what shall we do with a broken module that is no longer maintained? Is there an official or at least stable fork of Reppy?

Also, in the age of large-scale supply chain attacks on open-source libraries, I think many developers will prefer a standard library module over one that is no longer maintained and has outdated dependencies. The pinned version of requests, at least, has quite a few known vulnerabilities (not super-critical ones, but still).

Clang does not work for me on GitHub Actions either; it fails with the same error as GCC.

clang -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ireppy/rep-cpp/include -Ireppy/rep-cpp/deps/url-cpp/include -I/home/runner/work/scrapy/scrapy/.tox/pylint/include -I/opt/hostedtoolcache/Python/3.8.15/x64/include/python3.8 -c reppy/rep-cpp/deps/url-cpp/src/psl.cpp -o build/temp.linux-x86_64-cpython-38/reppy/rep-cpp/deps/url-cpp/src/psl.o -std=c++11
      In file included from reppy/rep-cpp/deps/url-cpp/src/psl.cpp:7:
      reppy/rep-cpp/deps/url-cpp/include/punycode.h:56:54: error: no member named 'numeric_limits' in namespace 'std'
              const punycode_uint MAX_PUNYCODE_UINT = std::numeric_limits<punycode_uint>::max();
                                                      ~~~~~^
      reppy/rep-cpp/deps/url-cpp/include/punycode.h:56:69: error: unexpected type name 'punycode_uint': expected expression
              const punycode_uint MAX_PUNYCODE_UINT = std::numeric_limits<punycode_uint>::max();
                                                                          ^
      reppy/rep-cpp/deps/url-cpp/include/punycode.h:56:85: error: no member named 'max' in the global namespace
              const punycode_uint MAX_PUNYCODE_UINT = std::numeric_limits<punycode_uint>::max();
                                                                                        ~~^
      3 errors generated.
      error: command '/usr/bin/clang' failed with exit code 1

#132 (comment) seems like the only solution, but I don't think there is a clean way to apply a patch like that while still installing with pip.