huggingface / tokenizers

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Home Page: https://huggingface.co/docs/tokenizers

ERROR: Failed building wheel for tokenizers

outdoorblake opened this issue · comments

commented

System Info

I can't seem to get past the error "ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects" when installing transformers with pip. An ML friend of mine also tried on their own instance and hit the same problem; they helped me troubleshoot, but we weren't able to get past it, so I think it's possibly a recent issue.

I am following the transformers README install instructions step by step, with a venv and PyTorch ready to go. Pip is also fully up to date. The error output suggests installing a Rust compiler as one possible fix, but we both felt this didn't seem like the right next step: a Rust compiler usually isn't required when installing the transformers package, and the README makes no mention of needing one.

Thanks in advance!
-Blake

Full output below:

command: pip install transformers

Collecting transformers
Using cached transformers-4.21.1-py3-none-any.whl (4.7 MB)
Requirement already satisfied: tqdm>=4.27 in ./venv/lib/python3.9/site-packages (from transformers) (4.64.0)
Requirement already satisfied: huggingface-hub<1.0,>=0.1.0 in ./venv/lib/python3.9/site-packages (from transformers) (0.9.0)
Requirement already satisfied: pyyaml>=5.1 in ./venv/lib/python3.9/site-packages (from transformers) (6.0)
Requirement already satisfied: regex!=2019.12.17 in ./venv/lib/python3.9/site-packages (from transformers) (2022.8.17)
Collecting tokenizers!=0.11.3,<0.13,>=0.11.1
Using cached tokenizers-0.12.1.tar.gz (220 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy>=1.17 in ./venv/lib/python3.9/site-packages (from transformers) (1.23.2)
Requirement already satisfied: packaging>=20.0 in ./venv/lib/python3.9/site-packages (from transformers) (21.3)
Requirement already satisfied: filelock in ./venv/lib/python3.9/site-packages (from transformers) (3.8.0)
Requirement already satisfied: requests in ./venv/lib/python3.9/site-packages (from transformers) (2.26.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in ./venv/lib/python3.9/site-packages (from huggingface-hub<1.0,>=0.1.0->transformers) (4.3.0)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in ./venv/lib/python3.9/site-packages (from packaging>=20.0->transformers) (3.0.9)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in ./venv/lib/python3.9/site-packages (from requests->transformers) (1.26.7)
Requirement already satisfied: idna<4,>=2.5 in ./venv/lib/python3.9/site-packages (from requests->transformers) (3.3)
Requirement already satisfied: certifi>=2017.4.17 in ./venv/lib/python3.9/site-packages (from requests->transformers) (2021.10.8)
Requirement already satisfied: charset-normalizer~=2.0.0 in ./venv/lib/python3.9/site-packages (from requests->transformers) (2.0.7)
Building wheels for collected packages: tokenizers
Building wheel for tokenizers (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for tokenizers (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [51 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-12-arm64-cpython-39
creating build/lib.macosx-12-arm64-cpython-39/tokenizers
copying py_src/tokenizers/init.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/models
copying py_src/tokenizers/models/init.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/models
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/decoders
copying py_src/tokenizers/decoders/init.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/decoders
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/normalizers
copying py_src/tokenizers/normalizers/init.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/normalizers
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/pre_tokenizers
copying py_src/tokenizers/pre_tokenizers/init.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/pre_tokenizers
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/processors
copying py_src/tokenizers/processors/init.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/processors
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/trainers
copying py_src/tokenizers/trainers/init.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/trainers
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/byte_level_bpe.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/sentencepiece_unigram.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/sentencepiece_bpe.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/base_tokenizer.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/init.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/char_level_bpe.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
copying py_src/tokenizers/implementations/bert_wordpiece.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/implementations
creating build/lib.macosx-12-arm64-cpython-39/tokenizers/tools
copying py_src/tokenizers/tools/init.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/tools
copying py_src/tokenizers/tools/visualizer.py -> build/lib.macosx-12-arm64-cpython-39/tokenizers/tools
copying py_src/tokenizers/init.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers
copying py_src/tokenizers/models/init.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/models
copying py_src/tokenizers/decoders/init.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/decoders
copying py_src/tokenizers/normalizers/init.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/normalizers
copying py_src/tokenizers/pre_tokenizers/init.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/pre_tokenizers
copying py_src/tokenizers/processors/init.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/processors
copying py_src/tokenizers/trainers/init.pyi -> build/lib.macosx-12-arm64-cpython-39/tokenizers/trainers
copying py_src/tokenizers/tools/visualizer-styles.css -> build/lib.macosx-12-arm64-cpython-39/tokenizers/tools
running build_ext
running build_rust
error: can't find Rust compiler

  If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.
  
  To update pip, run:
  
      pip install --upgrade pip
  
  and then retry package installation.
  
  If you did intend to build this package from source, try installing a Rust compiler from your system package manager and ensure it is on the PATH during installation. Alternatively, rustup (available at https://rustup.rs) is the recommended way to download and update the Rust compiler toolchain.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for tokenizers
Failed to build tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

command: pip install transformers

(Identical full output to the log quoted under System Info above.)

Expected behavior

I would expect the transformers library to install without throwing an error when all installation prerequisites are met.

commented

I am aware of this past issue - it is very similar, but the fixes suggested there seem dated and did not work for me.

Let me move this over to tokenizers, which should be in a better position to help.

I'm also having this issue; I hadn't run into it before.

Are you on M1?
If that's the case it's expected, unfortunately. (#932)

If not, what platform are you on? (OS, hardware, Python version?)

Basically, for M1 you need to install from source (for now; fixes coming soon, #1055).

Also, the error message says you're missing a Rust compiler; it might be enough to just install one: https://www.rust-lang.org/tools/install and the install may go through. (It's easier if we prebuild those, but still.)
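Before retrying the install, a quick sanity check of whether pip's build step would actually see a Rust toolchain can look like this (a sketch; the rustup link is the one from the error message above):

```shell
# Check whether a Rust compiler is visible on PATH for the build step.
if command -v rustc >/dev/null 2>&1; then
    echo "rustc found: $(rustc --version)"
else
    # rustup (https://rustup.rs) is the recommended installer; re-open
    # the shell afterwards so PATH changes take effect.
    echo "no rustc on PATH - install one, then retry 'pip install transformers'"
fi
```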

I'm on an M2 Mac and I can't install tokenizers. The same thing works fine on my Linux machine. How can we install tokenizers on M1/M2 Macs?

M1 user here. I got the same error, installing the rust compiler fixed this for me.

It's all the same issue: we couldn't prebuild the library for M1 (which is an arm64 chip) because GitHub didn't have an arm64 action runner. We did manually push some prebuilt binaries, but it seems they contained some issues. Since then, GitHub has enabled runners on M1 machines (so all macOS + arm64), so hopefully this will be fixed in the next release.

Since this is a "major" release (still not at 1.0), we're going to do a full sweep of the slow tests in transformers (which is our biggest user), and hopefully this should work out of the box for M1 onwards after that!

@stephantul where did you get the Rust compiler? I installed it from https://www.rust-lang.org/tools/install and pip3 install tokenizers still fails.

@alibrahimzada I installed it with homebrew

@alibrahimzada you might also need pip install setuptools_rust, and your Python build needs to be shared (it depends on how you installed Python; for pyenv, for instance, you will need this: pyenv/pyenv#392).

(Careful, it's now PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install ....)
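Put together, that suggestion might look like the following sketch (assuming pyenv; 3.9.13 is just an illustrative version, and the configure option must be set when the interpreter is built, not afterwards):

```shell
# Build a Python with a shared libpython (needed so the Rust extension
# can link against it), then make sure setuptools_rust is available
# before the tokenizers source build starts.
PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install 3.9.13  # illustrative version
pip install setuptools_rust
pip install tokenizers
```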

Having the same problem and none of the above suggestions worked. Any ETA on when we can expect the next release that fixes this bug?

I am on M1 and managed to work around this in the following way:
I installed a Rust compiler using brew, and then initialized it:
brew install rustup
rustup-init
Then I restarted the console and checked that it is installed: rustc --version. It turned out you also have to set up the path:
export PATH="$HOME/.cargo/bin:$PATH"
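Note that an export typed at the prompt only lasts for the current shell; to make it survive a terminal restart, the same line can go into the shell profile (a sketch assuming zsh, the default on recent macOS; adjust the file for bash):

```shell
# Persist cargo's bin directory on PATH for future shells, then verify.
echo 'export PATH="$HOME/.cargo/bin:$PATH"' >> "$HOME/.zshrc"
export PATH="$HOME/.cargo/bin:$PATH"
command -v rustc >/dev/null && rustc --version || echo "rustc still not on PATH"
```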

I have done everything @Narsil and @anibzlv have suggested. Still no luck... (I'm on an M1, 2021.)

Oddly enough, the library works just fine inside a virtual environment on my MBP with the M1 chip. So for now, that's my approach.

I could install from source and it seems to be working.

Has anyone tried to install the latest version on M1? The prebuilt binaries should be released now!

I tried installing on M1 just now in a python3.10 virtual environment. All I had to do was pip install setuptools_rust. Then I could install all the required packages.

I'm running on M2 with Python 3.8 and am still running into this problem.
Any workaround other than installing from source?

I thought Python 3.8 was not built for M1/M2... so this library cannot provide builds for it.

Are you sure you are not in compatibility mode rather than really using a native 3.8?
https://stackoverflow.com/questions/69511006/cant-install-pyenv-3-8-5-on-macos-big-sur-with-m1-chip

Try telling pip to prefer binaries; it'll probably give you an older version of tokenizers, but otherwise you would need to build from source. It does depend on the version requirements for tokenizers.
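That prefer-binary suggestion maps to a pip flag; a sketch (the version specifier mirrors the transformers 4.21 pin from the log above, and pip may then resolve an older tokenizers release that ships a wheel for your platform):

```shell
# Favor prebuilt wheels over a newer sdist that would need a local
# Rust build; pip may pick an older tokenizers release as a result.
pip install --prefer-binary "tokenizers!=0.11.3,<0.13,>=0.11.1"
```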

The proper fix would be for Hugging Face to create wheels for Apple Silicon.

We already build wheels for Apple Silicon! Just not for Python 3.8, which isn't supposed to exist on M1 (only 3.9, 3.10, and now 3.11).

Where's the binary wheel for 0.12.1? PyPI can't find it.
I'm having to use 0.11.6 to avoid having "install rust" as an instruction for installing user software.

GitHub did not provide an action runner for M1 at the time, so builds were manual (and infrequent).

Any reason you cannot upgrade to 0.13.2 or 0.12.6 ?

But yes, for some older versions the M1 builds are not present; we're not doing retroactive builds, unfortunately.
I'm basically the sole maintainer here, and I don't really have the time to figure out all the old versions for all platforms (but ensuring that once a platform is supported it keeps working is something we're committed to).

In the project there are a number of other third-party Python modules dependent on tokenizers; from yesterday's build I got the following version constraint from pip:

Collecting tokenizers!=0.11.3,<0.13,>=0.11.1

Not sure why it's not picking 0.12.6; setting pip to prefer binaries installed 0.11.6.

EDIT: answering my own question:
https://pypi.org/simple/tokenizers/ goes straight from 0.12.1 to 0.13.0; there is no 0.12.6.
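Assuming pip ≥ 21.2, the available releases can also be listed with pip's (still experimental) index command, instead of reading the simple index page by hand:

```shell
# List the tokenizers releases pip can see on the configured index.
pip index versions tokenizers
```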

Hmm interesting, could you try force installing 0.12.6 and see if that fixes it?

Could you share your env (Python version + hardware (M1 I guess) + requirements.txt)?

I don't remember the command, but there's a way to make pip explain its decisions regarding versions.

I got confused with 0.11.6, sorry!

And I don't see the 0.12 builds for arm; I'm guessing we moved to 0.13 first.

TBH there "shouldn't" be any major differences between 0.12.1 and 0.13, so if you could switch, that might work (I took caution since we updated the PyO3 bindings version, and that triggered a lot of code changes, even if we didn't intend any functional changes).

transformers is probably the one limiting tokenizers (we do that to let tokenizers make eventual breaking changes).
Maybe you could try updating it?

It's a bit convoluted ATM, as different OSes currently require different versions of gfpgan unless you install torch upfront.

So I do

pip install "torch<1.13" "torchvision<1.14"

Main requirements.txt
-r requirements-base.txt

protobuf==3.19.6
torch<1.13.0
torchvision<0.14.0
-e .

requirements-base.txt

pip will resolve the version which matches torch

albumentations
dependency_injector==4.40.0
diffusers
einops
eventlet
flask==2.1.3
flask_cors==3.0.10
flask_socketio==5.3.0
flaskwebgui==0.3.7
getpass_asterisk
gfpgan
huggingface-hub
imageio
imageio-ffmpeg
kornia
numpy
omegaconf
opencv-python
pillow
pip>=22
pudb
pyreadline3
pytorch-lightning==1.7.7
realesrgan
scikit-image>=0.19
send2trash
streamlit
taming-transformers-rom1504
test-tube
torch-fidelity
torchmetrics
transformers==4.21.*
git+https://github.com/openai/CLIP.git@main#egg=clip
git+https://github.com/Birch-san/k-diffusion.git@mps#egg=k-diffusion
git+https://github.com/invoke-ai/clipseg.git@models-rename#egg=clipseg

I'll have to see why we limit transformers, assuming the reasoning hasn't been lost to history.

I am on M1 and managed to work around this in the following way:
I installed a Rust compiler using brew, and then initialized it:
brew install rustup
rustup-init
Then I restarted the console and checked that it is installed: rustc --version. It turned out you also have to set up the path:
export PATH="$HOME/.cargo/bin:$PATH"

I used this approach and it worked for me on M2. Thank you so much!

For Windows:

  • Install Visual Studio (latest version, 2022)
  • Install the Python development workload
  • Install the Desktop development with C++ workload

Are you on M1? If that's the case it's expected, unfortunately. (#932)

If not, what platform are you on? (OS, hardware, Python version?)

Basically, for M1 you need to install from source (for now; fixes coming soon, #1055).

Also, the error message says you're missing a Rust compiler; it might be enough to just install one: https://www.rust-lang.org/tools/install and the install may go through. (It's easier if we prebuild those, but still.)

I am using M1, and simply installing the Rust compiler worked in my case.

@emalineg do you mind sharing a little bit more about your setup?

Python version:
MacOS version:
Conda ?:
etc.. ?

We're trying to prebuild the binaries so you don't have to compile from source on most common platforms, and M1 is now part of that list.

Same bug with Rust on my MacBook Air M2 2022. pip install transformers => error with Rust.

@justlearntutors what is your Python version?

Can you run transformers-cli env and paste the output?

It works now.

I did:
brew install rustup
rustup-init
Then I restarted the console and checked that it is installed: rustc --version. It turned out you also have to set up the path:
export PATH="$HOME/.cargo/bin:$PATH"

and

python3 -m pip install transformers
python -m pip install transformers

@Narsil's observation above helped me on M1: I upgraded from Python 3.8 to Python 3.9 and now all is well.

same problem on windows 11 using python 3.11

You're entirely correct; somehow we're missing Windows amd64 wheels for Python 3.11. I'll look into it.

I'm getting the same error on Android (Pydroid 3) while installing transformers - nothing seems to help so far. Maybe you have some ideas?

Can you provide more information on your environment?

Python version
CPU version
tokenizers version
Type of Python install (conda, pip, pyenv, etc.)

Have you tried installing the Rust compiler?

Still an error on Google Colab:
!git clone https://github.com/huggingface/transformers && cd transformers && git checkout a3085020ed0d81d4903c50967687192e3101e770
!pip install ./transformers/

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Processing ./transformers
Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (1.22.4)
Collecting tokenizers==0.0.11 (from transformers==2.3.0)
Downloading tokenizers-0.0.11.tar.gz (30 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting boto3 (from transformers==2.3.0)
Downloading boto3-1.26.144-py3-none-any.whl (135 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 135.6/135.6 kB 15.5 MB/s eta 0:00:00
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (3.12.0)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (2.27.1)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (4.65.0)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (2022.10.31)
Collecting sentencepiece (from transformers==2.3.0)
Downloading sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 51.6 MB/s eta 0:00:00
Collecting sacremoses (from transformers==2.3.0)
Downloading sacremoses-0.0.53.tar.gz (880 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 880.6/880.6 kB 53.7 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Collecting botocore<1.30.0,>=1.29.144 (from boto3->transformers==2.3.0)
Downloading botocore-1.29.144-py3-none-any.whl (10.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.8/10.8 MB 113.7 MB/s eta 0:00:00
Collecting jmespath<2.0.0,>=0.7.1 (from boto3->transformers==2.3.0)
Downloading jmespath-1.0.1-py3-none-any.whl (20 kB)
Collecting s3transfer<0.7.0,>=0.6.0 (from boto3->transformers==2.3.0)
Downloading s3transfer-0.6.1-py3-none-any.whl (79 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 79.8/79.8 kB 10.1 MB/s eta 0:00:00
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (1.26.15)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (2022.12.7)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (2.0.12)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (3.4)
Requirement already satisfied: six in /usr/local/lib/python3.10/dist-packages (from sacremoses->transformers==2.3.0) (1.16.0)
Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from sacremoses->transformers==2.3.0) (8.1.3)
Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from sacremoses->transformers==2.3.0) (1.2.0)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/local/lib/python3.10/dist-packages (from botocore<1.30.0,>=1.29.144->boto3->transformers==2.3.0) (2.8.2)
Building wheels for collected packages: transformers, tokenizers, sacremoses
Building wheel for transformers (setup.py) ... done
Created wheel for transformers: filename=transformers-2.3.0-py3-none-any.whl size=458550 sha256=236e7cf5654e4cff65da41ee3a83e39d34fbea6396b8051e9243120a5cae5dde
Stored in directory: /tmp/pip-ephem-wheel-cache-wlywjaz5/wheels/7c/35/80/e946b22a081210c6642e607ed65b2a5b9a4d9259695ee2caf5
error: subprocess-exited-with-error

× Building wheel for tokenizers (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Building wheel for tokenizers (pyproject.toml) ... error
ERROR: Failed building wheel for tokenizers
Building wheel for sacremoses (setup.py) ... done
Created wheel for sacremoses: filename=sacremoses-0.0.53-py3-none-any.whl size=895241 sha256=099fd152876aa843c9f04a284c7f7c9260d266b181e672796d1619a0f7e2be76
Stored in directory: /root/.cache/pip/wheels/00/24/97/a2ea5324f36bc626e1ea0267f33db6aa80d157ee977e9e42fb
Successfully built transformers sacremoses
Failed to build tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects

commented

I would recommend installing tokenizers version 0.11.6 instead of the 0.0.11 that is being fetched with the commit you checked out.
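A sketch of that idea (hypothetical: 0.11.6 installed from a wheel for a commenter above, but it may not be API-compatible with a transformers checkout this old, and --no-deps means the remaining requirements like boto3 and sacremoses must be installed by hand):

```shell
# Install a tokenizers release that has a binary wheel first, then stop
# pip from re-resolving the source-only tokenizers==0.0.11 pin.
pip install "tokenizers==0.11.6"
pip install --no-deps ./transformers/
```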

  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip. ERROR: Failed building wheel for tokenizers Failed to build tokenizers ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects
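For context on the "prebuilt wheel" hint in the output above: pip only installs a wheel whose tags match the running interpreter and platform; when none matches, it falls back to the sdist, which is what triggers the Rust build. A minimal stdlib sketch (the helper name is mine, not pip's) of the interpreter tag pip matches against:

```python
# Hypothetical helper, not part of pip: shows the CPython tag that a wheel
# such as tokenizers-0.12.1-cp39-cp39-macosx_12_0_arm64.whl must carry to
# be installable on this interpreter.
import sys
import sysconfig

def interpreter_wheel_tag() -> str:
    """Return a CPython wheel tag like 'cp39' for the running interpreter."""
    major, minor = sys.version_info[:2]
    return f"cp{major}{minor}"

if __name__ == "__main__":
    # e.g. 'cp39 macosx-12-arm64' on the setup from this issue
    print(interpreter_wheel_tag(), sysconfig.get_platform())
```

If the printed tag/platform pair has no corresponding wheel on PyPI for the pinned tokenizers version, pip has no choice but to build from source.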

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

command: pip install transformers

(Full output identical to the log shown under System Info above.)

Expected behavior

I would expect the transformers library to install without throwing an error when all installation prerequisites are met.

i got the same issue with pydroid

hi @Narsil

I'm using a MacBook M2 and Python 3.11.5, but I still encounter the same problem.
Is there any related information or workaround I can use for now?

thank you πŸ™

commented

@Teofebano can't you just install the wheels we released online? The following worked for me on an M1; not sure why an M2 would be different.

conda create -n py3.11 python=3.11
conda activate py3.11
pip install tokenizers
commented

@Teofebano do you need such an 'old' version of transformers?
The reason you're having this issue is that transformers requires a version of tokenizers for which there is no macOS wheel (the same problem I had if you scroll up), so pip builds it from source.

Alternatively, install Rust so it can be built (no, I didn't want to do that either).
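To make the "no wheel for this pin" point concrete: the specifier from the log at the top of the issue, `tokenizers!=0.11.3,<0.13,>=0.11.1`, narrows the candidates to the 0.11/0.12 series, which predate cp311 wheels. A toy sketch of specifier filtering (illustrative only; real resolution uses pip and `packaging`, not these hand-rolled helpers):

```python
# Toy version-specifier filter (illustrative, not pip's resolver).
# Handles only the clause shapes that appear in this issue: >=, <, !=.

def vtuple(v: str) -> tuple:
    """'0.12.1' -> (0, 12, 1), so tuples compare numerically."""
    return tuple(int(p) for p in v.split("."))

def satisfies(version: str, spec: str) -> bool:
    for clause in spec.split(","):
        if clause.startswith(">="):
            if vtuple(version) < vtuple(clause[2:]):
                return False
        elif clause.startswith("!="):
            if version == clause[2:]:
                return False
        elif clause.startswith("<"):
            if vtuple(version) >= vtuple(clause[1:]):
                return False
    return True

candidates = ["0.11.0", "0.11.3", "0.11.6", "0.12.1", "0.13.0"]
spec = ">=0.11.1,<0.13,!=0.11.3"
print([v for v in candidates if satisfies(v, spec)])  # -> ['0.11.6', '0.12.1']
```

pip picks the newest surviving candidate (0.12.1 here) and, finding no wheel for the current interpreter, downloads the sdist.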

commented

Especially if you are using a recent version of Python, it is quite possible that it won't be compatible with old versions of transformers.

Any solution now?

commented

Hey @bruce2233, if you have an issue with building the wheel, make sure to share a reproducer, a full traceback, and the machine you are running this on, and confirm that none of the proposed solutions worked for you!

Spent 27 hours trying to get DeepSpeed working on a tool, only to run into this error and be blocked. tokenizers is already installed, but installing anything else seems to make pip try to reinstall it, and the build fails to compile due to a Rust issue.

error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell`
         --> tokenizers-lib/src/models/bpe/trainer.rs:526:47
          |
      522 |                     let w = &words[*i] as *const _ as *mut _;
          |                             -------------------------------- casting happend here
      ...
      526 |                         let word: &mut Word = &mut (*w);
          |                                               ^^^^^^^^^
          |
          = note: `#[deny(invalid_reference_casting)]` on by default

      warning: `tokenizers` (lib) generated 3 warnings
      error: could not compile `tokenizers` (lib) due to previous error; 3 warnings emitted

Albeit this was on WSL2, notorious for failures of a catastrophic degree.

error: could not compile tokenizers (lib) due to previous error; 3 warnings emitted

I get this same error on my M2 MBP on Sonoma.

      error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell`
         --> tokenizers-lib/src/models/bpe/trainer.rs:526:47
          |
      522 |                     let w = &words[*i] as *const _ as *mut _;
          |                             -------------------------------- casting happend here
      ...
      526 |                         let word: &mut Word = &mut (*w);
          |                                               ^^^^^^^^^
          |
          = note: `#[deny(invalid_reference_casting)]` on by default
      
      warning: `tokenizers` (lib) generated 3 warnings
      error: could not compile `tokenizers` (lib) due to previous error; 3 warnings emitted

Ubuntu 18 LTS, Rust installed via "pipe the internet to a shell"... So the compiler is too new? The Ubuntu-packaged one is too old for one of the deps...

For what it's worth: Above setup but Rust version 1.67.1 and -A invalid_reference_casting (via RUSTFLAGS) and it does compile then (haven't yet got to testing if it actually works, though...).

To prevent tokenizers from building with the latest stable Rust toolchain (which changed the invalid_reference_casting lint from allow-by-default to deny-by-default in 1.73.0), I'm using this in the Dockerfile now:

RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain=1.72.1 -y
ENV PATH="/root/.cargo/bin:${PATH}"
ENV RUSTUP_TOOLCHAIN=1.72.1

For what it's worth: Above setup but Rust version 1.67.1 and -A invalid_reference_casting (via RUSTFLAGS) and it does compile then (haven't yet got to testing if it actually works, though...).

Worked for me on WSL. Thanks! I'll give a bit more detail for this method:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain=1.67.1 -y
source "$HOME/.cargo/env"
export RUSTFLAGS="-A invalid_reference_casting"
python3 -m pip install -e ./modules/tortoise-tts/
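One detail worth flagging about this recipe: RUSTFLAGS must actually be in the environment of the pip process. A bare `VAR=value` shell assignment without `export` (and without prefixing it to the command itself) is not inherited by child processes. A small sketch of passing the variable explicitly from Python instead, equivalent to `export`-then-`pip` in a shell:

```python
# Sketch: hand RUSTFLAGS to a child process explicitly via env=, so the
# Rust build invoked by pip would see the lint override.
import os
import subprocess
import sys

env = dict(os.environ, RUSTFLAGS="-A invalid_reference_casting")

# Spawn a child and read the variable back; the child sees it only
# because we passed env= explicitly.
out = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('RUSTFLAGS', ''))"],
    env=env, capture_output=True, text=True, check=True,
).stdout.strip()
print(out)  # -> -A invalid_reference_casting
```

The same `env=` dict could be passed to a `subprocess.run([..., "-m", "pip", "install", ...])` call if you drive the install from a script.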

Which version of tokenizers are you all using ?

This was fixed in tokenizers 0.14.1, released as soon as Rust 1.73.1 came out.

I met this problem and failed to resolve it with any of the approaches mentioned above. However, when I downgraded my Python version from 3.11 to 3.10, everything worked. I hope this helps.

We already build wheels for Apple Silicon! Just not for Python 3.8, which isn't supposed to exist on M1 (only 3.9, 3.10, and 3.11 now).

this was a crucial hint thanks!

Or just upgrade your tokenizers versions :)

(We prebuild tokenizers, just like most precompiled Python libs, for the set of Python versions that are current at the time of building, so old releases are built only for old Pythons.)
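The "old releases are built on old Pythons" point can be checked by hand, since a wheel's filename encodes the interpreter tag it was built for. An illustrative sketch (not pip's internals, and real wheel filenames may also carry an optional build tag):

```python
# Illustrative helpers (not pip's API): pull the tag triple out of a simple
# wheel filename and compare the python tag against the running interpreter.
import sys

def wheel_tags(filename: str) -> tuple:
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename
    of the simple name-version-pytag-abitag-plattag.whl form."""
    stem = filename.removesuffix(".whl")
    name, version, py_tag, abi_tag, plat_tag = stem.split("-")
    return py_tag, abi_tag, plat_tag

def matches_interpreter(py_tag: str) -> bool:
    """True when the wheel's python tag fits this interpreter."""
    current = "cp%d%d" % sys.version_info[:2]
    return py_tag in (current, "py3")  # 'py3' = pure-Python wheel

py_tag, abi, plat = wheel_tags("tokenizers-0.12.1-cp39-cp39-macosx_12_0_arm64.whl")
print(py_tag, abi, plat)  # -> cp39 cp39 macosx_12_0_arm64
print(matches_interpreter(py_tag))
```

On Python 3.11, the cp39 wheel above does not match, so pip skips it and falls back to the sdist; a newer tokenizers release that ships a cp311 wheel avoids the source build entirely.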

commented

Is it working now, should it be closed?

Closing as completed

Problem persists on python 3.12 for me. Windows 11, reverting to python 3.10 worked.

still error for google colab:

!git clone https://github.com/huggingface/transformers && cd transformers && git checkout a3085020ed0d81d4903c50967687192e3101e770
!pip install ./transformers/

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Processing ./transformers
  Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (1.22.4)
Collecting tokenizers==0.0.11 (from transformers==2.3.0)
  Downloading tokenizers-0.0.11.tar.gz (30 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting boto3 (from transformers==2.3.0)
  Downloading boto3-1.26.144-py3-none-any.whl (135 kB)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (3.12.0)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (2.27.1)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (4.65.0)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers==2.3.0) (2022.10.31)
Collecting sentencepiece (from transformers==2.3.0)
  Downloading sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Collecting sacremoses (from transformers==2.3.0)
  Downloading sacremoses-0.0.53.tar.gz (880 kB)
  Preparing metadata (setup.py) ... done
Collecting botocore<1.30.0,>=1.29.144 (from boto3->transformers==2.3.0)
  Downloading botocore-1.29.144-py3-none-any.whl (10.8 MB)
Collecting jmespath<2.0.0,>=0.7.1 (from boto3->transformers==2.3.0)
  Downloading jmespath-1.0.1-py3-none-any.whl (20 kB)
Collecting s3transfer<0.7.0,>=0.6.0 (from boto3->transformers==2.3.0)
  Downloading s3transfer-0.6.1-py3-none-any.whl (79 kB)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (1.26.15)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (2022.12.7)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (2.0.12)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==2.3.0) (3.4)
Requirement already satisfied: six in /usr/local/lib/python3.10/dist-packages (from sacremoses->transformers==2.3.0) (1.16.0)
Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from sacremoses->transformers==2.3.0) (8.1.3)
Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from sacremoses->transformers==2.3.0) (1.2.0)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/local/lib/python3.10/dist-packages (from botocore<1.30.0,>=1.29.144->boto3->transformers==2.3.0) (2.8.2)
Building wheels for collected packages: transformers, tokenizers, sacremoses
  Building wheel for transformers (setup.py) ... done
  Created wheel for transformers: filename=transformers-2.3.0-py3-none-any.whl size=458550 sha256=236e7cf5654e4cff65da41ee3a83e39d34fbea6396b8051e9243120a5cae5dde
  Stored in directory: /tmp/pip-ephem-wheel-cache-wlywjaz5/wheels/7c/35/80/e946b22a081210c6642e607ed65b2a5b9a4d9259695ee2caf5
  Building wheel for tokenizers (pyproject.toml) ... error
  error: subprocess-exited-with-error

  Γ— Building wheel for tokenizers (pyproject.toml) did not run successfully.
  β”‚ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for tokenizers
  Building wheel for sacremoses (setup.py) ... done
  Created wheel for sacremoses: filename=sacremoses-0.0.53-py3-none-any.whl size=895241 sha256=099fd152876aa843c9f04a284c7f7c9260d266b181e672796d1619a0f7e2be76
  Stored in directory: /root/.cache/pip/wheels/00/24/97/a2ea5324f36bc626e1ea0267f33db6aa80d157ee977e9e42fb
Successfully built transformers sacremoses
Failed to build tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects

Did you get a solution for this?

Downgrading to python 3.10 worked for me (Ubuntu 20.04, tokenizers 0.13.3).

Yes, we were a bit slow to ship binaries for py311 and later.