stanford-crfm / mistral

Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.

[Installation] Resolving dependency chain due to the latest Transformer version

sameeravithana opened this issue · comments

The current *-gpu.yaml fails due to issues in the library dependency chain introduced by the latest Transformers version (4.12). For example, there are dependency conflicts between Transformers and certain versions of the huggingface_hub, datasets, and other libraries. Which Transformers version should we use for a smooth installation?

commented

Hi, I think this should be fixed in dev.

I have been running things with:

datasets 1.4.0
huggingface-hub 0.0.19
transformers 4.12.0.dev0

The latest version of our code is built on top of recent transformers versions.

When I get a chance I'll rebuild the environments from scratch to verify these issues are all gone, thanks for letting me know.
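If you want to double-check what pip actually resolved to in your environment, a quick stdlib-only sketch (the package names are the ones from the versions listed above) is:

```python
# Print the installed version of each package in the dependency chain,
# or None for packages that are not installed at all.
from importlib.metadata import version, PackageNotFoundError

def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None

for pkg in ("datasets", "huggingface-hub", "transformers", "tqdm"):
    print(f"{pkg}: {installed_version(pkg)}")
```

Comparing this output against the versions above is a quicker check than re-running the resolver.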

commented

If you have any other issues please let me know and can help address.

The conflict is caused by:

The user requested huggingface-hub==0.0.19
  transformers 4.13.0.dev0 depends on huggingface-hub>=0.0.17
  datasets 1.4.0 depends on huggingface-hub==0.0.2

With datasets 1.14.0, the following conflict is raised.

The user requested tqdm==4.49.0
    experiment-impact-tracker 0.1.9 depends on tqdm
    transformers 4.13.0.dev0 depends on tqdm>=4.27
    datasets 1.14.0 depends on tqdm>=4.62.1
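The conflict above is easy to verify by hand: the requested tqdm 4.49.0 satisfies transformers' floor but not datasets'. A minimal sketch of that comparison (plain tuple ordering; real resolvers also handle pre-release and dev suffixes, which this deliberately ignores):

```python
def parse(version):
    # "4.62.1" -> (4, 62, 1); naive: assumes purely numeric dotted versions
    return tuple(int(part) for part in version.split("."))

def satisfies(version, minimum):
    # True when `version` meets a ">=minimum" constraint under tuple ordering
    return parse(version) >= parse(minimum)

requested = "4.49.0"
print(satisfies(requested, "4.27"))    # transformers needs tqdm>=4.27   -> True
print(satisfies(requested, "4.62.1"))  # datasets needs tqdm>=4.62.1     -> False
```

Since no single tqdm version can be both ==4.49.0 and >=4.62.1, pip has no choice but to fail.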
commented

Hi I'll try rebuilding the file from scratch today ... I think just changing the tqdm requirement in the file will fix this.

I updated tqdm to 4.62.1; the pip dependency resolver then ran for more than two hours and hit another conflict:

ERROR: Cannot install fsspec==0.8.7, fsspec[http]==2021.10.0, fsspec[http]==2021.10.1, fsspec[http]==2021.5.0, fsspec[http]==2021.6.0, fsspec[http]==2021.6.1, fsspec[http]==2021.7.0, fsspec[http]==2021.8.1 and fsspec[http]==2021.9.0 because these package versions have conflicting dependencies.
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
commented

I'll either create a new environment.yaml file or just stop recommending usage of those. In the meantime, this command should initialize your environment very quickly (assuming you are using CUDA 10.2; change it depending on what CUDA you have installed). I've honestly found rebuilding from a conda .yaml to hang for some reason, so just running these two commands should give you a properly set up environment, and it shouldn't take more than a couple of minutes.

I will update the official docs to reflect this, hopefully this weekend:

pip install torch transformers datasets huggingface-hub deepspeed jsonlines quinine wandb
conda install cudatoolkit=10.2
commented

Note: I ran these commands in a fresh conda environment with Python 3.8.8.

commented

Also, be aware that you have to make sure CUDA and torch are consistent: pip install torch will just grab the latest PyTorch, which I got working with CUDA 10.2.
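One way to sanity-check that pairing after installing (a hedged sketch; `torch.version.cuda` is None on CPU-only builds, and the helper name here is just for illustration):

```python
def torch_cuda_info():
    """Return (torch version, CUDA build version, CUDA usable) or None if torch is absent."""
    try:
        import torch
    except ImportError:
        return None
    return torch.__version__, torch.version.cuda, torch.cuda.is_available()

info = torch_cuda_info()
if info is None:
    print("torch is not installed")
else:
    print(f"torch {info[0]}, built for CUDA {info[1]}, usable: {info[2]}")
```

If the reported CUDA build doesn't match the toolkit you installed via conda, that's the mismatch to fix before training.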

commented

So I think the new official advice for environment setup will be:

conda create -n mistral python=3.8.8
conda activate mistral
conda install cudatoolkit=10.2
pip install torch transformers datasets huggingface-hub deepspeed jsonlines quinine wandb
commented

Depending on the PyTorch, CUDA, and Python versions you want to use, you can change them in the above commands, but as noted, make sure PyTorch is compatible with your CUDA version.

Thank you. This works; I was able to run the default runs without any issue.