stanford-crfm / mistral

Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.

[Installation] Resolving dependency chain due to the latest Transformer version

sameeravithana opened this issue · comments

The current *-gpu.yaml fails due to issues in the library dependency chain introduced by the latest Transformers version (4.12). For example, there are dependency conflicts between Transformers and certain versions of the huggingface_hub, datasets, and other libraries. Which Transformers version should we use for a smooth installation?

commented

Hi, I think this should be fixed in dev.

I have been running things with:

datasets 1.4.0
huggingface-hub 0.0.19
transformers 4.12.0.dev0

The latest version of our code is built on top of recent transformers versions.

When I get a chance I'll rebuild the environments from scratch to verify these issues are all gone, thanks for letting me know.
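If you want to double-check what pip actually resolved to in your environment, a quick stdlib-only sketch (the package names are the ones from the versions listed above) is:

```python
# Print the installed version of each package in the dependency chain,
# or None for packages that are not installed at all.
from importlib.metadata import version, PackageNotFoundError

def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None

for pkg in ("datasets", "huggingface-hub", "transformers", "tqdm"):
    print(f"{pkg}: {installed_version(pkg)}")
```

Comparing this output against the versions above is a quicker check than re-running the resolver.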

commented

If you have any other issues please let me know and can help address.

The conflict is caused by:

The user requested huggingface-hub==0.0.19
  transformers 4.13.0.dev0 depends on huggingface-hub>=0.0.17
  datasets 1.4.0 depends on huggingface-hub==0.0.2

With datasets 1.14.0, the following conflict is raised.

The user requested tqdm==4.49.0
    experiment-impact-tracker 0.1.9 depends on tqdm
    transformers 4.13.0.dev0 depends on tqdm>=4.27
    datasets 1.14.0 depends on tqdm>=4.62.1
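The conflict above is easy to verify by hand: the requested tqdm 4.49.0 satisfies transformers' floor but not datasets'. A minimal sketch of that comparison (plain tuple ordering; real resolvers also handle pre-release and dev suffixes, which this deliberately ignores):

```python
def parse(version):
    # "4.62.1" -> (4, 62, 1); naive: assumes purely numeric dotted versions
    return tuple(int(part) for part in version.split("."))

def satisfies(version, minimum):
    # True when `version` meets a ">=minimum" constraint under tuple ordering
    return parse(version) >= parse(minimum)

requested = "4.49.0"
print(satisfies(requested, "4.27"))    # transformers needs tqdm>=4.27   -> True
print(satisfies(requested, "4.62.1"))  # datasets needs tqdm>=4.62.1     -> False
```

Since no single tqdm version can be both ==4.49.0 and >=4.62.1, pip has no choice but to fail.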
commented

Hi I'll try rebuilding the file from scratch today ... I think just changing the tqdm requirement in the file will fix this.

I updated tqdm to 4.62.1; the pip dependency resolver then ran for more than two hours and hit another conflict:

ERROR: Cannot install fsspec==0.8.7, fsspec[http]==2021.10.0, fsspec[http]==2021.10.1, fsspec[http]==2021.5.0, fsspec[http]==2021.6.0, fsspec[http]==2021.6.1, fsspec[http]==2021.7.0, fsspec[http]==2021.8.1 and fsspec[http]==2021.9.0 because these package versions have conflicting dependencies.
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
commented

I'll either create a new environment.yaml file or just stop recommending usage of those. In the meantime, this command should initialize your environment very quickly (assuming you are using CUDA 10.2; change it depending on what CUDA you have installed). I've honestly found rebuilding from a conda .yaml to hang for some reason, so just running these two commands should give you a properly set up environment, and it shouldn't take more than a couple of minutes.

I will update the official docs to reflect this, hopefully this weekend:

pip install torch transformers datasets huggingface-hub deepspeed jsonlines quinine wandb
conda install cudatoolkit=10.2
commented

Note: I ran these commands in a fresh conda environment with Python 3.8.8.

commented

Also, be aware that you have to make sure CUDA and torch are consistent: pip install torch will just grab the latest PyTorch, which I got working with CUDA 10.2.
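One way to sanity-check that pairing after installing (a hedged sketch; `torch.version.cuda` is None on CPU-only builds, and the helper name here is just for illustration):

```python
def torch_cuda_info():
    """Return (torch version, CUDA build version, CUDA usable) or None if torch is absent."""
    try:
        import torch
    except ImportError:
        return None
    return torch.__version__, torch.version.cuda, torch.cuda.is_available()

info = torch_cuda_info()
if info is None:
    print("torch is not installed")
else:
    print(f"torch {info[0]}, built for CUDA {info[1]}, usable: {info[2]}")
```

If the reported CUDA build doesn't match the toolkit you installed via conda, that's the mismatch to fix before training.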

commented

So I think the new official advice for environment setup will be:

conda create -n mistral python=3.8.8
conda activate mistral
conda install cudatoolkit=10.2
pip install torch transformers datasets huggingface-hub deepspeed jsonlines quinine wandb
commented

Depending on the PyTorch, CUDA, and Python versions you want to use, you can change them in the above commands, but as noted, make sure PyTorch is compatible with your CUDA version.

Thank you. This works; I was able to run the default runs without any issue.