Riccorl / transformer-srl

Reimplementation of a BERT-based model (Shi et al., 2019), currently the state of the art for English SRL. This model also implements predicate disambiguation.

Error Loading State Dict

DavidSorge opened this issue · comments

I pip-installed transformer-srl in a fresh conda environment (on a CentOS 7 system), downloaded the pre-trained model, and followed the instructions in the README.
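
For reference, this is essentially what I ran (a minimal sketch; the resources/ directory is just where I saved the downloaded archive):

from pathlib import Path
from transformer_srl import predictors

# pre-trained model archive downloaded from the link in the README
srl_model_path = Path('resources', 'srl_bert_base_conll2012.tar.gz')

# loading the predictor is the step that raises the error below
predictor = predictors.SrlTransformersPredictor.from_path(srl_model_path, 'transformer-srl')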

I got these results:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-3-bfc9a5474abe> in <module>
      1 srl_model_path = Path('resources', 'srl_bert_base_conll2012.tar.gz')
----> 2 predictor = predictors.SrlTransformersPredictor.from_path(srl_model_path, 'transformer-srl')

~/.conda/envs/pogrep4/lib/python3.9/site-packages/transformer_srl/predictors.py in from_path(cls, archive_path, predictor_name, cuda_device, dataset_reader_to_load, frozen, import_plugins, language, restrict_frames, restrict_roles)
    157             plugins.import_plugins()
    158         return SrlTransformersPredictor.from_archive(
--> 159             load_archive(archive_path, cuda_device=cuda_device),
    160             predictor_name,
    161             dataset_reader_to_load=dataset_reader_to_load,

~/.conda/envs/pogrep4/lib/python3.9/site-packages/allennlp/models/archival.py in load_archive(archive_file, cuda_device, overrides, weights_file)
    206             config.duplicate(), serialization_dir
    207         )
--> 208         model = _load_model(config.duplicate(), weights_path, serialization_dir, cuda_device)
    209     finally:
    210         if tempdir is not None:

~/.conda/envs/pogrep4/lib/python3.9/site-packages/allennlp/models/archival.py in _load_model(config, weights_path, serialization_dir, cuda_device)
    240 
    241 def _load_model(config, weights_path, serialization_dir, cuda_device):
--> 242     return Model.load(
    243         config,
    244         weights_file=weights_path,

~/.conda/envs/pogrep4/lib/python3.9/site-packages/allennlp/models/model.py in load(cls, config, serialization_dir, weights_file, cuda_device)
    404             # get_model_class method, that recurses whenever it finds a from_archive model type.
    405             model_class = Model
--> 406         return model_class._load(config, serialization_dir, weights_file, cuda_device)
    407 
    408     def extend_embedder_vocab(self, embedding_sources_mapping: Dict[str, str] = None) -> None:

~/.conda/envs/pogrep4/lib/python3.9/site-packages/allennlp/models/model.py in _load(cls, config, serialization_dir, weights_file, cuda_device)
    346 
    347         if unexpected_keys or missing_keys:
--> 348             raise RuntimeError(
    349                 f"Error loading state dict for {model.__class__.__name__}\n\t"
    350                 f"Missing keys: {missing_keys}\n\t"

RuntimeError: Error loading state dict for TransformerSrlSpan
	Missing keys: ['transformer.embeddings.position_ids']
	Unexpected keys: []

Did you install the latest "stable" release (2.5)?

Yes, I just retried and verified.

# packages in environment at /home/username/.conda/envs/pogrep4:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
allennlp                  2.0.1                    pypi_0    pypi
allennlp-models           2.0.1                    pypi_0    pypi
attrs                     20.3.0                   pypi_0    pypi
blis                      0.7.4                    pypi_0    pypi
boto3                     1.17.36                  pypi_0    pypi
botocore                  1.20.36                  pypi_0    pypi
ca-certificates           2020.12.5            ha878542_0    conda-forge
catalogue                 1.0.0                    pypi_0    pypi
certifi                   2020.12.5        py39hf3d152e_1    conda-forge
chardet                   4.0.0                    pypi_0    pypi
click                     7.1.2                    pypi_0    pypi
conllu                    4.3                      pypi_0    pypi
cymem                     2.0.5                    pypi_0    pypi
filelock                  3.0.12                   pypi_0    pypi
ftfy                      5.9                      pypi_0    pypi
h5py                      3.2.1                    pypi_0    pypi
idna                      2.10                     pypi_0    pypi
iniconfig                 1.1.1                    pypi_0    pypi
jmespath                  0.10.0                   pypi_0    pypi
joblib                    1.0.1                    pypi_0    pypi
jsonnet                   0.17.0                   pypi_0    pypi
jsonpickle                2.0.0                    pypi_0    pypi
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
libffi                    3.3                  h58526e2_2    conda-forge
libgcc-ng                 9.3.0               h2828fa1_18    conda-forge
libgfortran-ng            9.3.0               hff62375_18    conda-forge
libgfortran5              9.3.0               hff62375_18    conda-forge
libgomp                   9.3.0               h2828fa1_18    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
libopenblas               0.3.12          pthreads_h4812303_1    conda-forge
libstdcxx-ng              9.3.0               h6de172a_18    conda-forge
lmdb                      1.1.1                    pypi_0    pypi
more-itertools            8.7.0                    pypi_0    pypi
murmurhash                1.0.5                    pypi_0    pypi
ncurses                   6.2                  h58526e2_4    conda-forge
nltk                      3.5                      pypi_0    pypi
numpy                     1.20.1           py39hdbf815f_0    conda-forge
openssl                   1.1.1j               h7f98852_0    conda-forge
overrides                 3.1.0                    pypi_0    pypi
packaging                 20.9                     pypi_0    pypi
pandas                    1.2.3            py39hde0f152_0    conda-forge
pillow                    8.1.2                    pypi_0    pypi
pip                       21.0.1             pyhd8ed1ab_0    conda-forge
plac                      1.1.3                    pypi_0    pypi
pluggy                    0.13.1                   pypi_0    pypi
preshed                   3.0.5                    pypi_0    pypi
protobuf                  3.15.6                   pypi_0    pypi
py                        1.10.0                   pypi_0    pypi
py-rouge                  1.1                      pypi_0    pypi
pyparsing                 2.4.7                    pypi_0    pypi
pytest                    6.2.2                    pypi_0    pypi
python                    3.9.2           hffdb5ce_0_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python_abi                3.9                      1_cp39    conda-forge
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
readline                  8.0                  he28a2e2_2    conda-forge
regex                     2021.3.17                pypi_0    pypi
requests                  2.25.1                   pypi_0    pypi
s3transfer                0.3.6                    pypi_0    pypi
sacremoses                0.0.43                   pypi_0    pypi
scikit-learn              0.24.1                   pypi_0    pypi
scipy                     1.6.2                    pypi_0    pypi
sentencepiece             0.1.95                   pypi_0    pypi
setuptools                49.6.0           py39hf3d152e_3    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
spacy                     2.3.5                    pypi_0    pypi
sqlite                    3.35.2               h74cdb3f_0    conda-forge
srsly                     1.0.5                    pypi_0    pypi
tensorboardx              2.1                      pypi_0    pypi
thinc                     7.4.5                    pypi_0    pypi
threadpoolctl             2.1.0                    pypi_0    pypi
tk                        8.6.10               h21135ba_1    conda-forge
tokenizers                0.9.4                    pypi_0    pypi
toml                      0.10.2                   pypi_0    pypi
torch                     1.7.1                    pypi_0    pypi
torchvision               0.8.2                    pypi_0    pypi
tqdm                      4.59.0                   pypi_0    pypi
transformer-srl           2.5                      pypi_0    pypi
transformers              4.2.2                    pypi_0    pypi
typing-extensions         3.7.4.3                  pypi_0    pypi
tzdata                    2021a                he74cb21_0    conda-forge
urllib3                   1.26.4                   pypi_0    pypi
wasabi                    0.8.2                    pypi_0    pypi
wcwidth                   0.2.5                    pypi_0    pypi
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
word2number               1.1                      pypi_0    pypi
xz                        5.2.5                h516909a_1    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge

That is probably a dependency problem with the transformers library. Try to install an older version of transformer-srl, for example 2.4.6; it is the one used to train that model.

Thanks for the suggestion. I just tried pip install transformer-srl==2.4.6, but the install failed:

Building wheels for collected packages: tokenizers
  Building wheel for tokenizers (PEP 517): started
  Building wheel for tokenizers (PEP 517): finished with status 'error'
Failed to build tokenizers

Pip subprocess error:
    ERROR: Command errored out with exit status 1:
     command: /home/dsorge/.conda/envs/pogrep4/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-gt27wgpa/sentencepiece_88c8f62ee5a347fdbe6614963dca9cc3/setup.py'"'"'; __file__='"'"'/tmp/pip-install-gt27wgpa/sentencepiece_88c8f62ee5a347fdbe6614963dca9cc3/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-55eewitn
         cwd: /tmp/pip-install-gt27wgpa/sentencepiece_88c8f62ee5a347fdbe6614963dca9cc3/
    Complete output (5 lines):
    Package sentencepiece was not found in the pkg-config search path.
    Perhaps you should add the directory containing `sentencepiece.pc'
    to the PKG_CONFIG_PATH environment variable
    No package 'sentencepiece' found
    Failed to find sentencepiece pkgconfig
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/ba/f6/520b56e5977f62aee48833da8b4ff2fdc2b10ebfa0dd78556b1d707d4086/sentencepiece-0.1.91.tar.gz#sha256=f9700cf607ea064d9fad34c751fbf49953dcc56fe68c54b277481aa0aec5c18f (from https://pypi.org/simple/sentencepiece/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
  ERROR: Command errored out with exit status 1:
   command: /home/dsorge/.conda/envs/pogrep4/bin/python /home/dsorge/.conda/envs/pogrep4/lib/python3.9/site-packages/pip/_vendor/pep517/_in_process.py build_wheel /tmp/tmp7xtprer2
       cwd: /tmp/pip-install-gt27wgpa/tokenizers_f82fafc07f3146578a6ff5b934d9f1cc
  Complete output (47 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib
  creating build/lib/tokenizers
  copying py_src/tokenizers/__init__.py -> build/lib/tokenizers
  creating build/lib/tokenizers/models
  copying py_src/tokenizers/models/__init__.py -> build/lib/tokenizers/models
  creating build/lib/tokenizers/decoders
  copying py_src/tokenizers/decoders/__init__.py -> build/lib/tokenizers/decoders
  creating build/lib/tokenizers/normalizers
  copying py_src/tokenizers/normalizers/__init__.py -> build/lib/tokenizers/normalizers
  creating build/lib/tokenizers/pre_tokenizers
  copying py_src/tokenizers/pre_tokenizers/__init__.py -> build/lib/tokenizers/pre_tokenizers
  creating build/lib/tokenizers/processors
  copying py_src/tokenizers/processors/__init__.py -> build/lib/tokenizers/processors
  creating build/lib/tokenizers/trainers
  copying py_src/tokenizers/trainers/__init__.py -> build/lib/tokenizers/trainers
  creating build/lib/tokenizers/implementations
  copying py_src/tokenizers/implementations/sentencepiece_unigram.py -> build/lib/tokenizers/implementations
  copying py_src/tokenizers/implementations/char_level_bpe.py -> build/lib/tokenizers/implementations
  copying py_src/tokenizers/implementations/__init__.py -> build/lib/tokenizers/implementations
  copying py_src/tokenizers/implementations/byte_level_bpe.py -> build/lib/tokenizers/implementations
  copying py_src/tokenizers/implementations/bert_wordpiece.py -> build/lib/tokenizers/implementations
  copying py_src/tokenizers/implementations/base_tokenizer.py -> build/lib/tokenizers/implementations
  copying py_src/tokenizers/implementations/sentencepiece_bpe.py -> build/lib/tokenizers/implementations
  copying py_src/tokenizers/__init__.pyi -> build/lib/tokenizers
  copying py_src/tokenizers/models/__init__.pyi -> build/lib/tokenizers/models
  copying py_src/tokenizers/decoders/__init__.pyi -> build/lib/tokenizers/decoders
  copying py_src/tokenizers/normalizers/__init__.pyi -> build/lib/tokenizers/normalizers
  copying py_src/tokenizers/pre_tokenizers/__init__.pyi -> build/lib/tokenizers/pre_tokenizers
  copying py_src/tokenizers/processors/__init__.pyi -> build/lib/tokenizers/processors
  copying py_src/tokenizers/trainers/__init__.pyi -> build/lib/tokenizers/trainers
  running build_ext
  running build_rust
  error: can't find Rust compiler
  
  If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.
  
  To update pip, run:
  
      pip install --upgrade pip
  
  and then retry package installation.
  
  If you did intend to build this package from source, try installing a Rust compiler from your system package manager and ensure it is on the PATH during installation. Alternatively, rustup (available at https://rustup.rs) is the recommended way to download and update the Rust compiler toolchain.
  ----------------------------------------
  ERROR: Failed building wheel for tokenizers
ERROR: Could not build wheels for tokenizers which use PEP 517 and cannot be installed directly

It seems like you cannot build the tokenizers library from Hugging Face because the Rust compiler is missing. Unfortunately, I cannot help with that. Things that I would try are:

  • pip install --upgrade transformer-srl==2.4.6

  • try to install tokenizers before installing transformer-srl, using the version that is compatible with transformers 3.1

Tried a few things:

  1. used conda install transformers==3.1 in the environment, then pip install transformer-srl==2.4.6
    • same result as the original:
RuntimeError: Error loading state dict for TransformerSrlSpan
        Missing keys: ['transformer.embeddings.position_ids']
        Unexpected keys: []
  2. used pip install --upgrade transformer-srl==2.4.6

    • No dice; same result as when I tried pip install transformer-srl==2.4.6
  3. used pip install tokenizers==0.10.1, then pip install --upgrade transformer-srl==2.4.6

    • Strangely, same as above. Despite successfully pip-installing tokenizers, the transformer-srl install still hung at tokenizers.
    • Not sure if it remains worth the time to keep bug hunting -- I've found another SRL workaround for now, so this is not urgent for me. Still, I thought you would want to know, in case this means the requirements.txt or other dependency-management setup needs to be updated.

In any case, thank you for your help!

Thank you for the report. I will try it in a clean environment as soon as possible and report my findings.

OK, so this is what I tried.

  1. Create a clean environment with conda
conda create -n srl-test python=3.6
conda activate srl-test
  2. Install transformer-srl==2.4.6
pip install transformer-srl==2.4.6
  3. Download the model from the README
  4. Run a test prediction
echo '{"sentence": "Did Uriah honestly think he could beat the game in under three hours?"}' | \
allennlp predict path/to/srl_bert_base_conll2012.tar.gz - --include-package transformer_srl

This workflow worked. Here are the dependencies that I have in the new environment:

#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main
allennlp                  1.2.2                    pypi_0    pypi
allennlp-models           1.2.2                    pypi_0    pypi
attrs                     20.3.0                   pypi_0    pypi
blis                      0.7.4                    pypi_0    pypi
boto3                     1.17.40                  pypi_0    pypi
botocore                  1.20.40                  pypi_0    pypi
ca-certificates           2021.1.19            h06a4308_1
cached-property           1.5.2                    pypi_0    pypi
catalogue                 1.0.0                    pypi_0    pypi
certifi                   2020.12.5        py36h06a4308_0
chardet                   4.0.0                    pypi_0    pypi
click                     7.1.2                    pypi_0    pypi
conllu                    4.2.1                    pypi_0    pypi
cymem                     2.0.5                    pypi_0    pypi
dataclasses               0.8                      pypi_0    pypi
filelock                  3.0.12                   pypi_0    pypi
ftfy                      5.9                      pypi_0    pypi
h5py                      3.1.0                    pypi_0    pypi
idna                      2.10                     pypi_0    pypi
importlib-metadata        3.10.0                   pypi_0    pypi
iniconfig                 1.1.1                    pypi_0    pypi
jmespath                  0.10.0                   pypi_0    pypi
joblib                    1.0.1                    pypi_0    pypi
jsonnet                   0.17.0                   pypi_0    pypi
jsonpickle                2.0.0                    pypi_0    pypi
ld_impl_linux-64          2.33.1               h53a641e_7
libffi                    3.3                  he6710b0_2
libgcc-ng                 9.1.0                hdf63c60_0
libstdcxx-ng              9.1.0                hdf63c60_0
murmurhash                1.0.5                    pypi_0    pypi
ncurses                   6.2                  he6710b0_1
nltk                      3.5                      pypi_0    pypi
numpy                     1.19.5                   pypi_0    pypi
openssl                   1.1.1k               h27cfd23_0
overrides                 3.1.0                    pypi_0    pypi
packaging                 20.9                     pypi_0    pypi
pip                       21.0.1           py36h06a4308_0
plac                      1.1.3                    pypi_0    pypi
pluggy                    0.13.1                   pypi_0    pypi
preshed                   3.0.5                    pypi_0    pypi
protobuf                  3.15.6                   pypi_0    pypi
py                        1.10.0                   pypi_0    pypi
py-rouge                  1.1                      pypi_0    pypi
pyparsing                 2.4.7                    pypi_0    pypi
pytest                    6.2.2                    pypi_0    pypi
python                    3.6.13               hdb3f193_0
python-dateutil           2.8.1                    pypi_0    pypi
readline                  8.1                  h27cfd23_0
regex                     2021.3.17                pypi_0    pypi
requests                  2.25.1                   pypi_0    pypi
s3transfer                0.3.6                    pypi_0    pypi
sacremoses                0.0.43                   pypi_0    pypi
scikit-learn              0.24.1                   pypi_0    pypi
scipy                     1.5.4                    pypi_0    pypi
sentencepiece             0.1.91                   pypi_0    pypi
setuptools                52.0.0           py36h06a4308_0
six                       1.15.0                   pypi_0    pypi
spacy                     2.3.5                    pypi_0    pypi
sqlite                    3.35.2               hdfb4753_0
srsly                     1.0.5                    pypi_0    pypi
tensorboardx              2.1                      pypi_0    pypi
thinc                     7.4.5                    pypi_0    pypi
threadpoolctl             2.1.0                    pypi_0    pypi
tk                        8.6.10               hbc83047_0
tokenizers                0.9.3                    pypi_0    pypi
toml                      0.10.2                   pypi_0    pypi
torch                     1.7.1                    pypi_0    pypi
tqdm                      4.59.0                   pypi_0    pypi
transformer-srl           2.4.6                    pypi_0    pypi
transformers              3.5.1                    pypi_0    pypi
typing-extensions         3.7.4.3                  pypi_0    pypi
urllib3                   1.26.4                   pypi_0    pypi
wasabi                    0.8.2                    pypi_0    pypi
wcwidth                   0.2.5                    pypi_0    pypi
wheel                     0.36.2             pyhd3eb1b0_0
word2number               1.1                      pypi_0    pypi
xz                        5.2.5                h7b6447c_0
zipp                      3.4.1                    pypi_0    pypi
zlib                      1.2.11               h7b6447c_3
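
As an aside, the same test can presumably also be run from Python instead of the allennlp predict CLI. A minimal sketch, assuming the predictor exposes the predict(sentence=...) method it inherits from AllenNLP's SRL predictor:

from transformer_srl import predictors

# load the pre-trained archive downloaded from the README
# (replace the path with wherever the archive was saved)
predictor = predictors.SrlTransformersPredictor.from_path(
    "path/to/srl_bert_base_conll2012.tar.gz", "transformer-srl"
)

# same test sentence as in the CLI command above
result = predictor.predict(
    sentence="Did Uriah honestly think he could beat the game in under three hours?"
)
print(result)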