Is there a pre-trained model that one can run?
logicReasoner opened this issue · comments
@Riccorl Thanks for your continued effort on SRL on top of BERT!
I've managed to run the SRL model as described in https://demo.allennlp.org/semantic-role-labeling
How can one run your model locally? Can it run on CPU only?
Hi!
Unfortunately, I don't have a pretrained model with the PropBank inventory. You have to train it :(
However, training should be easy. It can run on CPU, yes, but it's really slow. You can use Colab to train it with a GPU for free. To run it, clone this repo and run
```shell
export SRL_TRAIN_DATA_PATH="path/to/train"
export SRL_VALIDATION_DATA_PATH="path/to/development"
allennlp train training_config/bert_base_span.jsonnet -s path/to/model --include-package transformer_srl
```
where `training_config/bert_base_span.jsonnet` is the config file that I usually use.
@Riccorl
Thanks for your quick reply!
I see. Can you provide some sample input / output pairs generated by this project so that I can see if the format is suitable for my needs?
BTW, where can I download the Prop Bank inventory that you mentioned?
The output is a dictionary that contains the following keys. For instance, given the sentence

The keys, which were needed to access the building, were locked in the car.

the output will be:

```
"verb": needed  # the predicate token
"description": [ARG1: The keys] , [R-ARG1: which] were [V: needed] [ARGM-PRP: to access the building] , were locked in the car .  # the sentence with the predicate and args annotated
"tags": [B-ARG1, I-ARG1, O, O ...]  # list of argument tags
"frame": need.01  # the predicate label
```
This is the piece of code that produces the output.
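To make the "tags" field concrete, here is a small self-contained sketch (illustrative, not code from this repo) that groups a BIO tag sequence into labeled argument spans like the ones shown in "description":

```python
def bio_to_spans(tokens, tags):
    """Group a BIO tag sequence into (label, text) argument spans."""
    spans, current_label, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag starts a new span, closing any span still open.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag continues the span opened by the matching B- tag.
            current_tokens.append(token)
        else:
            # "O" (or an inconsistent I- tag) closes the open span.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans

tokens = ["The", "keys", ",", "which", "were", "needed"]
tags = ["B-ARG1", "I-ARG1", "O", "B-R-ARG1", "O", "B-V"]
print(bio_to_spans(tokens, tags))
# → [('ARG1', 'The keys'), ('R-ARG1', 'which'), ('V', 'needed')]
```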
BTW, where can I download the Prop Bank inventory that you mentioned?
You need the CoNLL-2012 dataset, which is not free, so I cannot distribute it. You can find more information here.
Yeah, I will try to train it asap and publish it :)
@Riccorl you're awesome! :)
I uploaded a model here. It's based on BERT base (not large); the F1 scores are 86 and 95.5, respectively, for argument and predicate disambiguation/identification (on the dev set).
@Riccorl so I eagerly tried your model srl_bert_base_conll2012.tar.gz as a drop-in replacement for AllenNLP's bert-base-srl-2020.03.24.tar.gz:
```python
from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path("/home/user/srl_bert_base_conll2012.tar.gz")
predictor.predict(
    sentence="Did Uriah honestly think he could beat the game in under three hours?"
)
```
but I got the following error while loading it as a predictor in Python 3.6:
I0917 06:58:13.354965 139747665430336 archival.py:164] loading archive file ~/srl_bert_base_conll2012.tar.gz from cache at /home/user/srl_bert_base_conll2012.tar.gz
I0917 06:58:13.356019 139747665430336 archival.py:171] extracting archive file /home/user/srl_bert_base_conll2012.tar.gz to temp dir /tmp/tmpcmoyc42s
Traceback (most recent call last):
File "/home/user/.local/bin/project", line 10, in <module>
sys.exit(main())
File "/home/user/.local/lib/python3.6/site-packages/project/__main__.py", line 76, in main
cmdline_arguments.func(cmdline_arguments)
File "/home/user/.local/lib/python3.6/site-packages/project/cli/run.py", line 88, in run
project.run(**vars(args))
File "/home/user/.local/lib/python3.6/site-packages/project/run.py", line 33, in run
import project.core.run
File "/home/user/.local/lib/python3.6/site-packages/project/core/run.py", line 22, in <module>
from project.server import add_root_route
File "/home/user/.local/lib/python3.6/site-packages/project/server.py", line 70, in <module>
srlPredictor = Predictor.from_path("~/srl_bert_base_conll2012.tar.gz")
File "/home/user/.local/lib/python3.6/site-packages/allennlp/predictors/predictor.py", line 275, in from_path
load_archive(archive_path, cuda_device=cuda_device),
File "/home/user/.local/lib/python3.6/site-packages/allennlp/models/archival.py", line 197, in load_archive
opt_level=opt_level,
File "/home/user/.local/lib/python3.6/site-packages/allennlp/models/model.py", line 391, in load
model_class: Type[Model] = cls.by_name(model_type) # type: ignore
File "/home/user/.local/lib/python3.6/site-packages/allennlp/common/registrable.py", line 137, in by_name
subclass, constructor = cls.resolve_class_name(name)
File "/home/user/.local/lib/python3.6/site-packages/allennlp/common/registrable.py", line 185, in resolve_class_name
f"{name} is not a registered name for {cls.__name__}. "
allennlp.common.checks.ConfigurationError: transformer_srl_span is not a registered name for Model. You probably need to use the --include-package flag to load your custom code. Alternatively, you can specify your choices using fully-qualified paths, e.g. {"model": "my_module.models.MyModel"} in which case they will be automatically imported correctly.
I0917 06:58:16.335596 139747665430336 archival.py:205] removing temporary unarchived model dir at /tmp/tmpcmoyc42s
It has something to do with "transformer_srl_span is not a registered name for Model", so I guess I am missing some specific configuration setting?
You should import models, dataset_readers, and predictors from transformer_srl even if you don't explicitly use them. It's equivalent to adding --include-package from the CLI.
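A minimal sketch of why the bare import fixes it (a simplified stand-in for AllenNLP's Registrable mechanism, not its actual code): registration happens as a side effect of executing the module that defines the class, so the registry stays empty until something imports transformer_srl's modules.

```python
# Simplified stand-in for AllenNLP's name -> class registry.
MODEL_REGISTRY = {}

def register_model(name):
    """Decorator that records a class under `name` when its module runs."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator

# In the real library, this decoration runs inside transformer_srl's modules,
# which is why they must be imported before Predictor.from_path().
@register_model("transformer_srl_span")
class TransformerSrlSpan:
    pass

def by_name(name):
    """Mimics Registrable.by_name: fails if the defining module never ran."""
    if name not in MODEL_REGISTRY:
        raise KeyError(f"{name} is not a registered name for Model")
    return MODEL_REGISTRY[name]

print(by_name("transformer_srl_span").__name__)
# → TransformerSrlSpan
```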
Thanks to your tip, I've managed to resolve that particular error. Now Python cannot find the bert-base-cased model, due to the firewall configuration, which does not allow remote connections. I've manually downloaded bert-base-cased-pytorch_model.bin and bert-base-cased-pytorch_model.json and managed to run the model.
There are some warnings about some missing bert-base-cased related files:

2020-09-17 12:26:49 INFO transformers.tokenization_utils - Didn't find file /home/user/bert-base-cased/added_tokens.json. We won't load it.
I0917 12:26:49.964895 139946394539840 tokenization_utils.py:965] Didn't find file /home/user/bert-base-cased/special_tokens_map.json. We won't load it.
I0917 12:26:49.964944 139946394539840 tokenization_utils.py:965] Didn't find file /home/user/bert-base-cased/tokenizer_config.json. We won't load it.
Are those needed?
I don't know for sure, because the model uses the default configs from Hugging Face. I guess it doesn't matter (?) since they are only warnings.
@Riccorl After having played with the model for a bit, it seems to do a really decent job. So what are the next steps in raising the accuracy bar even higher? Would using bert-large-cased make a significant difference, or perhaps switching to another of the Hugging Face models?
Thanks and keep up the good work!
bert-large-cased can indeed improve the results, but the improvements are marginal (as you can see in the paper here, on page 5). I guess that switching models has a better chance of improving the results. As of now, only models that accept 1 as token_type_id work; I tried to generalize further, but it didn't work. I hope I can make it work really soon, because I really need it 🤣
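A rough sketch of the token_type_id constraint (my reading of the thread, not this repo's exact preprocessing): the predicate position is signaled to the encoder through the segment ids, so the model's embeddings must accept a token_type_id of 1, which some architectures (e.g. RoBERTa-style models with a single segment embedding) do not.

```python
def build_token_type_ids(tokens, predicate_index):
    """Mark the predicate token with segment id 1, everything else 0.

    Hypothetical illustration of the idea: encoders whose segment
    embeddings only know ids 0 and 1 can receive the predicate position
    this way; encoders without a usable token_type_id of 1 cannot.
    """
    return [1 if i == predicate_index else 0 for i in range(len(tokens))]

tokens = ["Did", "Uriah", "honestly", "think", "he", "could", "beat", "the", "game"]
print(build_token_type_ids(tokens, tokens.index("think")))
# → [0, 0, 0, 1, 0, 0, 0, 0, 0]
```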
Don't worry! You'll make it work eventually! We have faith in you ;)
Hello, I tried the ideas from above and got the following errors. Do you know why they might occur and how to solve them? Thank you so much!

```python
from transformer_srl import dataset_readers, models, predictors
from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path("data/srl_bert_base_conll2012.tar.gz")
predictor.predict(
    sentence="Did Uriah honestly think he could beat the game in under three hours?"
)
```
File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/archival.py", line 208, in load_archive
model = _load_model(config.duplicate(), weights_path, serialization_dir, cuda_device)
File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/archival.py", line 246, in _load_model
cuda_device=cuda_device,
File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/model.py", line 406, in load
return model_class._load(config, serialization_dir, weights_file, cuda_device)
File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/model.py", line 326, in _load
missing_keys, unexpected_keys = model.load_state_dict(model_state, strict=False)
File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1045, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for TransformerSrlSpan:
size mismatch for frame_projection_layer.weight: copying a param with shape torch.Size([5497, 768]) from checkpoint, the shape in current model is torch.Size([5929, 768]).
size mismatch for frame_projection_layer.bias: copying a param with shape torch.Size([5497]) from checkpoint, the shape in current model is torch.Size([5929]).
Yeah, there is a piece of code in 2.4 that breaks that model. If you try pip install transformer-srl==2.3.1, it should work. Let me know!
@OanaIgnat I uploaded a new version of the pretrained model, compatible with 2.4.4. It should fix your problem. Let me know!