Is there a pre-trained model that one can run?
logicReasoner opened this issue · comments
@Riccorl Thanks for your continued effort on SRL on top of BERT!
I've managed to run the SRL model as described in https://demo.allennlp.org/semantic-role-labeling
How can one run your model locally? Can it run on CPU only?
Hi!
Unfortunately, I don't have a pretrained model with the PropBank inventory. You have to train it :(
However, training should be easy. It can run on CPU, yes, but it's really slow. You can use Colab to train it with a GPU for free. To run it, clone this repo and run
```shell
export SRL_TRAIN_DATA_PATH="path/to/train"
export SRL_VALIDATION_DATA_PATH="path/to/development"
allennlp train training_config/bert_base_span.jsonnet -s path/to/model --include-package transformer_srl
```
where `training_config/bert_base_span.jsonnet` is the config file that I usually use.
@Riccorl
Thanks for your quick reply!
I see. Can you provide some sample input / output pairs generated by this project so that I can see if the format is suitable for my needs?
BTW, where can I download the Prop Bank inventory that you mentioned?
The output is a dictionary that contains the following keys. For instance, given the sentence

The keys, which were needed to access the building, were locked in the car.

the output will be:

```
"verb": needed  # the predicate token
"description": [ARG1: The keys] , [R-ARG1: which] were [V: needed] [ARGM-PRP: to access the building] , were locked in the car .  # the sentence with the predicate and args annotated
"tags": [B-ARG1, I-ARG1, O, O ...]  # list of argument tags
"frame": need.01  # the predicate label
```
This is the piece of code that produces the output.
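To make the "tags" field concrete, here is a small self-contained sketch (illustrative, not code from this repo) that groups a BIO tag sequence into labeled argument spans like the ones shown in "description":

```python
def bio_to_spans(tokens, tags):
    """Group a BIO tag sequence into (label, text) argument spans."""
    spans, current_label, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag starts a new span, closing any span still open.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag continues the span opened by the matching B- tag.
            current_tokens.append(token)
        else:
            # "O" (or an inconsistent I- tag) closes the open span.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans

tokens = ["The", "keys", ",", "which", "were", "needed"]
tags = ["B-ARG1", "I-ARG1", "O", "B-R-ARG1", "O", "B-V"]
print(bio_to_spans(tokens, tags))
# → [('ARG1', 'The keys'), ('R-ARG1', 'which'), ('V', 'needed')]
```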
BTW, where can I download the Prop Bank inventory that you mentioned?
You need the CoNLL-2012 dataset, which is not free, so I cannot distribute it. You can find more information here.
Yeah, I will try to train it asap and publish it :)
@Riccorl you're awesome! :)
I uploaded a model here. It's based on BERT base (not large); the F1 scores are 86 and 95.5, respectively, for argument and predicate disambiguation/identification (on the dev set).
@Riccorl so I eagerly tried your model srl_bert_base_conll2012.tar.gz as a drop-in replacement for AllenNLP's bert-base-srl-2020.03.24.tar.gz:
```python
from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path("/home/user/srl_bert_base_conll2012.tar.gz")
predictor.predict(
    sentence="Did Uriah honestly think he could beat the game in under three hours?"
)
```
but I got the following error while loading it as a predictor in Python 3.6:
I0917 06:58:13.354965 139747665430336 archival.py:164] loading archive file ~/srl_bert_base_conll2012.tar.gz from cache at /home/user/srl_bert_base_conll2012.tar.gz
I0917 06:58:13.356019 139747665430336 archival.py:171] extracting archive file /home/user/srl_bert_base_conll2012.tar.gz to temp dir /tmp/tmpcmoyc42s
Traceback (most recent call last):
File "/home/user/.local/bin/project", line 10, in <module>
sys.exit(main())
File "/home/user/.local/lib/python3.6/site-packages/project/__main__.py", line 76, in main
cmdline_arguments.func(cmdline_arguments)
File "/home/user/.local/lib/python3.6/site-packages/project/cli/run.py", line 88, in run
project.run(**vars(args))
File "/home/user/.local/lib/python3.6/site-packages/project/run.py", line 33, in run
import project.core.run
File "/home/user/.local/lib/python3.6/site-packages/project/core/run.py", line 22, in <module>
from project.server import add_root_route
File "/home/user/.local/lib/python3.6/site-packages/project/server.py", line 70, in <module>
srlPredictor = Predictor.from_path("~/srl_bert_base_conll2012.tar.gz")
File "/home/user/.local/lib/python3.6/site-packages/allennlp/predictors/predictor.py", line 275, in from_path
load_archive(archive_path, cuda_device=cuda_device),
File "/home/user/.local/lib/python3.6/site-packages/allennlp/models/archival.py", line 197, in load_archive
opt_level=opt_level,
File "/home/user/.local/lib/python3.6/site-packages/allennlp/models/model.py", line 391, in load
model_class: Type[Model] = cls.by_name(model_type) # type: ignore
File "/home/user/.local/lib/python3.6/site-packages/allennlp/common/registrable.py", line 137, in by_name
subclass, constructor = cls.resolve_class_name(name)
File "/home/user/.local/lib/python3.6/site-packages/allennlp/common/registrable.py", line 185, in resolve_class_name
f"{name} is not a registered name for {cls.__name__}. "
allennlp.common.checks.ConfigurationError: transformer_srl_span is not a registered name for Model. You probably need to use the --include-package flag to load your custom code. Alternatively, you can specify your choices using fully-qualified paths, e.g. {"model": "my_module.models.MyModel"} in which case they will be automatically imported correctly.
I0917 06:58:16.335596 139747665430336 archival.py:205] removing temporary unarchived model dir at /tmp/tmpcmoyc42s
It has something to do with "transformer_srl_span is not a registered name for Model", so I guess I am missing some specific configuration setting?
You should import models, dataset_readers, and predictors from transformer_srl even if you don't explicitly use them. It's equivalent to adding --include-package from the CLI.
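A minimal sketch of why the bare import fixes it (a simplified stand-in for AllenNLP's Registrable mechanism, not its actual code): registration happens as a side effect of executing the module that defines the class, so the registry stays empty until something imports transformer_srl's modules.

```python
# Simplified stand-in for AllenNLP's name -> class registry.
MODEL_REGISTRY = {}

def register_model(name):
    """Decorator that records a class under `name` when its module runs."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator

# In the real library, this decoration runs inside transformer_srl's modules,
# which is why they must be imported before Predictor.from_path().
@register_model("transformer_srl_span")
class TransformerSrlSpan:
    pass

def by_name(name):
    """Mimics Registrable.by_name: fails if the defining module never ran."""
    if name not in MODEL_REGISTRY:
        raise KeyError(f"{name} is not a registered name for Model")
    return MODEL_REGISTRY[name]

print(by_name("transformer_srl_span").__name__)
# → TransformerSrlSpan
```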
Thanks to your tip, I've managed to resolve that particular error. Now Python cannot find the bert-base-cased model, due to the firewall configuration, which does not allow remote connections. I've manually downloaded bert-base-cased-pytorch_model.bin and bert-base-cased-pytorch_model.json and managed to run the model.
There are some warnings about some missing bert-base-cased related files:

2020-09-17 12:26:49 INFO transformers.tokenization_utils - Didn't find file /home/user/bert-base-cased/added_tokens.json. We won't load it.
I0917 12:26:49.964895 139946394539840 tokenization_utils.py:965] Didn't find file /home/user/bert-base-cased/special_tokens_map.json. We won't load it.
I0917 12:26:49.964944 139946394539840 tokenization_utils.py:965] Didn't find file /home/user/bert-base-cased/tokenizer_config.json. We won't load it.
Are those needed?
I don't know for sure, because the model uses the default configs from Hugging Face. I guess it doesn't matter (?) since they are only warnings.
@Riccorl After having played with the model for a bit, it seems to do a really decent job. So what are the next steps in raising the accuracy bar even higher? Would using bert-large-cased make a significant difference, or perhaps switching to another of the Hugging Face models?
Thanks and keep up the good work!
bert-large-cased can indeed improve the results, but the improvements are marginal (as you can see in the paper here, on page 5). I guess that switching models has a better chance of improving the results. As of now, only models that accept 1 as token_type_id work; I tried to generalize further, but it didn't work. I hope I can make it work really soon, because I really need it 🤣
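A rough sketch of the token_type_id constraint (my reading of the thread, not this repo's exact preprocessing): the predicate position is signaled to the encoder through the segment ids, so the model's embeddings must accept a token_type_id of 1, which some architectures (e.g. RoBERTa-style models with a single segment embedding) do not.

```python
def build_token_type_ids(tokens, predicate_index):
    """Mark the predicate token with segment id 1, everything else 0.

    Hypothetical illustration of the idea: encoders whose segment
    embeddings only know ids 0 and 1 can receive the predicate position
    this way; encoders without a usable token_type_id of 1 cannot.
    """
    return [1 if i == predicate_index else 0 for i in range(len(tokens))]

tokens = ["Did", "Uriah", "honestly", "think", "he", "could", "beat", "the", "game"]
print(build_token_type_ids(tokens, tokens.index("think")))
# → [0, 0, 0, 1, 0, 0, 0, 0, 0]
```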
Don't worry! You'll make it work eventually! We have faith in you ;)
Hello, I tried the ideas from above and got the following errors. Do you know why they might occur and how to solve them? Thank you so much!

```python
from transformer_srl import dataset_readers, models, predictors
from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path("data/srl_bert_base_conll2012.tar.gz")
predictor.predict(
    sentence="Did Uriah honestly think he could beat the game in under three hours?"
)
```
File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/archival.py", line 208, in load_archive
model = _load_model(config.duplicate(), weights_path, serialization_dir, cuda_device)
File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/archival.py", line 246, in _load_model
cuda_device=cuda_device,
File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/model.py", line 406, in load
return model_class._load(config, serialization_dir, weights_file, cuda_device)
File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/allennlp/models/model.py", line 326, in _load
missing_keys, unexpected_keys = model.load_state_dict(model_state, strict=False)
File "/home/user/.pyenv/versions/actions/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1045, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for TransformerSrlSpan:
size mismatch for frame_projection_layer.weight: copying a param with shape torch.Size([5497, 768]) from checkpoint, the shape in current model is torch.Size([5929, 768]).
size mismatch for frame_projection_layer.bias: copying a param with shape torch.Size([5497]) from checkpoint, the shape in current model is torch.Size([5929]).
Yeah, there is a piece of code in 2.4 that breaks that model. If you try pip install transformer-srl==2.3.1, it should work. Let me know!
@OanaIgnat I uploaded a new version of the pretrained model, compatible with 2.4.4. It should fix your problem. Let me know!