nikitakit / self-attentive-parser

High-accuracy NLP parser with models for 11 languages.

Home Page: https://parser.kitaev.io/


Problem when calling "export_bert.py"

freesunshine0316 opened this issue · comments

Hi,

I ran into several problems when executing this script. I'd appreciate it if anyone could help me out.

First, I saw that "PTB_TOKEN_UNESCAPE" does not exist in "parse_nk.py", so I just commented that reference out.

Second, I observed the following error:

Traceback (most recent call last):
  File "export/export_bert.py", line 406, in <module>
    the_inp_tokens, the_inp_mask, the_out_chart, the_out_tags = make_network()
  File "export/export_bert.py", line 326, in make_network
    ftag = make_ftag(word_out)
  File "export/export_bert.py", line 286, in make_ftag
    tf.constant(sd['f_tag.0.weight'].numpy().transpose()),
KeyError: 'f_tag.0.weight'

Looking into the code, it seems that my model has no named parameters with the "f_tag" prefix, only ones with the "f_label" prefix.
Does this mean that my model requires gold POS tags as input as well?

I did not set "--use-tags" or "--predict-tags" for training.

The export_bert.py script was written with the expectation that you enable --predict-tags during training. The f_tag weights that it can't find are from the POS tag prediction head. It looks like you'll need to adjust either the training or the export script to get things to run.
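A quick way to see which heads a saved checkpoint actually contains is to inspect its parameter names before running the export. A minimal sketch follows; the dictionary here is a stand-in for a real state dict (in practice it would come from loading the saved checkpoint, e.g. with `torch.load`):

```python
# Sketch: decide which prediction heads a checkpoint has by
# inspecting its parameter names. The dict below is a stand-in
# for a real state dict loaded from a checkpoint file.
sd = {
    "f_label.0.weight": "...",  # span-label head (always present)
    "f_label.0.bias": "...",
    # "f_tag.0.weight" / "f_tag.0.bias" would appear here only if
    # the model had been trained with --predict-tags
}

has_tag_head = any(k.startswith("f_tag.") for k in sd)
has_label_head = any(k.startswith("f_label.") for k in sd)
print(has_tag_head, has_label_head)
```

If `has_tag_head` is `False`, the export script will hit exactly the `KeyError: 'f_tag.0.weight'` shown above.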

Hi Kitaev,

Thank you for your reply.
The README says that --predict-tags is only used for an auxiliary loss. Does it also change what the model requires as input? Specifically, do I need to provide POS tags as additional input to my model if I didn't use --predict-tags during training?

I just need a model that works the same way as the released models (e.g. "benepar_zh").

The model inputs don't change if you train with --predict-tags. The training data does, however, need to contain POS tags -- they are used for supervision only, and not as an input.

The way to match a released model is to enable --predict-tags during training.

You can also modify the export script to not do anything related to POS tags, if you don't want to re-train the model.
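That modification could look something like the following sketch, which guards the tag-head construction on whether the checkpoint actually has `f_tag` parameters. This is an untested illustration: `make_ftag` and `word_out` stand in for the corresponding pieces of export_bert.py, and the helper name is hypothetical.

```python
# Sketch: build the POS-tag branch only when the checkpoint
# contains f_tag parameters; otherwise skip it entirely.
# `make_ftag` / `word_out` stand in for the real ones in export_bert.py.
def maybe_make_ftag(sd, word_out, make_ftag):
    if any(k.startswith("f_tag.") for k in sd):
        return make_ftag(word_out)
    return None  # no tag head; downstream code must tolerate None

# Tiny demonstration with stand-in values:
sd_no_tags = {"f_label.0.weight": "..."}
result = maybe_make_ftag(sd_no_tags, "word_out", lambda w: "ftag(" + w + ")")
```

Any code later in the script that consumes the tag outputs would then also need a `None` check.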

Hi Kitaev,

Does that mean the parser performs POS tagging as token-level label prediction even without the --predict-tags switch?
Thanks.

That's right, that option enables token-level POS tag prediction.

Hi @nikitakit,
Thanks for your reply. My previous question wasn't clear.
I've updated it; can you please take another look? That should be my last one.
Thanks again!

I retrained my model with --predict-tags and successfully generated my meta.json, model.pb and vocab.txt.
After packing them into a zip file, loading the zip file with benepar reports the following error:

Traceback (most recent call last):
  File "eval_ctb.py", line 10, in <module>
    parser = benepar.Parser("/data2/lfsong/exp.parsing/servc.chinese/cn_roberta_aux.zip")
  File "/data/home/lfsong/anaconda3/lib/python3.7/site-packages/benepar/nltk_plugin.py", line 36, in __init__
    super(Parser, self).__init__(name, batch_size)
  File "/data/home/lfsong/anaconda3/lib/python3.7/site-packages/benepar/base_parser.py", line 199, in __init__
    graph_def = tf.GraphDef.FromString(model)
google.protobuf.message.DecodeError: Error parsing message
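One thing worth checking before digging further (an assumption on my part, based on the layout of the released models) is that meta.json, model.pb, and vocab.txt sit at the top level of the zip rather than inside a subdirectory, since base_parser.py reads the model bytes out of the archive. A small self-contained sketch of that sanity check:

```python
import io
import zipfile

# Sketch: verify the archive layout before handing it to benepar.
# Assumption: the packaged model should keep meta.json, model.pb and
# vocab.txt at the top level of the zip, with no subdirectory prefix.
EXPECTED = {"meta.json", "model.pb", "vocab.txt"}

def zip_layout_ok(data: bytes) -> bool:
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return EXPECTED.issubset(set(zf.namelist()))

# Build a tiny in-memory zip to demonstrate the check:
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name in EXPECTED:
        zf.writestr(name, b"placeholder")
ok = zip_layout_ok(buf.getvalue())
```

If the files were zipped inside a folder (e.g. `cn_roberta_aux/model.pb`), the check fails and the loader would read the wrong bytes.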

Hi @nikitakit

Can you release an export script that does not assume ELMo is used?
The current scripts assume either ELMo or BERT.
Many thanks!

As of benepar v0.2.0a0, there is no more exporting to TensorFlow; you can use PyTorch checkpoints directly. The original exporting code was written as a one-off because I didn't anticipate adding new models, or the explosion in pre-training approaches we've seen over the past few years.