Cannot train POS on another corpus ...
elduba opened this issue · comments
Hi!
First of all, I would like to thank you for this great tool and the convert script you provide; it works great.
But I am facing some issues with the French corpus.
Could you please correct or complete my understanding of the configuration steps required for training on another corpus?
- Create a new folder in work (in my example UD_French) with 3 files: *-ud-dev.conllu / *-ud-test.conllu / *-ud-train.conllu
- Add context.pbtxt, update the file location values, and set record-format to "french-text"
- Update train.sh with the correct file location values
- Run train.sh
Then I am stuck with this error:
File "/home/baduel/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/external/tf/tensorflow/python/client/session.py", line 673, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: indices[0] = -1 is not in [0, 1)
[[Node: training/embedding_lookup_4 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@training/Diag"], validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](training/Diag, training/gold_actions)]]
Caused by op u'training/embedding_lookup_4', defined at:
File "/home/baduel/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/syntaxnet/parser_trainer.py", line 303, in <module>
tf.app.run()
Any help would be life saving :)
Best regards
Edulba
@elduba
I noticed that there is no XPOS in the UD_French corpus (the fifth column is "_"):
1 Les le DET _ Definite=Def|Gender=Fem|Number=Plur 2 det _ _
2 commotions commotion NOUN _ Gender=Fem|Number=Plur 5 nsubj _ _
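A quick way to confirm this on a whole file is to count the token lines whose XPOS column is empty. The following is only a minimal sketch: `count_missing_xpos` is a hypothetical helper that is not part of SyntaxNet or convert.py, and it assumes standard tab-separated CoNLL-U with ten columns per token line.

```python
def count_missing_xpos(lines):
    """Return (tokens_without_xpos, total_tokens) for CoNLL-U lines.

    Hypothetical helper for illustration; assumes ten tab-separated columns.
    """
    missing = 0
    total = 0
    for line in lines:
        line = line.rstrip("\n")
        # Skip sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed lines, multiword tokens (e.g. "1-2") and empty nodes (e.g. "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        total += 1
        if cols[4] == "_":  # column 5 (index 4) is XPOS
            missing += 1
    return missing, total


# The two tokens quoted above, with tabs restored.
sample = [
    "1\tLes\tle\tDET\t_\tDefinite=Def|Gender=Fem|Number=Plur\t2\tdet\t_\t_",
    "2\tcommotions\tcommotion\tNOUN\t_\tGender=Fem|Number=Plur\t5\tnsubj\t_\t_",
]
print(count_missing_xpos(sample))
```

To run it on a real file, pass an open file object, for example `count_missing_xpos(open(path, encoding="utf-8"))`, where `path` points to one of your .conllu files.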
So I modified the convert.py script:
if tokens[4] == '_':
    tokens[4] = tokens[3]  # there is no XPOS, so copy UPOS into XPOS
else:
    tokens[3] = tokens[4]  # UPOS <- XPOS
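For anyone who wants to try the same fallback outside convert.py, here is a self-contained sketch. The rest of convert.py is not shown in this thread, so `normalize_pos` and `normalize_line` are hypothetical names, and the code assumes `tokens` is the list of ten tab-separated CoNLL-U fields for one token.

```python
def normalize_pos(tokens):
    """Apply the UPOS/XPOS fallback described above to one token's fields.

    Hypothetical helper; `tokens` is assumed to hold the ten CoNLL-U columns.
    """
    tokens = list(tokens)  # work on a copy so the caller's list is untouched
    if tokens[4] == "_":
        tokens[4] = tokens[3]  # no XPOS: copy UPOS into the XPOS column
    else:
        tokens[3] = tokens[4]  # XPOS present: overwrite UPOS with it
    return tokens


def normalize_line(line):
    """Rewrite one CoNLL-U line; non-token lines are returned unchanged."""
    stripped = line.rstrip("\n")
    cols = stripped.split("\t")
    if stripped.startswith("#") or len(cols) != 10:
        return stripped
    return "\t".join(normalize_pos(cols))


line = "1\tLes\tle\tDET\t_\tDefinite=Def|Gender=Fem|Number=Plur\t2\tdet\t_\t_"
fixed = normalize_line(line)
print(fixed.split("\t")[3:5])
```

After this step both POS columns carry the same tag, so no token reaches training with an empty XPOS field.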
With that change, training works:
...
I syntaxnet/reader_ops.cc:141] Starting epoch 1
INFO:tensorflow:Epochs: 1, num steps: 100, seconds elapsed: 1.44, avg cost: 2.20,
INFO:tensorflow:Epochs: 1, num steps: 200, seconds elapsed: 2.10, avg cost: 1.40,
INFO:tensorflow:Epochs: 1, num steps: 300, seconds elapsed: 2.76, avg cost: 0.99,
INFO:tensorflow:Epochs: 1, num steps: 400, seconds elapsed: 3.42, avg cost: 0.78,
INFO:tensorflow:Epochs: 1, num steps: 500, seconds elapsed: 4.07, avg cost: 0.69,
INFO:tensorflow:Epochs: 1, num steps: 600, seconds elapsed: 4.74, avg cost: 0.63,
INFO:tensorflow:Epochs: 1, num steps: 700, seconds elapsed: 5.38, avg cost: 0.54,
....
Thanks! It works very well.
👍