dsindex / syntaxnet

reference code for syntaxnet

How to use the CoNLL 2017 baselines?

zhou-zh opened this issue · comments

Thanks for your great work! I saw your reply on Stack Overflow, so I know you have built your own system. I have two questions about it:

  1. You trained your model on English, and I have also trained it once. The official CoNLL 2017 baselines provide models for different languages, but I don't know which entries in the script to modify in order to train models for other languages.
  2. Your eval script works well, but I can't find the baseline_eval.py that their README mentions. Do you know where it is?

I am sorry that my questions may not be directly related to your models, but they are really important for me. If you know the answers, please tell me. Thank you very much.

hello~

  1. I understand that you want to train a model for another language. If so, you can check this issue:
  • #21 (comment)
  • To train, you basically need to download the UD corpora first.
  • After downloading, modify train_dragnn.sh for your language and run the script. The English defaults look like this (an adapted example for another language follows this list):

```sh
SRC_CORPUS_DIR=${CDIR}/UD_English
TRAIN_FILE=${DATA_DIR}/en-ud-train.conllu.conv
DEV_FILE=${DATA_DIR}/en-ud-dev.conllu.conv
```

  2. You can check tensorflow/models#1211 (comment).
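To adapt the snippet above to another treebank, only the corpus paths should need to change. A minimal sketch for UD Chinese, assuming the treebank follows the standard UD v2 naming convention (zh-ud-*.conllu) and that train_dragnn.sh applies the same .conv conversion suffix:

```sh
# hypothetical adaptation of train_dragnn.sh for UD Chinese;
# directory and file names are assumptions based on the UD v2 layout
SRC_CORPUS_DIR=${CDIR}/UD_Chinese
TRAIN_FILE=${DATA_DIR}/zh-ud-train.conllu.conv
DEV_FILE=${DATA_DIR}/zh-ud-dev.conllu.conv
```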

@dsindex, thanks for your reply!
If I want to train on a different language, should I just modify the path to point at the dataset for that language?
Do we not need to use the different per-language models provided by the CoNLL 2017 baselines guide?
I thought that models for different languages have different word maps.

@continuesmile
yes~
place a corpus at that path and modify the script to train your own model.
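If it helps, a tiny sanity check along these lines (purely a sketch, reusing the variable names from the snippet above) can catch a wrong corpus path before a long training run:

```sh
# illustrative only: verify the converted corpus files exist before training
for f in "${TRAIN_FILE}" "${DEV_FILE}"; do
  if [ ! -f "$f" ]; then
    echo "missing corpus file: $f" >&2
    exit 1
  fi
done
```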

The models provided by the CoNLL 2017 baselines guide were trained with the tools at https://github.com/tensorflow/models/tree/master/syntaxnet/dragnn/tools

Those scripts are the original ones; mine is a modified version for convenience.

Hi @dsindex,

Should I train the segmentation model myself?
I trained the model with the UD Chinese corpus, but the UAS and LAS are only 68.36% and 58.96%, which is much worse than the baseline. Do you have any hints?

Thanks again
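As a side note on reproducing UAS/LAS figures like those above: the CoNLL 2017 shared task also publishes a standalone scorer, conll17_ud_eval.py, which compares a system's CoNLL-U output against the gold file. A minimal sketch (both file names here are placeholders):

```sh
# score parser output against gold with the official CoNLL 2017 scorer;
# the -v flag prints the full metrics table, including UAS and LAS rows
python conll17_ud_eval.py -v zh-ud-dev.conllu system-output.conllu
```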