CoNLL 2018 Shared Task participation

Question

dumitrescustefan opened this issue 6 years ago

We should participate in this year's shared task: Multilingual Parsing from Raw Text to Universal Dependencies (http://universaldependencies.org/conll18/)

This means we need to prepare the code & scripts for this task.

As far as I can see right now, we have three distinct cases:

end-to-end decoding: we run every step ourselves, starting from raw text
we start from UDPipe tokenization and perform tagging, parsing, and lemmatization
we start from UDPipe tokenization, tagging, and lemmatization, and perform only parsing
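A minimal sketch of how the three cases above could be expressed as configuration. The names `Mode`, `PRECOMPUTED`, and `stages_to_run` are hypothetical and not part of the current code base; the only assumption is that a case is fully described by which stages are taken from UDPipe output.

```python
from enum import Enum

# Canonical order of the four stages mentioned in this issue.
ALL_STAGES = ("tokenization", "tagging", "lemmatization", "parsing")


class Mode(Enum):
    """Hypothetical names for the three cases listed above."""
    END_TO_END = "end_to_end"
    UDPIPE_TOKENS = "udpipe_tokens"
    UDPIPE_TOKENS_TAGS_LEMMAS = "udpipe_tokens_tags_lemmas"


# Stages whose output is taken from UDPipe rather than computed by us.
PRECOMPUTED = {
    Mode.END_TO_END: (),
    Mode.UDPIPE_TOKENS: ("tokenization",),
    Mode.UDPIPE_TOKENS_TAGS_LEMMAS: ("tokenization", "tagging", "lemmatization"),
}


def stages_to_run(mode):
    """Return the stages we still have to run ourselves for a given case."""
    done = set(PRECOMPUTED[mode])
    return [stage for stage in ALL_STAGES if stage not in done]


if __name__ == "__main__":
    for mode in Mode:
        print(mode.value, "->", stages_to_run(mode))
```

With a table like this, choosing a case per language would come down to a lookup from language code to `Mode`, which is left out here because the per-language choices are not decided yet.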

update the runtime CLI to accept a list of models that need to be run on the input, for instance: --run=[tokenization, parsing, tagging, lemmatization] or --run=[parsing, tagging]
implement scripts tailored for the UD Shared Task, which take as input the supplied XML with the list of input test files and run a custom NLPCube pipeline, depending on the language code
deploy on the TIRA testing environment
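For the `--run` option, something along these lines might work. This is only a sketch using the standard `argparse` module; `parse_run_list`, `build_parser`, and `KNOWN_STAGES` are hypothetical names, and the bracketed list syntax is taken from the examples above.

```python
import argparse

# Stage names as used in this issue; anything else is rejected.
KNOWN_STAGES = ("tokenization", "tagging", "lemmatization", "parsing")


def parse_run_list(value):
    """Parse a value such as '[parsing, tagging]' into ['parsing', 'tagging']."""
    inner = value.strip().strip("[]")
    names = [name.strip() for name in inner.split(",") if name.strip()]
    if not names:
        raise argparse.ArgumentTypeError("--run needs at least one stage")
    unknown = [name for name in names if name not in KNOWN_STAGES]
    if unknown:
        raise argparse.ArgumentTypeError("unknown stage(s): " + ", ".join(unknown))
    return names


def build_parser():
    """Build a parser exposing only the proposed --run option."""
    parser = argparse.ArgumentParser(description="Sketch of the proposed --run option")
    parser.add_argument(
        "--run",
        type=parse_run_list,
        default=list(KNOWN_STAGES),  # assumption: run everything if --run is omitted
        help="stages to run, e.g. --run=[parsing,tagging]",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.run)
```

Note that on the command line a value containing spaces or brackets has to be quoted, e.g. `"--run=[parsing, tagging]"`, otherwise the shell may split or expand it. The sketch keeps the order given by the user and does not check that it makes sense (e.g. parsing before tagging); that check would be up to the runtime.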

Training and evaluation scripts, trained models, handling of low-resource languages.