danieldk / conllu-utils

Utilities for working with CoNLL-U

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

CoNLL-U Utilities

Introduction

This is a set of utilities to process files in the CoNLL-U format. The conllu command provides the following subcommands:

  • accuracy: compute the accuracy of a system based on two treebanks
  • cleanup: normalize unicode and replace unicode punctuation
  • compare: compare two treebanks on one or more layers
  • from-text: convert tokenized text files to CoNLL-U.
  • merge: merge CoNLL-U files
  • partition: partition a CoNLL-U file in N files.
  • shuffle: shuffle the sentences in a CoNLL-U file.
  • to-text: convert CoNLL-U to tokenized plain text.

Usage

Executing a subcommand gives usage information when --help is given as an argument.

About

Utilities for working with CoNLL-U


Languages

Language:Rust 89.8%Language:Nix 10.2%