UniversalDependencies / UD_Macedonian-MTB

Summary

The Macedonian-MTB treebank is a collection of annotated sentences based on the raw monolingual corpus called Macedonian Language Digital Resources (MLDR).

Introduction

The Macedonian-MTB treebank is a collection of annotated sentences based on the raw monolingual corpus Macedonian Language Digital Resources (MLDR), also known as 135 Volumes of Macedonian Literature, published by the Macedonian Academy of Sciences and Arts under the CC Attribution-NonCommercial 4.0 International License. The treebank consists mainly of literary texts, with a few non-fiction texts.

  1. A description of the treebank and its origin (creation method, data sources, etc.)
  2. A description of how the data was split into training, development and test sets
  3. If there are multiple genres/domains, can they be told apart by sentence ids? Does the treebank consist of complete documents, or just randomly shuffled sentences?
  4. Acknowledgments and references that should be cited when using the treebank
  5. A changelog section for treebanks that will be released for the second (or subsequent) time. ...

Acknowledgments

...

References

Changelog

  • 2023-11-15 v2.13
    • Initial release in Universal Dependencies.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.13
License: CC BY-SA 4.0
Includes text: yes
Genre: grammar-examples
Lemmas: manual native
UPOS: manual native
XPOS: not available
Features: manual native
Relations: manual native
Contributors: Cvetkoski, Vladimir
Contributing: here
Contact: cvetkoski@flf.ukim.edu.mk
===============================================================================
