mr-martian / hbo-UD

conversion of ETCBC Old Testament data to UD

Ancient Hebrew UD

This is an in-progress conversion of the Hebrew Old Testament to Universal Dependencies (UD), using data from https://github.com/etcbc/bhsa.

Status

Manually-verified trees can be found in the files named [book].checked.conllu. The statistics are roughly the following, though I will probably forget to update this table, so run make [book]-report for up-to-date numbers.

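| Book        | Sentences (checked / total) | Words (checked / total) |
|-------------|-----------------------------|-------------------------|
| Genesis     | 1494 / 1494 (100%)          | 36741 / 36741 (100%)    |
| Exodus      | 118 / 1151 (10%)            | 2260 / 29878 (8%)       |
| Leviticus   | 53 / 820 (6%)               | 685 / 21769 (3%)        |
| Numbers     | 116 / 1179 (10%)            | 1249 / 28888 (4%)       |
| Deuteronomy | 21 / 879 (2%)               | 224 / 26155 (1%)        |
| Ruth        | 85 / 85 (100%)              | 2294 / 2294 (100%)      |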
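
For a rough idea of how figures like these can be derived from a CoNLL-U file, here is a minimal sketch. It is not the code behind make [book]-report; the function names count_conllu and progress and the sample data are hypothetical, and it assumes that "words" means syntactic words, so multiword-token ranges (IDs like 1-2) and empty nodes (IDs like 1.1) are skipped.

```python
def count_conllu(lines):
    """Return (sentences, words) for an iterable of CoNLL-U lines.

    Assumption: a "word" is any token line whose ID is a plain integer.
    """
    sentences = 0
    words = 0
    in_sentence = False
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line terminates the current sentence.
            in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comment (sent_id, text, ...).
            continue
        token_id = line.split("\t", 1)[0]
        if "-" in token_id or "." in token_id:
            # Skip multiword-token ranges and empty nodes.
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        words += 1
    return sentences, words


def progress(checked, total):
    """Format a count in the same 'checked / total (pct%)' style as the table."""
    return f"{checked} / {total} ({round(100 * checked / total)}%)"


# Hypothetical two-sentence sample, not taken from the treebank.
SAMPLE = "\n".join([
    "# sent_id = demo-1",
    "1-2\tab\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\ta\t_\t_\t_\t_\t0\troot\t_\t_",
    "2\tb\t_\t_\t_\t_\t1\tdep\t_\t_",
    "",
    "# sent_id = demo-2",
    "1\tc\t_\t_\t_\t_\t0\troot\t_\t_",
    "",
])

print(count_conllu(SAMPLE.splitlines()))  # (2, 3)
print(progress(118, 1151))                # 118 / 1151 (10%)
```

To apply this to a real file, pass an open file object to count_conllu, once for a [book].checked.conllu file and once for the full book, and feed the two results to progress.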

License

The original data is licensed under Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) and the resulting trees are under the same license.

Persistent link to original: 10.17026/dans-z6y-skyh.

All code in this repository is under the MIT license.
