sigmorphon2020 / task0-data


:: Data :: Evaluation :: Development Languages :: Surprise Languages :: Results :: Cite


This repository contains the Development and Surprise languages for SIGMORPHON 2020 Task 0 (Typologically Diverse Morphological Inflection).
All data files are in canonicalized format (see tags.yaml for possible tags): each tag string starts with the POS, and the remaining features follow in lexicographic order. Tags follow the UniMorph schema, slightly extended with a few new tags (such as "1+INCL") required by the new languages. We additionally provide WALS features here for those who choose to use them (allowed for both constrained and unconstrained submissions).
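As an illustration of the canonical ordering described above, here is a minimal sketch that puts the POS first and sorts the remaining features. It is not part of the repository: the `canonicalize` function, the `;` separator, and the small `POS_TAGS` set are assumptions made for this example, and tags.yaml remains the authoritative tag list.

```python
# Hypothetical subset of POS tags, for illustration only;
# consult tags.yaml for the tags actually used in the data.
POS_TAGS = {"N", "V", "ADJ", "ADV", "V.PTCP", "V.CVB", "V.MSDR"}


def canonicalize(tag_string, sep=";"):
    """Return the tag string with POS tags first and other features sorted.

    Assumes features are separated by `sep` (";" in UniMorph-style data).
    Sorting uses Python's default string order (by code point).
    """
    features = [f for f in tag_string.split(sep) if f]
    pos = [f for f in features if f in POS_TAGS]
    rest = sorted(f for f in features if f not in POS_TAGS)
    return sep.join(pos + rest)


print(canonicalize("PST;V;3;SG"))  # V;3;PST;SG
```

A check like `canonicalize(tags) == tags` can be used to verify that system output keeps tags in the same order as the data files.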


Update: Please use to evaluate your systems.

python --gold gld.tst --sysout pred.tst

The official evaluation script lives in this directory. You may run the evaluation script as shown in the example below.

python3 --hyp lang.hyp --ref

accuracy:	67.30
levenshtein:	0.93
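For readers who want to sanity-check the two reported numbers, the sketch below shows one common way to compute them. It is not the official evaluation script: the `levenshtein` and `score` functions are hypothetical, and it assumes that accuracy is the percentage of exact matches and that the Levenshtein figure is the mean edit distance per form.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def score(gold, pred):
    """Return (accuracy in percent, mean Levenshtein distance)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("no forms to score")
    correct = sum(g == p for g, p in zip(gold, pred))
    distance = sum(levenshtein(g, p) for g, p in zip(gold, pred))
    return 100.0 * correct / len(gold), distance / len(gold)


acc, dist = score(["walked", "ran"], ["walked", "runned"])
print(f"accuracy:\t{acc:.2f}")
print(f"levenshtein:\t{dist:.2f}")
```

Use the official script for any numbers you report; this sketch is only meant to make the two metrics concrete.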



| Family | Genus | Code | Language | Source of Data | Annotator |
|---|---|---|---|---|---|
| Austronesian | Barito | mlg(plt) | Malagasy | Modern Malagasy Verbs. CreateSpace Independent Publishing Platform | Jennifer White |
| | Greater Central Philippine | ceb | Cebuano | ---- | Ran Zmigrod |
| | Greater Central Philippine | hil | Hiligaynon | Hiligaynon Language: 101 Hiligaynon Verbs by Anj Santos | Ran Zmigrod |
| | Greater Central Philippine | tgl | Tagalog | Center for Southeast Asian Studies, NIU | Jennifer White |
| | Oceanic | mao(mri) | Maori | | Jennifer White |
| Indo-European | Germanic | ang | Old English | UniMorph | |
| | Germanic | dan | Danish | UniMorph | |
| | Germanic | deu | German | UniMorph | |
| | Germanic | eng | English | UniMorph | |
| | Germanic | frr | North Frisian | UniMorph | |
| | Germanic | gmh | Middle High German | UniMorph | |
| | Germanic | isl | Icelandic | UniMorph | |
| | Germanic | nld | Dutch | UniMorph | |
| | Germanic | nob | Norwegian Bokmål | UniMorph | |
| | Germanic | swe | Swedish | UniMorph | |
| Niger-Congo | Bantoid | kon(kng) | Kongo | Modern Kongo Verbs. CreateSpace Independent Publishing Platform | Jennifer White |
| | Bantoid | lin | Lingala | ---- | --- |
| | Bantoid | lug | Luganda | Namono, Mirembe. (2018). Luganda language: 101 Luganda verbs. CreateSpace Independent Publishing Platform. | Edoardo Ponti |
| | Bantoid | nya | Chewa | Modern Chewa Verbs. Master the basic tenses. CreateSpace Independent Publishing Platform | Ryan Cotterell |
| | Bantoid | sot | Sotho | ---- | --- |
| | Bantoid | swa(swh) | Swahili | 102 Swahili Verbs. CreateSpace Independent Publishing Platform | Jennifer White |
| | Bantoid | zul | Zulu | ---- | --- |
| | Kwa | aka | Akan | Imbeah, Paa Kwesi. (2012). 102 Akan verbs. CreateSpace Independent Publishing Platform. | Tiago Pimentel |
| | Kwa | gaa | Ga | 102 Ga verbs. CreateSpace Independent Publishing Platform. | Tiago Pimentel |
| Oto-Manguean | Amuzgoan | azg | San Pedro Amuzgos Amuzgo | Surrey Morphology Group | Antonis Anastasopoulos |
| | Chichimec | pei | Chichimeca-Jonaz | Surrey Morphology Group | Antonis Anastasopoulos |
| | Chinantecan | cpa | Tlatepuzco Chinantec | Surrey Morphology Group | Antonis Anastasopoulos |
| | Mixtecan | xty | Yoloxóchitl Mixtec | Surrey Morphology Group | Antonis Anastasopoulos |
| | Otomian | ote | Mezquital Otomi | Surrey Morphology Group | Antonis Anastasopoulos |
| | Otomian | otm | Sierra Otomi | Surrey Morphology Group | Antonis Anastasopoulos |
| | Zapotecan | cly | Eastern Highland Chatino | Cruz, Hilaria; Anastasopoulos, Antonis and Stump, Gregory. 2020 (to appear at LREC). A Resource for Studying Chatino Verbal Morphology | Antonis Anastasopoulos |
| | Zapotecan | ctp | Yaitepec Chatino | Surrey Morphology Group | Antonis Anastasopoulos |
| | Zapotecan | czn | Zenzontepec Chatino | Surrey Morphology Group | Antonis Anastasopoulos |
| | Zapotecan | zpv | Chichicapan Zapotec | Surrey Morphology Group | Antonis Anastasopoulos |
| Uralic | Finnic | est | Estonian | UniMorph | |
| | Finnic | fin | Finnish | UniMorph | |
| | Finnic | izh | Ingrian | UniMorph | |
| | Finnic | krl | Karelian | VepKar | Natalia Krizhanovskaya |
| | Finnic | liv | Livonian | ---- | --- |
| | Finnic | vep | Veps | VepKar | Natalia Krizhanovskaya |
| | Finnic | vot | Votic | UniMorph | |
| | Mari | mhr | Meadow Mari | Tim Arkhangelskij | Liz Salesky and Ekaterina Vylomova |
| | Mordvin | mdf | Moksha | Tim Arkhangelskij | Liz Salesky and Ekaterina Vylomova |
| | Mordvin | myv | Erzya | Tim Arkhangelskij | Liz Salesky and Ekaterina Vylomova |
| | Saami | sme | Northern Sami | UniMorph | |


| Family | Genus | Code | Language | Source of Data | Annotator |
|---|---|---|---|---|---|
| Afro-Asiatic | Semitic | mlt | Maltese | UniMorph | |
| | Lowland East Cushitic | orm | Oromo | | Irene Nikkarinen |
| | Semitic | syc | Syriac | UniMorph | |
| Algic | Algonquian | cre | Cree | Hunter, James. (1923). A lecture on the grammatical construction of the Cree language and Paradigms of the Cree Verb, with its various conjugations, moods, tenses, inflections, &c. The Society for Promoting Christian Knowledge. London. (Original work published 1875). | Eleanor Chodroff |
| Altaic | Tungusic | evn | Evenki | Elena Klyachko | Elena Klyachko |
| | Turkic | aze(azb) | Azerbaijani | UniMorph | |
| | Turkic | bak | Bashkir | | |
| | Turkic | crh | Crimean Tatar | | |
| | Turkic | kaz | Kazakh | 1) Nabiyev, Temir. (2015). Kazakh language: 101 Kazakh verbs. Preceptor Language Guides. Great Britain. 2) Turkicum. (2019). The Kazakh verbs: Review guide. Turkicum. Great Britain. | Eleanor Chodroff |
| | Turkic | kir | Kyrgyz | Aytnatova, Alima. (2016). Kyrgyz language: 100 Kyrgyz verbs fully conjugated in all tenses. CreateSpace Independent Publishing Platform. Middletown, DE. | Eleanor Chodroff |
| | Turkic | kjh | Khakas | | |
| | Turkic | tuk | Turkmen | 1) Abdulin, Murat. (2016). Turkmen verbs: 100 Turkmen verbs conjugated in all tenses. CreateSpace Independent Publishing Platform. 2) Peace Corps (n.d.). 501 Turkmen verbs. US Embassy in Turkmenistan. | Eleanor Chodroff |
| | Turkic | uig | Uyghur | Kadeer, Alim. Uyghur language: 94 Uyghur verbs in common tenses. CreateSpace Independent Publishing Platform. | Eleanor Chodroff |
| | Turkic | uzb | Uzbek | 1) Abdullaev, Daniyar. (2016). Uzbek language: 100 Uzbek verbs conjugated in common tenses. CreateSpace Independent Publishing Platform. 2) Turkicum. (2019). The Uzbek verbs: Review guide. Turkicum. Great Britain. | Eleanor Chodroff |
| Dravidian | Southern Dravidian | kan | Kannada | UniMorph | |
| | South-Central Dravidian | tel | Telugu | UniMorph | |
| Indo-European | Indic | ben | Bengali | UniMorph | |
| | Indic | hin | Hindi | | |
| | Indic | san | Sanskrit | UniMorph | |
| | Indic | urd | Urdu | UniMorph | |
| | Iranian | fas(pes) | Persian | UniMorph | |
| | Iranian | pus(pst) | Pashto | UniMorph | |
| | Iranian | tgk | Tajik | | Eleanor Chodroff |
| | Romance | ast | Asturian | UniMorph | |
| | Romance | cat | Catalan | UniMorph | |
| | Romance | frm | Middle French | UniMorph | |
| | Romance | fur | Friulian | UniMorph | |
| | Romance | glg | Galician | UniMorph | |
| | Romance | lld | Ladin | UniMorph | |
| | Romance | vec | Venetian | UniMorph | |
| | Romance | xno | Anglo-Norman | | |
| | West Germanic | gml | Middle Low German | UniMorph | |
| | West Germanic | gsw | Swiss German | Egli-Wilde, Renate. Züritüütsch verstaa, Züritüütsch rede. | Ryan Cotterell |
| | North Germanic | nno | Norwegian Nynorsk | UniMorph | |
| Niger-Congo | Bantoid | sna | Shona | Shona Language: 101 Shona Verbs by Idai Nandoro | Rowan Hall Maudslay |
| Sino-Tibetan | Bodic | bod | Tibetan | Di et al., 2019 | Qianji Di |
| Siouan | Core Siouan | dak | Dakota | LaFontaine, Harlan & McKay, Neil. (2004). 550 Dakota verbs. Minnesota Historical Society. St. Paul, MN. | Eleanor Chodroff |
| Songhay | Songhay | dje | Zarma | | Ran Zmigrod |
| Southern Daly | Murrinh-Patha | mwf | Murrinh-Patha | John Mansfield | John Mansfield |
| Uralic | Permic | kpv | Komi-Zyrian | Tim Arkhangelskij | Liz Salesky and Ekaterina Vylomova |
| | Finnic | lud | Ludic | VepKar | Natalia Krizhanovskaya |
| | Finnic | olo | Livvi | VepKar | Natalia Krizhanovskaya |
| | Permic | udm | Udmurt | Tim Arkhangelskij | Liz Salesky and Ekaterina Vylomova |
| | Finnic | vro | Võro | Vitalij Chernyavskij | Ekaterina Vylomova |
| Uto-Aztecan | Tepiman | ood | O'odham | Zepeda, Ofelia. (2003). A Tohono O'odham grammar. University of Arizona Press. (Original work published 1983). | Eleanor Chodroff |


The shared task received 23 systems from 10 teams. The results on test data are available HERE


@article{vylomova2020sigmorphon,
  title={SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection},
  author={Vylomova, Ekaterina and White, Jennifer and Salesky, Elizabeth and Mielke, Sabrina J and Wu, Shijie and Ponti, Edoardo and Maudslay, Rowan Hall and Zmigrod, Ran and Valvoda, Josef and Toldova, Svetlana and others},
  journal={SIGMORPHON 2020},
  pages={1},
  year={2020}
}


