menzenski / thrunc

A Python tool for extracting data from the historical subcorpora of the Russian National Corpus.

thrunc: A Tool for the Historical Russian National Corpus

Thrunc is a Tool for the Historical Russian National Corpus that automates corpus queries and data collection.

In its present state, thrunc returns simple frequency counts by year for past-tense forms of Russian verbs, organized by prefix. On its first run, thrunc generates a list of queries in XML format. On each later run, it saves results to that list without repeating queries it has already completed.
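The generate-once, resume-later behavior described above could look roughly like the sketch below. It is not taken from thrunc's source: the file layout, the element and attribute names (`queries`, `query`, `prefix`, `year`, `count`), and the functions `load_or_create_queries` and `run_pending` are all hypothetical, and `fetch_count` stands in for whatever code actually queries the corpus.

```python
import os
import xml.etree.ElementTree as ET


def load_or_create_queries(path, prefixes, years):
    """Return the query tree, creating the XML file on the first run.

    Hypothetical layout: one <query> element per (prefix, year) pair.
    """
    if os.path.exists(path):
        # Later runs: reuse the existing list and any results stored in it.
        return ET.parse(path)
    root = ET.Element("queries")
    for prefix in prefixes:
        for year in years:
            # No "count" attribute yet, so the query is still pending.
            ET.SubElement(root, "query", prefix=prefix, year=str(year))
    tree = ET.ElementTree(root)
    tree.write(path, encoding="utf-8", xml_declaration=True)
    return tree


def run_pending(path, tree, fetch_count, limit=None):
    """Run queries that have no stored result, saving after each one.

    fetch_count(prefix, year) is an assumed callable that returns a
    frequency; it is a placeholder, not a real corpus API.
    """
    done = 0
    for query in tree.getroot().iter("query"):
        if "count" in query.attrib:
            continue  # answered on an earlier run, so skip it
        if limit is not None and done >= limit:
            break
        count = fetch_count(query.get("prefix"), int(query.get("year")))
        query.set("count", str(count))
        # Write after every result so an interrupted run loses nothing.
        tree.write(path, encoding="utf-8", xml_declaration=True)
        done += 1
    return done
```

In this sketch, a stored `count` attribute is what marks a query as finished, so calling `run_pending` again after an interruption picks up at the first query without one.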

About

License: MIT License


Languages

Language: Python 100.0%