PatrickLerner / erclaerung

Yet another text analyzer and classifier for Early Modern German

erclaͤrung™

erclaͤrung™ was born out of a simple dream: to create the best analyzer and text classifier for Early Modern German texts in the world.

To achieve this lofty (yet noble) goal, we primarily use the dkpro-core and dkpro-tc frameworks for Java.

The primary corpus used for this project is the Bonner Frühneuhochdeutschkorpora. It serves primarily as an already tagged and annotated reference against which new texts can be compared.

If you want to know more about the project, you can check out our awesome presentation on Google Drive (German only).

Note that, due to the licensing terms of the Bonner corpora, the corpus files are not supplied with this project. You must download the XML files from their website and put them into the src/main/resources/bonner_korpora folder of this project. Do not forget to copy the Fnhd.dtd file along with the XML files.
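
Because a missing DTD or an empty corpus folder is easy to overlook, a small pre-flight check can help before running the project. The following is only a sketch and is not part of erclaͤrung™ itself: the class name `CorpusCheck` and the method `findProblems` are hypothetical, and the only details taken from this README are the folder path and the `Fnhd.dtd` file name. It assumes the program is started from the project root.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of the project: checks that the corpus
// folder described in this README has been populated.
public class CorpusCheck {
    // Folder and DTD name as given in this README, relative to the project root.
    static final Path CORPUS_DIR = Paths.get("src", "main", "resources", "bonner_korpora");
    static final String DTD_NAME = "Fnhd.dtd";

    /** Returns human-readable problems; an empty list means the folder looks usable. */
    static List<String> findProblems(Path dir) throws IOException {
        List<String> problems = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            problems.add("Missing folder: " + dir);
            return problems; // nothing else can be checked
        }
        if (!Files.isRegularFile(dir.resolve(DTD_NAME))) {
            problems.add("Missing " + DTD_NAME + " in " + dir);
        }
        long xmlCount;
        // Files.list must be closed, hence the try-with-resources block.
        try (Stream<Path> files = Files.list(dir)) {
            xmlCount = files
                .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".xml"))
                .count();
        }
        if (xmlCount == 0) {
            problems.add("No XML files found in " + dir);
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        List<String> problems = findProblems(CORPUS_DIR);
        if (problems.isEmpty()) {
            System.out.println("Corpus folder looks complete.");
        } else {
            problems.forEach(System.err::println);
            System.exit(1);
        }
    }
}
```

The check only confirms that the files are present; it does not validate the XML against the DTD or verify that the download is complete.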

License: BSD 3-Clause "New" or "Revised" License


Languages

Language: Java 100.0%