jprante / elasticsearch-plugin-bundle

A bundle of useful Elasticsearch plugins

How to use the decompounder?

cyclomarc opened this issue

I am looking for an example mapping file that shows how to use the decompounder. I am familiar with the ES dictionary_decompounder, but if I understand correctly, the plugin provides a decompounder that does not require a word list. My question is: what syntax should be used in the mapping file (filter, analyzer, tokenizer) so that the decompounder is applied during analysis?

Hope you can help
Marc

I have added example configurations to the README:

https://github.com/jprante/elasticsearch-plugin-bundle/blob/master/README.md

It is not much but at least a start.
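For readers who want a rough picture of the syntax asked about above, here is a minimal sketch of index settings that wire a decompounding token filter into a custom analyzer. It is built as a Python dict so it can be printed as JSON. The filter type name `decompound` is an assumption that should be checked against the README, and the names `decomp` and `german_decompound` are hypothetical labels chosen for this illustration.

```python
import json


def build_decompound_settings():
    """Return index settings with a decompounding filter in a custom analyzer.

    Assumption: the plugin registers its token filter under the type
    "decompound"; verify this against the plugin README before use.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # "decomp" is a hypothetical name for this filter instance.
                    "decomp": {"type": "decompound"}
                },
                "analyzer": {
                    # "german_decompound" is a hypothetical analyzer name.
                    "german_decompound": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Lowercase first, then split compounds into parts.
                        "filter": ["lowercase", "decomp"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the index.
    print(json.dumps(build_decompound_settings(), indent=2))
```

A field in the mapping would then refer to the analyzer by name through its `analyzer` property. No tokenizer from the plugin is needed in this sketch; the standard tokenizer produces the tokens and the filter splits them.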

The decompounder is based on the ASV toolbox

http://wortschatz.uni-leipzig.de/~cbiemann/software/toolbox/#_Baseforms

so it should be possible to set up a "training environment" to create new parameter files for decompounding. The prebuilt binary files for German decompounding in the plugin are simply copied from the ASV toolbox.