antalsz / text-set

Use a DAFSA (aka a DAWG) to implement a set of strings

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

text-set

Use a DAFSA (directed acyclic finite state automaton; aka a DAWG, a directed acyclic word graph) to implement a set of strings.

The algorithm for building the DAFSA is Algorithm 1 from the paper “Incremental Construction of Minimal Acyclic Finite-State Automata”, by Jan Daciuk, Stoyan Mihov, Bruce W. Watson, and Richard E. Watson. Published in 2000 in Computational Linguistics 26(1), pp.3-16. Available online at https://www.aclweb.org/anthology/J00-1002.

The ENABLE wordlist, in ENABLE-wordlist.txt, was downloaded from Peter Norvig’s page about Natural Language Corpus Data.

About

Use a DAFSA (aka a DAWG) to implement a set of strings

License:MIT License


Languages

Language:Haskell 100.0%