yen3 / esapp

[In Progress] An implementation of an unsupervised Chinese word segmentation algorithm.

ESA++

References

  • Hanshi Wang, Jian Zhu, Shiping Tang, and Xiaozhong Fan. 2011. A new unsupervised approach to word segmentation. Computational Linguistics, 37(3):421-454.
  • Ge Nong, Sen Zhang, and Wai Hong Chan. 2011. Two efficient algorithms for linear time suffix array construction. IEEE Transactions on Computers, 60(10):1471-1484.
  • Mohamed Ibrahim Abouelhoda, Stefan Kurtz, and Enno Ohlebusch. 2004. Replacing suffix trees with enhanced suffix arrays. Journal of Discrete Algorithms, 2(1):53-86.

License

Copyright (c) 2014-2015, Chi-En Wu.

Distributed under the BSD 3-Clause License.
