forest-snow / ankura

Anchor-based topic modeling

Ankura

Ankura is an implementation of the anchor words topic modeling algorithm described in Arora et al. (2013). The ultimate goal is to experiment with interactive extensions of the algorithm (e.g., Tandem Anchors). Fair warning, though: this is research code and is subject to change! For example, the ankura2 branch will eventually blow away the master branch with a new pipeline that should support very large datasets (on the order of 1E8 documents).
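For readers new to the anchor words approach, here is a minimal sketch of its first stage: building a word co-occurrence matrix and greedily picking anchor words. This is an illustration only, not ankura's API. The function names `cooccurrence` and `find_anchors` are hypothetical, documents are assumed to be lists of integer word ids, and the topic recovery step that follows anchor selection is omitted.

```python
import numpy as np


def cooccurrence(docs, vocab_size):
    """Estimate a normalized word co-occurrence matrix Q (hypothetical helper).

    Each document is a list of integer word ids and contributes the
    fraction of ordered token pairs it contains for each word pair.
    """
    Q = np.zeros((vocab_size, vocab_size))
    for doc in docs:
        counts = np.bincount(doc, minlength=vocab_size).astype(float)
        n = counts.sum()
        if n < 2:
            continue  # a document needs at least two tokens to form a pair
        # Pairs of distinct token positions: outer product minus self-pairs.
        Q += (np.outer(counts, counts) - np.diag(counts)) / (n * (n - 1))
    return Q / Q.sum()


def find_anchors(Q, k):
    """Greedily choose k anchor words (hypothetical helper).

    Rows of Q are normalized to sum to 1. The first anchor is the row
    farthest from the origin; each later anchor is the row farthest from
    the affine span of the anchors chosen so far (Gram-Schmidt style).
    """
    Q_bar = Q / np.maximum(Q.sum(axis=1, keepdims=True), 1e-12)
    first = int(np.argmax((Q_bar ** 2).sum(axis=1)))
    anchors = [first]
    # Translate so the first anchor sits at the origin.
    residual = Q_bar - Q_bar[first]
    for _ in range(k - 1):
        norms = (residual ** 2).sum(axis=1)
        idx = int(np.argmax(norms))
        anchors.append(idx)
        # Remove the component along the newly chosen direction.
        direction = residual[idx] / np.sqrt(norms[idx])
        residual = residual - np.outer(residual @ direction, direction)
    return anchors


# Toy corpus: words 0-2 appear only together, as do words 3-5.
docs = [[0, 1, 2, 0], [0, 1, 1, 2], [3, 4, 5, 5], [3, 3, 4, 5]]
Q = cooccurrence(docs, vocab_size=6)
print(find_anchors(Q, k=2))
```

On this toy corpus the two anchors should come from different word groups, since each group co-occurs only with itself. A real pipeline would also filter rare words before selecting anchors, which this sketch does not do.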

About

License: GNU General Public License v3.0


Languages

Python 64.0%, JavaScript 28.6%, HTML 4.3%, CSS 3.1%