senya-ashukha / bigram-anchor-words

An implementation of the Bigram Anchor Words algorithm

Bigram Anchor Words Topic Model

Implementation for the Bigram Anchor Words Topic Model paper. Bag of words is a very poor text representation, so traditional topic models lose a lot of information. The goal of this project is to combine linguistic information with statistical topic models. We propose a new variant of the Anchor Words Topic Model [1] in which bigrams can also serve as anchor words.

[1] Arora S., Ge R., et al.: A Practical Algorithm for Topic Modeling with Provable Guarantees (ICML, 2013)
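
To make the idea of bigrams as anchor candidates more concrete, here is a minimal sketch of how a bag-of-words representation could be extended with bigram terms before building the term co-occurrence counts that anchor-word methods work with. This is not the code from this repository: the function names (`frequent_bigrams`, `extract_terms`, `cooccurrence`), the `min_count` frequency filter, and the toy corpus are all assumptions made for illustration, and the paper may select bigrams differently.

```python
from collections import Counter
from itertools import combinations


def frequent_bigrams(docs, min_count=2):
    """Collect adjacent word pairs seen at least min_count times.

    The frequency threshold is a hypothetical filter, not the paper's criterion.
    """
    counts = Counter()
    for tokens in docs:
        counts.update(zip(tokens, tokens[1:]))
    return {pair for pair, n in counts.items() if n >= min_count}


def extract_terms(tokens, bigram_vocab):
    """Return the unigrams plus every adjacent pair found in bigram_vocab."""
    terms = list(tokens)
    for first, second in zip(tokens, tokens[1:]):
        if (first, second) in bigram_vocab:
            terms.append(first + "_" + second)
    return terms


def cooccurrence(docs_terms):
    """For each unordered pair of distinct terms, count documents containing both."""
    counts = Counter()
    for terms in docs_terms:
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


# Toy corpus (illustrative only).
docs = [
    "topic model learns topic from text".split(),
    "anchor words define topic model".split(),
    "topic model with anchor words".split(),
]

bigrams = frequent_bigrams(docs, min_count=2)
docs_terms = [extract_terms(d, bigrams) for d in docs]
cooc = cooccurrence(docs_terms)

# Bigram terms now appear in the co-occurrence counts next to unigrams,
# so they can be considered as anchor candidates like any other term.
print(sorted(bigrams))
print(cooc[("anchor_words", "topic_model")])
```

Because bigram terms such as `topic_model` become ordinary entries in the co-occurrence counts, any anchor selection procedure that operates on those counts can pick them without further changes.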

Results

Here is an example of anchor words. The metrics are also good and can be found in the paper.

Experiments

You can use the following commands to reproduce the published results. The simplest way to repeat the experiments is to read through the examples =) Sorry, there is no documentation yet.

cd bigram-anchor-words
ipython ./examples/{corpus}/{model}.py

Citation

If you find this code useful, please cite our paper:

@inproceedings{ashuha2016bigram,
  title={Bigram Anchor Words Topic Model},
  author={Ashuha, Arseniy and Loukachevitch, Natalia},
  booktitle={International Conference on Analysis of Images, Social Networks and Texts},
  pages={121--131},
  year={2016},
  organization={Springer}
}
