This repository includes three open source versions of LDA with collapsed Gibbs Sampling, modified by nanjunxiao.
GibbsLDA++: single-threaded, written in C++
ompi-lda: multi-node/multi-threaded, written in C++
online_twitter_lda: multi-threaded, written in Python
Collapsed Gibbs LDA reference: my blog
fixed bugs:
1). Memory leak: use 'delete[] p' instead of 'delete p' when p points to an array.
2). Array out of bounds: (double)random() / RAND_MAX lies in [0,1], so it can equal 1 and produce index K. Dividing by (RAND_MAX + 1.0) keeps the value in [0,1).
int topic = (int)(((double)random() / RAND_MAX) * K); --> int topic = (int)(((double)random() / (RAND_MAX + 1.0)) * K);
double u = ((double)random() / RAND_MAX) * p[K - 1]; --> double u = ((double)random() / (RAND_MAX + 1.0)) * p[K - 1];
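The two bug classes above can be illustrated with a minimal sketch. It is not the repository's actual code: the helper names unit_interval, initial_topic and sample_topic are hypothetical, and the raw random() result is passed in as the parameter r so the boundary case r == RAND_MAX can be checked directly.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Map a raw random() result r in [0, RAND_MAX] to a value in [0, 1).
// The divisor is computed as a double, so RAND_MAX + 1 cannot overflow int,
// and the result stays strictly below 1 even when r == RAND_MAX.
double unit_interval(long r) {
    return static_cast<double>(r) / (static_cast<double>(RAND_MAX) + 1.0);
}

// Initial topic assignment: the result is always in [0, K - 1].
int initial_topic(long r, int K) {
    assert(K > 0);
    return static_cast<int>(unit_interval(r) * K);
}

// Draw a topic from unnormalised weights with the cumulative-sum method.
int sample_topic(const std::vector<double>& weights, long r) {
    const int K = static_cast<int>(weights.size());
    assert(K > 0);

    // Scratch array allocated with new[], so it must be released with delete[].
    double* p = new double[K];
    double running = 0.0;
    for (int k = 0; k < K; ++k) {
        running += weights[k];
        p[k] = running;  // p[k] holds the sum of weights[0..k]
    }

    // u is strictly less than p[K - 1] because unit_interval(r) < 1.
    const double u = unit_interval(r) * p[K - 1];

    // Advance to the first bucket whose cumulative weight exceeds u.
    // The bound topic < K - 1 is an extra guard against running off the end.
    int topic = 0;
    while (topic < K - 1 && p[topic] <= u) {
        ++topic;
    }

    delete[] p;  // array form of delete matches new[]
    return topic;
}
```

In real use, r would come from random(); a std::vector<double> scratch buffer would avoid the manual new[]/delete[] pairing altogether.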
fixed bugs:
1). Fixed bugs in infer.cc.
2). Removed 'sampler.UpdateModel(corpus)' from lda.cc.
added features:
1). Added theta and twords file output.
2). Added a subset of Boost's hpp/cpp files to the include dir, so the project can be built directly with make.
added features:
1). Added theta and phi mat file output.
2). twordsnum is now configurable.
3). Rewrote cmd_flag without Boost, so the include dir can be removed.
4). Rewrote the makefile.
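As a rough illustration of what a configurable twordsnum means, the sketch below picks the twordsnum most probable words from one topic's row of phi. The function name top_words and the in-memory layout (one std::vector<double> per topic, indexed by word id) are assumptions made for this example and may differ from the repository's code and file formats.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<int, double> WordProb;  // (word id, probability)

// Order by descending probability; break ties by ascending word id.
bool more_probable(const WordProb& a, const WordProb& b) {
    if (a.second != b.second) return a.second > b.second;
    return a.first < b.first;
}

// Return the `twordsnum` most probable words of one topic.
// If the vocabulary is smaller than twordsnum, all words are returned.
std::vector<WordProb> top_words(const std::vector<double>& phi_row,
                                std::size_t twordsnum) {
    std::vector<WordProb> ranked;
    ranked.reserve(phi_row.size());
    for (std::size_t w = 0; w < phi_row.size(); ++w) {
        ranked.push_back(WordProb(static_cast<int>(w), phi_row[w]));
    }

    const std::size_t n = std::min(twordsnum, ranked.size());
    assert(n <= ranked.size());

    // Only the first n entries need to be sorted.
    std::partial_sort(ranked.begin(),
                      ranked.begin() + static_cast<std::ptrdiff_t>(n),
                      ranked.end(), more_probable);
    ranked.resize(n);
    return ranked;
}
```

Calling top_words once per topic and writing each returned pair on its own line would give a twords-style listing; mapping word ids back to strings is left out here.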