bmschmidt / Bookworm-Mallet

Bookworm Mallet integration

Bookworm-Mallet

This is a Bookworm extension.

This particular extension supplements a Bookworm's "master_bookcounts" file with a "master_topicWords" file that otherwise resembles it but includes a topic column, created by MALLET. All the necessary work happens when you run make in the directory; some dependencies are not installed automatically.

Unlike a "good" Bookworm extension, this one actually has a few bits of code in the API to support it, because the syntax for something other than master_bookcounts isn't transparently supported. But I think it's worth it in this case, because it makes it possible to break apart a topic model at the unigram level.
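To make the relationship between the two files concrete, here is a minimal Python sketch. It is not the code in this repository. It assumes that master_bookcounts rows are (bookid, wordid, count) and that master_topicWords rows add a topic column to that; those column names, the helper functions, and the sample data are all hypothetical and only illustrate the idea.

```python
from collections import Counter

def topic_word_counts(assignments):
    """Aggregate per-token topic assignments into
    (bookid, wordid, topic) -> count, i.e. a topicWords-style table."""
    return Counter(assignments)

def collapse_topics(topic_counts):
    """Sum out the topic column to get (bookid, wordid) -> count,
    i.e. a bookcounts-style table."""
    totals = Counter()
    for (bookid, wordid, _topic), n in topic_counts.items():
        totals[(bookid, wordid)] += n
    return totals

# Hypothetical token-level output: one (bookid, wordid, topic) tuple per token,
# as a topic model might assign them.
assignments = [
    (1, 10, 0), (1, 10, 0), (1, 10, 3),
    (1, 11, 3),
    (2, 10, 0),
]

topic_counts = topic_word_counts(assignments)
book_counts = collapse_topics(topic_counts)

for (bookid, wordid, topic), n in sorted(topic_counts.items()):
    print(bookid, wordid, topic, n)
```

Under these assumptions, summing over the topic column recovers the ordinary per-book word counts, which is why a single word in a single book can be split across several topics.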

Acknowledgements

This runs off of Mallet, the work of a lot of people.

The stopwords list begins with Matt Jockers' list of names to filter for topic modeling.


Languages

Python 74.6%, R 12.8%, Makefile 12.5%