This is a classifier trained on the Blog Authorship Corpus, a dataset collected from Blogger in 2004 that includes many entire blogs. The corpus is available at https://u.cs.biu.ac.il/~koppel/BlogCorpus.htm. I am using bag-of-words vectorization to convert the blogs into a form that machine learning algorithms can work on. I am using an ensemble of CART decision trees at the moment, although I may change that later.
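To illustrate the bag-of-words step, here is a minimal sketch in plain Python. It is not the project's actual code: the function names (`tokenize`, `build_vocabulary`, `vectorize`) and the two toy posts are hypothetical, and a real pipeline would more likely use a library vectorizer. It only shows the idea of turning each post into a vector of word counts over a shared vocabulary.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Map every distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(text, vocabulary):
    """Return a count vector for the text; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical toy posts standing in for blog text from the corpus.
posts = [
    "I went to school today and school was fun",
    "Work was long today",
]
vocab = build_vocabulary(posts)
matrix = [vectorize(p, vocab) for p in posts]
```

Each row of `matrix` is one post and each column is one word from the vocabulary, so word order is discarded and only counts remain. Rows like these are the kind of input a tree ensemble could be trained on.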