abhijangda / text-clustering

An implementation of k-means clustering to find related text documents. Written to solve the Newsle clustering problem in CodeSprint 2012.

Home Page: cs2.interviewstreet.com

Solution for the Newsle Clustering question from CodeSprint 2012. It clusters text documents with k-means, using either cosine or Jaccard distance between the documents' feature vectors.
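
To make the approach concrete, here is a minimal sketch of the two distance measures and a basic k-means loop. It is not taken from this repository: the class name `TextClusteringSketch`, the method names, the use of raw term counts as feature vectors, and seeding the centroids with the first k documents are all assumptions made for illustration.

```java
import java.util.*;

/** Illustrative sketch only; names and design choices are hypothetical, not the repository's code. */
class TextClusteringSketch {

    /** Builds a raw term-count vector from lowercased tokens split on non-word characters. */
    static Map<String, Double> termFrequencies(String text) {
        Map<String, Double> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) tf.merge(token, 1.0, Double::sum);
        }
        return tf;
    }

    /** Cosine distance: 1 - (a . b) / (|a| |b|). Returns 1 if either vector is empty. */
    static double cosineDistance(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        }
        for (double v : b.values()) normB += v * v;
        if (normA == 0 || normB == 0) return 1.0;
        return 1.0 - dot / Math.sqrt(normA * normB);
    }

    /** Jaccard distance on term sets: 1 - |A n B| / |A u B|. Returns 0 if both sets are empty. */
    static double jaccardDistance(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 0.0;
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return 1.0 - (double) intersection.size() / unionSize;
    }

    /**
     * Basic k-means (Lloyd's algorithm) with cosine distance.
     * Assumption: the first k documents seed the centroids, so k must not exceed docs.size().
     * Returns the cluster index assigned to each document.
     */
    static int[] kMeans(List<Map<String, Double>> docs, int k, int maxIterations) {
        List<Map<String, Double>> centroids = new ArrayList<>();
        for (int c = 0; c < k; c++) centroids.add(new HashMap<>(docs.get(c)));
        int[] assignment = new int[docs.size()];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            // Assignment step: move each document to its nearest centroid.
            for (int i = 0; i < docs.size(); i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (cosineDistance(docs.get(i), centroids.get(c))
                            < cosineDistance(docs.get(i), centroids.get(best))) {
                        best = c;
                    }
                }
                if (assignment[i] != best) { assignment[i] = best; changed = true; }
            }
            if (!changed) break; // converged

            // Update step: each centroid becomes the mean vector of its members.
            for (int c = 0; c < k; c++) {
                Map<String, Double> sum = new HashMap<>();
                int count = 0;
                for (int i = 0; i < docs.size(); i++) {
                    if (assignment[i] != c) continue;
                    count++;
                    for (Map.Entry<String, Double> e : docs.get(i).entrySet()) {
                        sum.merge(e.getKey(), e.getValue(), Double::sum);
                    }
                }
                if (count == 0) continue; // empty cluster: keep the previous centroid
                final int n = count;
                sum.replaceAll((term, v) -> v / n);
                centroids.set(c, sum);
            }
        }
        return assignment;
    }
}
```

To try it, build one vector per document with `termFrequencies`, then call `kMeans(vectors, k, maxIterations)`. To cluster with Jaccard distance instead, one option is to compare the vectors' `keySet()`s with `jaccardDistance` in the assignment step. Note that averaging term counts is a natural centroid for cosine distance but only a heuristic for Jaccard.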

Languages

Java 99.7%, Shell 0.3%