Text mining using term frequency-inverse document frequency
A nearest neighbor algorithm is used to identify parallels between Wikipedia articles.
GraphLab Create's nearest neighbor algorithm was used to fit a model to the data.
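To illustrate what a nearest neighbor query does, here is a minimal brute-force sketch in plain Python. It is not GraphLab Create's implementation. The choice of cosine distance, the function names (cosine_distance, nearest_neighbors), and the toy article weights are all assumptions made for illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two sparse vectors stored as {word: weight} dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbors(query_label, vectors, k=3):
    """Return the k entries closest to vectors[query_label] as (label, distance) pairs."""
    query = vectors[query_label]
    scored = [
        (cosine_distance(query, vector), label)
        for label, vector in vectors.items()
        if label != query_label  # skip the query itself
    ]
    scored.sort()  # smallest distance first
    return [(label, distance) for distance, label in scored[:k]]


# Hypothetical word weights for three made-up articles.
articles = {
    "Article A": {"senate": 2.0, "president": 3.0, "law": 1.0},
    "Article B": {"president": 2.5, "senate": 1.5, "election": 1.0},
    "Article C": {"guitar": 3.0, "album": 2.0, "tour": 1.0},
}

print(nearest_neighbors("Article A", articles, k=2))
```

Article B shares weighted words with Article A, so it ranks ahead of Article C, which shares none. A brute-force scan like this compares the query against every entry, so it only suits small collections.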
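The term frequency-inverse document frequency (TF-IDF) value of a word increases in proportion to the number of times the word appears in a document, and is offset by the number of documents in the collection that contain it.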
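The weighting can be sketched in a few lines of plain Python. This uses one common variant, raw word count multiplied by log(N / document frequency). Libraries differ in smoothing and normalization, so this is not necessarily the exact formula GraphLab Create applies. The function name tf_idf and the sample sentences are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF weights for {name: text}; returns {name: {word: weight}}."""
    # Term frequency: raw word counts per document (whitespace tokenization).
    counts = {name: Counter(text.lower().split()) for name, text in documents.items()}
    n_docs = len(documents)

    # Document frequency: number of documents that contain each word.
    doc_freq = Counter()
    for word_counts in counts.values():
        doc_freq.update(word_counts.keys())

    return {
        name: {
            word: tf * math.log(n_docs / doc_freq[word])
            for word, tf in word_counts.items()
        }
        for name, word_counts in counts.items()
    }


# Made-up sample documents.
docs = {
    "d1": "the cat sat on the mat",
    "d2": "the dog chased the cat",
    "d3": "the bird sang",
}

weights = tf_idf(docs)
print(weights["d1"])
```

A word that occurs in every document ("the") gets weight zero, while a word unique to one document ("mat") gets the highest weight per occurrence.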
Search engines often use this value to gauge how relevant a document is to a search word.