kanata2 / ruigi

Ruigi is a library for computing the similarity between documents.

Ruigi

Ruigi is a similarity calculation library implemented in Ruby.

Algorithms

Currently, only TF-IDF and cosine similarity are supported.
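For readers unfamiliar with TF-IDF, here is a minimal plain-Ruby sketch of the weighting scheme. It is an illustration only and does not reflect how Ruigi implements it internally; the helper names (`term_frequencies`, `inverse_document_frequencies`, `tf_idf_vectors`) are hypothetical, the IDF uses the common unsmoothed form log(N / df), and Ruby 2.7 or later is assumed for `Array#tally`.

```ruby
# Illustrative TF-IDF sketch in plain Ruby (not Ruigi's actual implementation).
# All method names here are hypothetical helpers for this example.

# Relative frequency of each word within a single document.
def term_frequencies(words)
  total = words.size.to_f
  words.tally.transform_values { |count| count / total }
end

# Unsmoothed inverse document frequency: log(N / df) for each word.
def inverse_document_frequencies(docs)
  n = docs.size.to_f
  df = Hash.new(0)
  docs.each { |words| words.uniq.each { |w| df[w] += 1 } }
  df.transform_values { |count| Math.log(n / count) }
end

# One sparse vector (Hash of word => weight) per document.
def tf_idf_vectors(docs)
  idf = inverse_document_frequencies(docs)
  docs.map do |words|
    term_frequencies(words).to_h { |w, tf| [w, tf * idf[w]] }
  end
end

docs = [
  %w[ruby gem similarity],
  %w[ruby document vector],
  %w[cosine similarity vector]
]
vectors = tf_idf_vectors(docs)
```

Words that appear in fewer documents receive a larger weight: in this sample, "gem" (one document) is weighted higher than "ruby" (two documents) in the first vector.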

Installation

Add this line to your application's Gemfile:

gem 'ruigi'

And then execute:

$ bundle

Or install it yourself as:

$ gem install ruigi

Usage

Create a Ruigi::Document from an array of words:

words = ["word1", "word2", ... , "wordN"]
document1 = Ruigi::Document.new(words)

Build a model from an array of documents:

corpus = [document1, document2, ... , documentN] # each element's type is Ruigi::Document.
model = Ruigi::Model.new(corpus)

You can get the feature vector for each document and calculate the similarity between documents.

model.feature_vector_of(0) # => returns the feature vector of the 0th document

etc...
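As a rough illustration of what a cosine similarity calculation does with two feature vectors, here is a plain-Ruby sketch. It assumes the vectors are equal-length arrays of numbers, which may not match the type Ruigi returns from `feature_vector_of`; `cosine_similarity` is a hypothetical helper, not part of Ruigi's API.

```ruby
# Illustrative cosine similarity in plain Ruby (not Ruigi's API).
# Assumes both vectors are equal-length Arrays of numbers.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.size == b.size

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })

  # A zero vector has no direction; treat its similarity as 0.0.
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

v1 = [1.0, 0.0, 2.0]
v2 = [2.0, 0.0, 4.0] # same direction as v1
v3 = [0.0, 3.0, 0.0] # orthogonal to v1

same_direction = cosine_similarity(v1, v2) # close to 1.0
orthogonal     = cosine_similarity(v1, v3) # 0.0
```

Because TF-IDF weights are non-negative, the cosine similarity between two such vectors falls between 0 (no shared terms) and 1 (identical direction).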

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/naoki-k/ruigi.
This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
