Sentence Similarity based on Semantic Nets and Corpus Statistics

This is an implementation of the paper written by Yuhua Li, David McLean, Zuhair A. Bandar, James D. O’Shea, and Keeley Crockett. Link

Sentence similarity is computed as a linear combination of semantic similarity and word-order similarity. Both are derived from a semantic vector and a word-order vector built for each sentence with the help of WordNet.
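The linear combination described above can be sketched as follows. The weighting factor `delta` and the example inputs are illustrative, not taken from this repository's code; the paper recommends weighting semantic similarity above 0.5, and 0.85 is a commonly used value.

```python
def sentence_similarity(semantic_sim, word_order_sim, delta=0.85):
    """Combine semantic and word-order similarity linearly.

    delta weights the semantic component; since word meaning carries
    more information than word order, delta is kept above 0.5.
    (delta=0.85 is an illustrative choice, not this repo's constant.)
    """
    return delta * semantic_sim + (1 - delta) * word_order_sim

# Two closely related sentences: high semantic overlap,
# moderately different word order.
print(sentence_similarity(0.8, 0.6))
```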

Modules Required

math
os
time
sys
numpy
sklearn
nltk

  • from nltk.corpus
    • wordnet
    • brown
    • stopwords
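The WordNet, Brown, and stopwords corpora are not bundled with NLTK itself. Assuming a standard Python environment, the third-party packages and NLTK data can be fetched once with:

```shell
# Install the third-party modules listed above.
pip install numpy scikit-learn nltk

# Download the required NLTK corpora (one-time step).
python -m nltk.downloader wordnet brown stopwords
```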

Steps

  1. Download the 2 main programs - similarity.py and main.py.
  2. Construct the folder sub-structure as shown below:

![Folder sub-structure](Capture.png)

  3. similarity.py contains all the core functions and is imported by main.py. Run similarity.py first to make sure there are no errors, then run main.py.
  4. Put all the documents (in text format) to be compared for similarity in the dataset sub-folder.
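Assuming the folder layout above, with the text files to compare placed in the dataset sub-folder, a typical run of the steps looks like:

```shell
# Sanity-check that similarity.py runs without errors first.
python similarity.py

# Then run the driver, which compares the documents in dataset/.
python main.py
```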
