rgtjf / Semantic-Texual-Similarity-Toolkits

Semantic Textual Similarity (STS) measures the degree of equivalence in the underlying semantics of paired snippets of text.

Semantic Textual Similarity Toolkits

This is the code that the ECNU team submitted to the SemEval STS task.

slides

Installation

# download the repo
git clone https://github.com/rgtjf/Semantic-Texual-Similarity-Toolkits.git
# download the dataset and the Stanford CoreNLP tools
sh download.sh
# run the demo
python demo.py

Results

You can configure sts_model.py to see how different features perform on the STSBenchmark dataset.

STSBenchmark

Methods                 Dev     Test
RF                      0.8333  0.7993
GB                      0.8356  0.8022
EN-seven                0.8466  0.8100
----------------------  ------  ------
aligner                 0.6991  0.6379
idf_aligner             0.7969  0.7622
BOWFeature-True         0.7584  0.6472
BOWFeature-False        0.7788  0.6874
nGramOverlapFeature     0.7817  0.7453
BOWFeature              0.7639  0.6847
AlignmentFeature        0.8163  0.7748
WordEmbeddingFeature    0.8011  0.7128
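
The table does not name its metric. STS evaluations conventionally report the Pearson correlation between predicted and gold similarity scores, so the sketch below assumes that is what the Dev and Test columns show. The `pearson` helper and the sample scores are hypothetical and are not taken from this repository; the snippet only illustrates how such a number can be computed.

```python
import math


def pearson(predicted, gold):
    """Pearson correlation between two equal-length lists of scores.

    Hypothetical helper for illustration; not part of this repository.
    """
    if len(predicted) != len(gold) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_g = sum(gold) / n
    # Unnormalized covariance and variances; the 1/n factors cancel out.
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, gold))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in gold)
    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_p * var_g)


# Made-up scores on the 0-5 STS scale, for illustration only.
gold = [0.0, 1.5, 3.0, 4.2, 5.0]
predicted = [0.4, 1.2, 2.9, 3.8, 4.9]
print(round(pearson(predicted, gold), 4))
```

A value near 1 means the predicted scores rank and scale sentence pairs much like the human annotations do; a value near 0 means there is no linear relationship.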

Reference

STSBenchmark board

Contacts

If you have any questions, please feel free to contact us: rgtjf1 AT 163 DOT com

Citation

If you find this repository helpful, please cite our paper.

@inproceedings{tian-etal-2017-ecnu,
    title = "{ECNU} at {S}em{E}val-2017 Task 1: Leverage Kernel-based Traditional {NLP} features and Neural Networks to Build a Universal Model for Multilingual and Cross-lingual Semantic Textual Similarity",
    author = "Tian, Junfeng  and
      Zhou, Zhiheng  and
      Lan, Man  and
      Wu, Yuanbin",
    booktitle = "Proceedings of the 11th International Workshop on Semantic Evaluation ({S}em{E}val-2017)",
    year = "2017",
    url = "https://aclanthology.org/S17-2028",
    pages = "191--197"
}

About

License: MIT License


Languages

Python 99.1%, Perl 0.7%, Shell 0.2%, AutoIt 0.0%