natural-language-processing synonym-discovery word-embedding

Synonym Recognition by Embedding

Relation Resource

The synonym dataset in datasets/synonyms/* is built on the Chinese synonym dataset 同义词词林 (Tongyici Cilin).
Pre-trained word embedding:

Get Started

Prepare the synonym dataset

You can use datasets/synonyms/* or a dataset you built yourself.

Download the embedding

Download the embedding file from the pre-trained word embedding resource listed above.
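
Before training, it can help to check that a downloaded embedding file loads and that related words sit close together. The sketch below is not part of this repository. It assumes the file uses the common word2vec-style text format (an optional "count dimension" header line, then one word per line followed by its space-separated vector values), and the helper names load_text_embeddings and cosine are hypothetical. The three sample vectors are toy values made up for illustration.

```python
import io
import math


def load_text_embeddings(handle):
    """Parse 'word v1 v2 ...' lines into a dict of word -> list of floats.

    Assumption: word2vec-style text format. A first line with exactly two
    fields is treated as the 'count dimension' header and skipped.
    """
    vectors = {}
    for line_no, line in enumerate(handle):
        parts = line.rstrip().split(" ")
        if line_no == 0 and len(parts) == 2:
            continue  # header line, not a vector
        if len(parts) < 2:
            continue  # blank or malformed line
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy in-memory "file"; replace with open("/path/to/embedding_file", encoding="utf-8").
sample = io.StringIO("3 2\n高兴 1.0 0.0\n快乐 0.8 0.6\n桌子 0.0 1.0\n")
emb = load_text_embeddings(sample)

print(cosine(emb["高兴"], emb["快乐"]))  # near-synonyms: higher similarity
print(cosine(emb["高兴"], emb["桌子"]))  # unrelated words: lower similarity
```

If the downloaded file is in a different format (for example binary word2vec), this loader will not apply as written.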

Dependencies

You can install dependencies by:

pip install -r requirements.txt

Run

python main.py --train datasets/synonyms/train \
               --dev datasets/synonyms/dev \
               --test datasets/synonyms/test \
               --embedding /path/to/embedding_file \
               --outputs /path/to/outputs_dir
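
For readers adapting the command to their own script, the flags above could be parsed roughly as follows. This is only an illustrative sketch using the standard argparse module. The actual main.py may declare these options differently (defaults, extra flags, or different help text), and the function name build_parser is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for the five flags shown in the command above."""
    parser = argparse.ArgumentParser(
        description="Synonym recognition with pre-trained word embeddings (sketch)."
    )
    # Each flag mirrors one option in the command line; all are assumed required.
    parser.add_argument("--train", required=True, help="path to the training split")
    parser.add_argument("--dev", required=True, help="path to the development split")
    parser.add_argument("--test", required=True, help="path to the test split")
    parser.add_argument("--embedding", required=True, help="path to the embedding file")
    parser.add_argument("--outputs", required=True, help="directory for output files")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args([
    "--train", "datasets/synonyms/train",
    "--dev", "datasets/synonyms/dev",
    "--test", "datasets/synonyms/test",
    "--embedding", "/path/to/embedding_file",
    "--outputs", "/path/to/outputs_dir",
])
print(args.train, args.embedding, args.outputs)
```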

License

Apache 2.0 (except for the datasets)

About

:bird: Fine-tune Pre-trained Word Embedding for Synonym Recognition
