embedding-benchmark

Word embedding benchmark project by the Shahid Beheshti University NLP Lab.

Please read our Wiki page for more information.

Folder structure:

data/corpus: Must be left empty; the code downloads the corpus into this folder from an external repository.
data/analogy: Contains the analogy dataset(s).
data/wordsim: Contains the word similarity dataset(s).
data/categories: Contains the categories dataset(s).
code: Contains the code that runs all evaluation-related tasks, plus utilities to download the corpus files.
scripts: Contains cleansing and crawling scripts and any other one-off tasks.
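
As a quick sanity check of the layout described above, the following minimal Python sketch verifies that the listed folders exist and that data/corpus holds no files yet. It is not part of the project: the function name check_layout, the EXPECTED_DIRS list, and the choice to ignore hidden placeholder files (such as .gitkeep) are assumptions made for illustration.

```python
from pathlib import Path

# Folder names are taken from the list above; everything else here is illustrative.
EXPECTED_DIRS = [
    "data/corpus",
    "data/analogy",
    "data/wordsim",
    "data/categories",
    "code",
    "scripts",
]


def check_layout(root):
    """Return (missing, corpus_is_empty) for a checkout rooted at `root`."""
    root = Path(root)
    # Collect every expected folder that is absent under the root.
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    corpus = root / "data" / "corpus"
    # An absent corpus folder counts as empty; hidden placeholder files are ignored (assumption).
    corpus_is_empty = not corpus.is_dir() or not any(
        not p.name.startswith(".") for p in corpus.iterdir()
    )
    return missing, corpus_is_empty


if __name__ == "__main__":
    missing, corpus_is_empty = check_layout(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    if not corpus_is_empty:
        print("data/corpus already contains files.")
    if not missing and corpus_is_empty:
        print("Layout matches the README.")
```

Running the script from the repository root prints any missing folders and flags a data/corpus folder that already contains files.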

GNU General Public License v3.0
