RezaSi / embedding-benchmark

Word Embedding benchmark project By Shahid Beheshti University NLP Lab

Repository from GitHub: https://github.com/RezaSi/embedding-benchmark

embedding-benchmark

Please read our wiki page for more information.

Folder structure:

  • data/corpus: Must be empty, because the code downloads the corpus into this folder from an external repository.
  • data/analogy: Contains the analogy dataset(s).
  • data/wordsim: Contains the word similarity dataset(s).
  • data/categories: Contains the categories dataset(s).
  • code: Contains the code used to run all evaluation-related tasks, plus utilities to download the corpus files.
  • scripts: Contains cleansing, crawling, and any other one-off scripts.
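
The layout above can be checked before running anything. The following is a minimal sketch, not part of the repository: the helper name `check_layout` and the choice to ignore hidden files such as `.gitkeep` in `data/corpus` are assumptions made for illustration. Only the folder names come from the list above.

```python
from pathlib import Path

# Folder names taken from the README list; everything else here is hypothetical.
EXPECTED_DIRS = [
    "data/corpus",
    "data/analogy",
    "data/wordsim",
    "data/categories",
    "code",
    "scripts",
]


def check_layout(root):
    """Return a list of human-readable problems found under `root`."""
    root = Path(root)
    problems = []

    # Every folder named in the README should exist.
    for rel in EXPECTED_DIRS:
        if not (root / rel).is_dir():
            problems.append(f"missing folder: {rel}")

    # data/corpus is expected to start out empty.
    # Assumption: hidden placeholder files (names starting with ".") do not count.
    corpus = root / "data" / "corpus"
    if corpus.is_dir():
        visible = [p for p in corpus.iterdir() if not p.name.startswith(".")]
        if visible:
            problems.append("data/corpus is not empty")

    return problems


if __name__ == "__main__":
    # Run from the repository root; prints one line per problem, if any.
    for problem in check_layout("."):
        print(problem)
```

An empty result means the checkout matches the structure described above. Whether a previously downloaded corpus should really count as a problem depends on how the download code behaves, which this README does not specify.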

License: GNU General Public License v3.0

Language: Python 100.0%