dukn / data_volam

Create a keyword dictionary yourself.


data_volam

Info

Uses VNTok (vn.hus.nlp.tokenizer-4.1.1-bin) to tokenize the paragraphs.

Uses TF-IDF to extract keywords and build a tinyDict for the game "Vo lam truyen ky web".

  • Run ./vn.hus.nlp.tokenizer-4.1.1-bin/run_by_duc.py to create the new data (Data2).

  • Run hardcode.py to create the data file.

  • Run getDict.py to create the file tinyDict.txt.

  • Enjoy!
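
The keyword step can be illustrated with a minimal TF-IDF sketch. This is not the code in getDict.py; the function name `tf_idf_keywords`, the sample paragraphs, and the assumption that tokenized text is whitespace-separated with multi-syllable words joined by underscores are all hypothetical, chosen only to show the idea using the Python standard library.

```python
import math
from collections import Counter


def tf_idf_keywords(documents, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document.

    documents: list of token lists (one list per paragraph).
    """
    n_docs = len(documents)

    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    results = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        # TF = relative frequency in this document; IDF = log(N / df).
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        results.append([term for term, _ in ranked[:top_k]])
    return results


# Hypothetical tokenized paragraphs (assumed format: tokens separated by
# spaces, multi-syllable words joined by underscores).
paragraphs = [
    "nhân_vật học kỹ_năng mới trong game",
    "nhân_vật nhận nhiệm_vụ trong game",
    "người_chơi mua trang_bị trong game",
]
documents = [p.split() for p in paragraphs]
keywords = tf_idf_keywords(documents, top_k=2)

# Merge the per-paragraph keywords into one small dictionary.
tiny_dict = sorted({term for doc in keywords for term in doc})
```

Terms that occur in every paragraph, such as "trong" and "game" here, get an IDF of zero and never reach the dictionary, which is why TF-IDF suits picking out domain-specific keywords.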

Tools

  • Python

  • pandas

  • TF-IDF



Languages

Python 88.5%, Shell 7.6%, Batchfile 3.9%