xsthunder / emb_reduce

Don't use this; use gensim instead. Note that emb_reduce may cache results, while gensim may not.

Requirements

fire, tqdm

Usage

Retrieve the embeddings for a list of words from a large word embedding file:

python ./reduce_emb/reduce_emb.py --fword wordlist --femb word_emb_file --fout reduced_emb_file

wordlist

A relatively small text file.

One word per line.

word_emb_file

A very large text file.

Optional first line: wordnum:num emb_dim:num. See ./tests/Tencent_AILab_ChineseEmbedding_sample for an example. GloVe files don't have this line.

Each remaining line: word:num dim0 dim1 [<dim2, >]

The file is read until the first empty line.

reduced_emb_file

The text file to write the output to. The format is like GloVe, without the first (header) line.
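The three files described above can be tied together with a minimal sketch of the filtering step. This is not the code in reduce_emb.py: the function name `reduce_embeddings`, the header check (a first line with exactly two integer fields), and the sample values are all assumptions made for illustration.

```python
import io


def reduce_embeddings(words, emb_lines, out):
    """Copy the embedding lines whose word is in `words` to `out`.

    Hypothetical helper for illustration; not the code in reduce_emb.py.
    Returns the number of lines written.
    """
    wanted = set(words)
    written = 0
    for i, line in enumerate(emb_lines):
        line = line.rstrip("\n")
        if not line.strip():
            break  # the embedding file is read until the first empty line
        parts = line.split(" ")
        # Assumption: a first line with exactly two integer fields is the
        # optional "wordnum emb_dim" header, which is not copied to the output.
        if i == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue
        if parts[0] in wanted:
            out.write(line + "\n")
            written += 1
    return written


# Tiny in-memory example with made-up values.
wordlist = ["cat", "fish"]
emb_file = io.StringIO(
    "3 2\ncat 0.1 0.2\ndog 0.3 0.4\nfish 0.5 0.6\n\nbird 0.7 0.8\n"
)
reduced = io.StringIO()
count = reduce_embeddings(wordlist, emb_file, reduced)
print(reduced.getvalue())
```

Because the embedding file is consumed line by line, only the word list has to fit in memory, which matters when word_emb_file is very large.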

Planned: release as a pip package.

ref

python-snippet/upload_to_pip.md at master · xsthunder/python-snippet

xsthunder/xs_lib

About

License:MIT License
