hyperdrive / word2vec

Automatically exported from code.google.com/p/word2vec

Patch for /trunk/word2vec.c

GoogleCodeExporter opened this issue

Patch for a bug that discards the last word of the vocabulary after sorting
when the input file contains no newline character.

If the input file contains no newline, vocab[0].cn == 0. The sort skips
vocab[0], but the for loop that follows does not: it sees a count below
min_count, decrements vocab_size, and frees the memory of the last word.
However, the loop still computes the hash for that last word if its count is
not below min_count, even though the word has already been freed. Also, the
realloc needs to allocate only vocab_size * sizeof(struct vocab_word).
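
The attached patch is not reproduced on this page, so the following is only a minimal sketch of the kind of fix described above, not the actual patch. It assumes a simplified `struct vocab_word` with just a count and a word pointer, leaves out the hash recomputation, and uses the hypothetical helpers `prune_vocab` and `make_vocab`, which do not exist in word2vec.c. The points it illustrates are that the pruning loop never tests slot 0, and that the final realloc is sized to exactly the number of remaining entries.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the struct in word2vec.c (assumption: only
   the two fields used by this sketch are kept). */
struct vocab_word {
  long long cn;  /* occurrence count */
  char *word;    /* heap-allocated string */
};

/* Orders entries by descending count. */
static int vocab_compare(const void *a, const void *b) {
  long long ca = ((const struct vocab_word *)a)->cn;
  long long cb = ((const struct vocab_word *)b)->cn;
  return (ca < cb) - (ca > cb);
}

/* Hypothetical helper: sorts vocab[1..size-1], drops words whose count is
   below min_count, shrinks the array, and returns the new size.
   Slot 0 (the </s> token) is kept even when its count is 0. */
long long prune_vocab(struct vocab_word **vocab_ptr, long long size,
                      long long min_count) {
  struct vocab_word *vocab = *vocab_ptr;
  struct vocab_word *shrunk;
  long long a, kept = size;
  assert(size >= 1);
  qsort(&vocab[1], (size_t)(size - 1), sizeof(struct vocab_word), vocab_compare);
  /* Start at 1 so that vocab[0].cn == 0 never triggers a removal. */
  for (a = 1; a < size; a++) {
    if (vocab[a].cn < min_count) {
      /* After the descending sort the rare words sit at the tail,
         so the entry being freed is always one that is being dropped. */
      kept--;
      free(vocab[kept].word);
    }
  }
  /* Allocate exactly `kept` entries, as the issue suggests. */
  shrunk = realloc(vocab, (size_t)kept * sizeof(struct vocab_word));
  if (shrunk != NULL) *vocab_ptr = shrunk;
  return kept;
}

/* Hypothetical helper for trying the sketch: builds a vocabulary from
   parallel arrays of words and counts. */
struct vocab_word *make_vocab(const char *const *words,
                              const long long *counts, long long n) {
  struct vocab_word *v = malloc((size_t)n * sizeof(struct vocab_word));
  long long i;
  if (v == NULL) return NULL;
  for (i = 0; i < n; i++) {
    v[i].cn = counts[i];
    v[i].word = malloc(strlen(words[i]) + 1);
    if (v[i].word != NULL) strcpy(v[i].word, words[i]);
  }
  return v;
}
```

For example, with the words `</s>`, `rare`, `the`, `cat`, the counts 0, 1, 10, 5, and a min_count of 5, the sketch keeps `</s>`, `the`, and `cat` and frees only `rare`. If the loop started at 0 instead, the zero count of `</s>` would cause one extra tail entry to be freed, which is the behaviour the issue reports.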

Original issue reported on code.google.com by FerroMrkva on 5 Feb 2014 at 11:24
