AlexPoint / OpenNlp

Open source NLP tools (sentence splitter, tokenizer, chunker, coref, NER, parse trees, etc.) in C#

Calculating SynsetOffset or reading data file has problem in SharpWordNet

zgrkpnr opened this issue

var engine = new DataFileEngine(@"C:\Users\Ozgur_\Source\Repos\OpenNlp\Resources\WordNet\dict");
var synsets = engine.GetSynsets("apple");

When these two lines of code are executed, DataFileEngine.cs line 283,
var nt = int.Parse(tokenizer.NextToken());
tries to parse "n" as an integer, because the token that follows "35" in "35 n 0000 | a hamburger with melted cheese on it" is "n".

I believe the line dataFile.BaseStream.Seek(synsetOffset, SeekOrigin.Begin); at DataFileEngine.cs line 275 is seeking to the wrong position for the word. Since synsetOffset is a byte offset, not a line offset, it may be calculated incorrectly.
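
One way to test the byte-offset theory is sketched below. In the WordNet data files, each synset record begins with its own byte offset written as an 8-digit zero-padded number, so after seeking to an offset, the line read from that position should start with those same 8 digits. The text "35 n 0000 | ..." does not, which suggests the read started partway through a record. The names OffsetCheck and LineStartsAtOffset are hypothetical helpers for illustration and are not part of SharpWordNet. The toy records stand in for a real data file, and the CRLF case is only one possible cause (for example, line endings converted on checkout), not a confirmed diagnosis.

```csharp
using System;
using System.IO;
using System.Text;

static class OffsetCheck
{
    // Hypothetical helper: returns true if the line found at the given byte
    // offset begins with that offset written as 8 zero-padded digits.
    public static bool LineStartsAtOffset(Stream data, long synsetOffset)
    {
        data.Seek(synsetOffset, SeekOrigin.Begin);
        // Use a fresh reader after seeking so no stale buffered text is read.
        using (var reader = new StreamReader(data, Encoding.ASCII, false, 1024, true))
        {
            string line = reader.ReadLine();
            return line != null
                && line.StartsWith(synsetOffset.ToString("D8"), StringComparison.Ordinal);
        }
    }

    static void Main()
    {
        // Toy data: with LF endings the first record is 13 bytes long,
        // so the second record really does start at byte 13.
        string lf = "00000000 n a\n00000013 n b\n";
        // Same records with CRLF endings: every line grows by one byte,
        // so the stored offset 13 no longer points at a record start.
        string crlf = lf.Replace("\n", "\r\n");

        using (var lfStream = new MemoryStream(Encoding.ASCII.GetBytes(lf)))
        using (var crlfStream = new MemoryStream(Encoding.ASCII.GetBytes(crlf)))
        {
            Console.WriteLine(LineStartsAtOffset(lfStream, 13));   // True
            Console.WriteLine(LineStartsAtOffset(crlfStream, 13)); // False
        }
    }
}
```

Running the same check against the real data file with an offset taken from the matching index file would show whether the stored offsets still line up with record starts. If they do not, the file contents on disk differ from what the offsets were computed for, and the seek call itself is probably not at fault.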