sergey-tihon / OpenNLP.NET

OpenNLP for .NET

Unable to convert models from .bin to .nbin and vice versa

vineet-singh26 opened this issue

Hi folks,
I need some help or guidance converting the models from https://opennlp.sourceforge.net/models-1.5/ to the .nbin format.
I am stuck on this. I tried the model converter shared at https://www.codeproject.com/articles/12109/statistical-parsing-of-english-sentences?display=print&fid=229482&df=90&mpp=25&sort=Position&view=Normal&spc=Relaxed&fr=101&prof=True but had no success.

Any help here would be really appreciated. Thank you!

Error: Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection JavaBinaryGisModelReader
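
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: to my knowledge, the OpenNLP 1.5 .bin models are zip packages (a manifest plus the serialized model inside), whereas a reader such as JavaBinaryGisModelReader presumably expects a raw binary model stream. The Python sketch below only inspects the container format of a file and does not convert anything. The function name `describe_model_file` is hypothetical, and the file name in the usage comment is just an example.

```python
import zipfile


def describe_model_file(path):
    """Report whether a model file looks like a zip package or a raw stream.

    Returns ("zip", [entry names]) if the file is a zip archive,
    otherwise ("raw", []). This is a format check only, not a converter.
    """
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # List the entries so you can see what the package contains.
            return "zip", archive.namelist()
    return "raw", []


# Example usage (hypothetical local file name):
# print(describe_model_file("en-pos-maxent.bin"))
```

If the file turns out to be a zip package, a converter that reads it as a raw stream would likely fail while parsing the header, which might explain an out-of-bounds error like the one above.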

What is .nbin, and why did you decide to convert?

If you use this project (OpenNLP.NET), it is perfectly fine to use the original models without any conversion. You can check the code samples in the tests: https://github.com/sergey-tihon/OpenNLP.NET/blob/master/tests/OpenNLP.NET.Tests/Tests.cs#L117-L128

I was looking for a POS tagging model when I came across your project. Amazing work you've done here. But the project is dependent on IKVM which we can't introduce into our codebase because of security reasons. I'd have loved to use your project, but unfortunately I had to look for alternatives, and I came across https://github.com/AlexPoint/OpenNlp
It uses .nbin models.

Can you please help me here, @sergey-tihon?

I see that you have already created an issue in the other repo: AlexPoint/OpenNlp#36

Sorry, I know nothing about the .nbin format.

Closing this issue because it is not related to this repo.

You can take a look at the models here: https://github.com/AlexPoint/OpenNlp/tree/master/Resources/Models

Then maybe check out the repo, try to compile it, and take a look at the trainer code. Somewhere in the repo there should be code that creates *.nbin files.

"But the project is dependent on IKVM which we can't introduce into our codebase because of security reasons."

Do you know of any security vulnerabilities in IKVM? I just wonder what they are about.

Hi @sergey-tihon,
Thanks for the quick reply!

We got this remark from our security team:
"IKVM is a java virtual machine implemented in .NET. Outside of the license risk that seems like a pretty broad attack surface."

We are looking for an MIT-licensed solution. I admire your work on Stanford.NLP, but that too uses a GPL license with IKVM. That's the issue. Thanks!
If you have any suggestions, they would be appreciated.

The license risk concern is pretty unfounded, unless there's a general concern with all Java, as all JVMs are under the same license at this point. And it is silly if you're using, say, Linux, since the OpenJDK license is less restrictive than that of Linux itself.

It being a JVM, and a JVM being big, is a potential concern, I suppose. More of "the unknown." It's weird, though: a bunch of unused classes that merely exist and that you aren't accessing shouldn't be any more of a concern than the thousands of .NET runtime classes that exist and that you don't use.