mayabot / mynlp

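A production-grade, high-performance, modular, and extensible Chinese NLP toolkit. (Chinese word segmentation, averaged perceptron, fastText, pinyin, new word discovery, segmentation correction, BM25, person name recognition, named entity recognition, custom dictionaries)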

Home Page: https://mynlp.mayabot.com/

Cannot load pre-trained word vectors

liefra opened this issue

I tried to load pre-trained word vectors, but I received the following error: Caused by: java.lang.IllegalArgumentException: Unknown LossName enum second :774911284

I load the model with:
val model = FastText.loadModelFromSingleFile(File("/Users/liefra/crawl-300d-2M.vec"))

Is this an issue, or just me doing it wrong?

commented

loadModelFromSingleFile expects the single-file Java model format used by fasttext4j.
Instead, you can use FastText.loadCppModel to load a .bin model file produced by fastText, for example:
https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.en.300.bin.gz
(Decompress it before loading.)
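
The odd number in the error message is consistent with a text file being parsed as a binary model. As a rough illustration, assuming the loader reads a 4-byte big-endian integer (which is how Java's DataInputStream behaves, but is not confirmed here for mynlp), the value 774911284 decodes to printable ASCII characters, the kind of bytes found inside a .vec text file. The helper name intAsAscii is hypothetical and not part of mynlp.

```kotlin
import java.nio.ByteBuffer

/**
 * Renders the four big-endian bytes of an Int as ASCII text.
 * Bytes outside the printable ASCII range are shown as '?'.
 * Hypothetical helper for illustration only.
 */
fun intAsAscii(value: Int): String =
    ByteBuffer.allocate(4).putInt(value).array() // ByteBuffer is big-endian by default
        .joinToString("") { b ->
            val code = b.toInt() and 0xFF
            if (code in 0x20..0x7E) code.toChar().toString() else "?"
        }

fun main() {
    // The value reported in "Unknown LossName enum second :774911284"
    println(intAsAscii(774911284)) // prints ".054"
}
```

The bytes 0x2E 0x30 0x35 0x34 spell ".054", which looks like a fragment of a decimal number from a text vector file rather than a valid enum index.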

Thank you very much for your super fast reply :)

Yes, it works when I load it with the bin format:
val model = FastText.loadCppModel(File("/Users/liefra/cc.en.300.bin"))
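
Before calling a loader, it can help to check which kind of file is on disk. The sketch below relies on one fact from the fastText C++ source: a .bin model starts with the 32-bit magic number 793712314, written little-endian on common platforms. The helper looksLikeFastTextBin and the sample file contents are hypothetical and are not part of mynlp.

```kotlin
import java.io.DataInputStream
import java.io.EOFException
import java.io.File
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Magic number at the start of fastText .bin model files (int32).
const val FASTTEXT_MAGIC = 793712314

/** Returns true if the file begins with the fastText binary magic number. */
fun looksLikeFastTextBin(file: File): Boolean {
    val header = ByteArray(4)
    try {
        DataInputStream(file.inputStream()).use { it.readFully(header) }
    } catch (e: EOFException) {
        return false // file is shorter than 4 bytes
    }
    val magic = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN).int
    return magic == FASTTEXT_MAGIC
}

fun main() {
    // Made-up stand-ins for a real .bin model and a real .vec text file.
    val bin = File.createTempFile("model", ".bin")
    bin.writeBytes(
        ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(FASTTEXT_MAGIC).array()
    )
    val vec = File.createTempFile("vectors", ".vec")
    vec.writeText("2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n")

    println(looksLikeFastTextBin(bin)) // true
    println(looksLikeFastTextBin(vec)) // false

    bin.delete()
    vec.delete()
}
```

If the check returns true, the file is a candidate for FastText.loadCppModel as used above; a .vec text file would fail the check and needs a different loading path.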