hankcs / HanLP

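Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing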

Home Page: https://hanlp.hankcs.com/

Java API, com.hankcs.hanlp.model.perceptron.PerceptronLexicalAnalyzer is not serializable.

cumtfc opened this issue

Describe the bug
In the Java API, when a com.hankcs.hanlp.model.perceptron.PerceptronLexicalAnalyzer is used as a member variable of an object that Spark serializes, Spark throws "Caused by: java.io.NotSerializableException: com.hankcs.hanlp.model.perceptron.PerceptronLexicalAnalyzer".
This happens because Spark tries to serialize the PerceptronLexicalAnalyzer, but the class does not implement the Serializable interface.
Code to reproduce the issue

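import com.hankcs.hanlp.model.perceptron.PerceptronLexicalAnalyzer;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Test must implement Serializable; otherwise writeObject fails on Test itself
// rather than on the analyzer field.
public class Test implements Serializable {
    private final PerceptronLexicalAnalyzer analyzer;

    public Test() throws IOException {
        analyzer = new PerceptronLexicalAnalyzer("./cws.bin",
                "./pos.bin",
                "./ner.bin");
    }

    public static void main(String[] args) throws IOException {
        Test a = new Test();
        // try-with-resources closes both streams even when writeObject throws
        try (FileOutputStream fileOut = new FileOutputStream("./test.ser");
             ObjectOutputStream out = new ObjectOutputStream(fileOut)) {
            // Fails here with NotSerializableException for the analyzer field
            out.writeObject(a);
            System.out.println("Serialized data is saved in ./test.ser");
        } catch (IOException i) {
            i.printStackTrace();
        }
    }
}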

Describe the current behavior
Caused by: java.io.NotSerializableException: com.hankcs.hanlp.model.perceptron.PerceptronLexicalAnalyzer.

Expected behavior
No exception is thrown.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 11
  • HanLP version: portable-1.8.3
  • I've completed this form and searched the web for solutions.

Hi, PerceptronLexicalAnalyzer is not meant to be serializable. You can serialize each model instead, or implement a serializable subclass and handle serialization of the models yourself.
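
For readers hitting the same exception in Spark, one common alternative to the two options above is to keep the analyzer out of the serialized form entirely: mark the field transient, serialize only the model paths, and rebuild the analyzer lazily after deserialization. The sketch below is a minimal illustration of that pattern, not something HanLP provides. HeavyAnalyzer is a hypothetical stand-in for PerceptronLexicalAnalyzer so that the example runs without HanLP or model files, and AnalyzerHolder and HolderDemo are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a heavy, non-serializable object such as
// PerceptronLexicalAnalyzer. It only records the paths it was built from.
class HeavyAnalyzer {
    final String description;

    HeavyAnalyzer(String cwsPath, String posPath, String nerPath) {
        this.description = cwsPath + "|" + posPath + "|" + nerPath;
    }
}

// Serializable holder: only the three path strings are written to the stream.
class AnalyzerHolder implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String cwsPath;
    private final String posPath;
    private final String nerPath;

    // transient fields are skipped by Java serialization,
    // so this is null again after deserialization.
    private transient HeavyAnalyzer analyzer;

    AnalyzerHolder(String cwsPath, String posPath, String nerPath) {
        this.cwsPath = cwsPath;
        this.posPath = posPath;
        this.nerPath = nerPath;
    }

    // Builds the analyzer on first use in the current JVM.
    synchronized HeavyAnalyzer get() {
        if (analyzer == null) {
            analyzer = new HeavyAnalyzer(cwsPath, posPath, nerPath);
        }
        return analyzer;
    }

    synchronized boolean isLoaded() {
        return analyzer != null;
    }
}

class HolderDemo {
    // Serializes the holder to memory and reads it back, as Spark would
    // do when shipping a closure to an executor.
    static AnalyzerHolder roundTrip(AnalyzerHolder holder) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(holder);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (AnalyzerHolder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        AnalyzerHolder holder = new AnalyzerHolder("./cws.bin", "./pos.bin", "./ner.bin");
        holder.get(); // load before serializing; the loaded object is still not written
        AnalyzerHolder copy = roundTrip(holder);
        System.out.println(copy.isLoaded());        // false
        System.out.println(copy.get().description); // ./cws.bin|./pos.bin|./ner.bin
    }
}
```

With HanLP itself, get() would presumably call new PerceptronLexicalAnalyzer(cwsPath, posPath, nerPath) as in the reproduction code and would need to handle the IOException that constructor declares. This also assumes the model files are readable at the same paths on every Spark executor.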