Person-name recognition is inaccurate for the surname "张" (Zhang)
watsonwuh opened this issue · comments
watsonwuh commented
Describe the bug
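After extracting some phrases, I found that the surname 张 (Zhang) is especially likely to go unrecognized as a person name.
Concrete examples follow; each line shows the input and then the logged segmentation (分词 means "segmentation"):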
张先生对接城西 分词: [张先生/nz, 对接/v, 城西/d]
张先生开封 分词: [张先生/nz, 开封/ns]
张阿姨 分词: [张/q, 阿姨/n]
Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
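// Segment the input string (the construction of `segment` is not shown in this report).
List<Term> list = segment.seg(str);
// Log the raw segmentation result before stop-word filtering.
log.info("##{} 分词: {}", str, ArrayUtils.toString(list));
// Filter stop words out of the term list.
CoreStopWordDictionary.apply(list);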
Describe the current behavior
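In many cases the surname is not recognized as a person name (see the examples above: 张先生 is tagged nz, and 张 in 张阿姨 is tagged q).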
Expected behavior
For 张先生, either 张 should be tagged nr, or the whole phrase 张先生 should be tagged nr.
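The expectation above can be turned into a small check over the logged output. This is only a sketch: it does not call HanLP at all, it parses the `[word/tag, ...]` strings copied from this report, and the class and method names (`SurnameTagCheck`, `parseTerms`, `surnameTaggedAsPerson`) are hypothetical helpers introduced here for illustration. It assumes an exact match on the tag `nr` is what counts as "recognized".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HanLP: checks logged segmentation strings.
class SurnameTagCheck {

    /** Parses a logged term list such as "[张先生/nz, 对接/v]" into {word, tag} pairs. */
    static List<String[]> parseTerms(String logged) {
        List<String[]> terms = new ArrayList<>();
        String body = logged.trim();
        // Strip the surrounding brackets if present.
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        for (String item : body.split(",\\s*")) {
            // The tag follows the last slash; items without a slash are skipped.
            int slash = item.lastIndexOf('/');
            if (slash > 0) {
                terms.add(new String[] {item.substring(0, slash), item.substring(slash + 1)});
            }
        }
        return terms;
    }

    /** True if some term starting with the surname carries exactly the tag "nr". */
    static boolean surnameTaggedAsPerson(String logged, String surname) {
        for (String[] term : parseTerms(logged)) {
            if (term[0].startsWith(surname) && term[1].equals("nr")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The three outputs reported in this issue.
        String[] reported = {
            "[张先生/nz, 对接/v, 城西/d]",
            "[张先生/nz, 开封/ns]",
            "[张/q, 阿姨/n]"
        };
        for (String line : reported) {
            // Prints false for each line, since no term is tagged nr.
            System.out.println(line + " -> " + surnameTaggedAsPerson(line, "张"));
        }
    }
}
```

Either expected outcome, `张/nr` or `张先生/nr`, would make the check return true, because it only requires that a term beginning with the surname carries the `nr` tag.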
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS 11.3.1
- Java version: 8
- HanLP version: portable-1.8.3
Other info / logs
None
- I've completed this form and searched the web for solutions.
hankcs commented
1.x is in maintenance mode and receives no updates except for critical bugs. Please migrate to 2.x: https://hanlp.hankcs.com/demos/pos.html?text=张先生对接城西张阿姨