duydo / elasticsearch-analysis-vietnamese

Vietnamese Analysis Plugin for Elasticsearch

Error loading library libcoccoc_tokenizer_jni

newgate1999 opened this issue · comments

Hello,
I built this plugin and installed it into Elasticsearch, but Elasticsearch fails to start because of a plugin error. When I run the plugin tests, the failure points to the coccoc library. I have already installed coccoc_tokenizer and successfully tested word segmentation from the command line.
Here is the error from the Elasticsearch test run:

java.lang.UnsatisfiedLinkError: no libcoccoc_tokenizer_jni in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]

at __randomizedtesting.SeedInfo.seed([A62D7052A2063474:DD7C0E493E7BBEF2]:0)
at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2670)
at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
at java.base/java.lang.System.loadLibrary(System.java:1873)
at com.coccoc.Tokenizer.<clinit>(Tokenizer.java:15)
at org.apache.lucene.analysis.vi.VietnameseTokenizerImpl.lambda$new$0(VietnameseTokenizerImpl.java:54)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at org.apache.lucene.analysis.vi.VietnameseTokenizerImpl.<init>(VietnameseTokenizerImpl.java:53)
at org.apache.lucene.analysis.vi.VietnameseTokenizer.<init>(VietnameseTokenizer.java:45)
at org.apache.lucene.analysis.vi.VietnameseAnalyzer.createComponents(VietnameseAnalyzer.java:88)
at org.apache.lucene.analysis.AnalyzerWrapper.createComponents(AnalyzerWrapper.java:136)
at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:199)
at org.elasticsearch.index.analysis.AnalysisRegistry.checkVersions(AnalysisRegistry.java:637)
at org.elasticsearch.index.analysis.AnalysisRegistry.produceAnalyzer(AnalysisRegistry.java:601)
at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:520)
at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:207)
at org.elasticsearch.index.analysis.AnalysisTestsHelper.createTestAnalysisFromSettings(AnalysisTestsHelper.java:56)
at org.elasticsearch.index.analysis.AnalysisTestsHelper.createTestAnalysisFromSettings(AnalysisTestsHelper.java:40)
at org.elasticsearch.index.analysis.VietnameseAnalysisTests.createTestAnalysis(VietnameseAnalysisTests.java:119)
at org.elasticsearch.index.analysis.VietnameseAnalysisTests.testVietnameseAnalysis(VietnameseAnalysisTests.java:28)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.base/java.lang.Thread.run(Thread.java:829)

NOTE: leaving temporary files on disk at: /tmp/org.elasticsearch.index.analysis.VietnameseAnalysisTests_A62D7052A2063474-001
NOTE: test params are: codec=Asserting(Lucene87): {}, docValues:{}, maxPointsInLeafNode=747, maxMBSortInHeap=5.336157517600498, sim=Asserting(RandomSimilarity(queryNorm=true): {}), locale=bez, timezone=Asia/Rangoon
NOTE: Linux 5.8.0-53-generic amd64/Amazon.com Inc. 11.0.11 (64-bit)/cpus=8,threads=1,free=150045472,total=195035136
NOTE: All tests run in this JVM: [VietnameseAnalysisTests]

I hope to hear back from you soon. Thank you.
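
A quick way to confirm the diagnosis is to check whether the native library actually exists in any of the directories listed in the java.library.path from the error; a minimal sketch, assuming the Linux paths shown above:

for dir in /usr/java/packages/lib /usr/lib64 /lib64 /lib /usr/lib; do
  # Look for the JNI library the plugin tries to load via System.loadLibrary
  [ -f "$dir/libcoccoc_tokenizer_jni.so" ] && echo "found: $dir/libcoccoc_tokenizer_jni.so"
done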

@newgate1999 Copy libcoccoc_tokenizer_jni.so into one of the directories /usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib, then restart ES.
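
A concrete sketch of that suggestion (the source path is an assumption; use wherever your coccoc-tokenizer build left the library, and restart Elasticsearch however your system manages it):

# Copy the JNI library into a directory already on java.library.path,
# then restart Elasticsearch so the plugin can load it.
sudo cp /path/to/coccoc-tokenizer/build/libcoccoc_tokenizer_jni.so /usr/lib/
sudo systemctl restart elasticsearch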

Thank you, I've managed to install the library.

Could you tell me where the libcoccoc_tokenizer_jni library is located? I can't find it.

@newgate1999 Copy libcoccoc_tokenizer_jni.so into one of the directories /usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib, then restart ES.

Could you tell me where the libcoccoc_tokenizer_jni library is? I've searched everywhere but can't find it.

It's in the coccoc-tokenizer/build directory after you build it.
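
For reference, a minimal sketch of a build that produces the JNI library, assuming the standard CMake flow of coccoc-tokenizer with the Java bindings enabled (the -DBUILD_JAVA=1 option is an assumption about how the project's build is configured for this):

git clone https://github.com/coccoc/coccoc-tokenizer.git
cd coccoc-tokenizer
mkdir build && cd build
cmake -DBUILD_JAVA=1 ..          # enable the Java/JNI bindings
make
ls libcoccoc_tokenizer_jni.*     # .so on Linux, .dylib on macOS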

Hi @duydo, I followed the build instructions but it reports that libcoccoc_tokenizer_jni.dylib cannot be found. Which step did I get wrong?

admin@Admins-MacBook-Pro build % make install
[ 12%] Generating coccoc-tokenizer.jar
../java/build_java.sh: line 24: /Library/Internet: No such file or directory
clang: error: no such file or directory: 'Plug-Ins/JavaAppletPlugin.plugin/Contents/Home/include'
clang: error: no such file or directory: 'Plug-Ins/JavaAppletPlugin.plugin/Contents/Home/include/darwin'
[ 12%] Built target compile_java
[ 25%] Building CXX object CMakeFiles/dict_compiler.dir/utils/dict_compiler.cpp.o
[ 37%] Linking CXX executable dict_compiler
[ 37%] Built target dict_compiler
[ 50%] Generating multiterm_trie.dump, syllable_trie.dump, nontone_pair_freq_map.dump
[ 50%] Built target compile_dict
[ 62%] Building CXX object CMakeFiles/tokenizer.dir/utils/tokenizer.cpp.o
[ 75%] Linking CXX executable tokenizer
[ 75%] Built target tokenizer
[ 87%] Building CXX object CMakeFiles/vn_lang_tool.dir/utils/vn_lang_tool.cpp.o
[100%] Linking CXX executable vn_lang_tool
[100%] Built target vn_lang_tool
Install the project...
-- Install configuration: "RELEASE"
-- Installing: /usr/local/bin/tokenizer
-- Installing: /usr/local/bin/vn_lang_tool
-- Installing: /usr/local/bin/dict_compiler
-- Up-to-date: /usr/local/include/tokenizer
-- Up-to-date: /usr/local/include/tokenizer/auxiliary
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/utf8
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/utf8/core.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/utf8/unchecked.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/utf8/checked.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/tsl
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/tsl/robin_set.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/tsl/robin_growth_policy.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/tsl/robin_map.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/tsl/robin_hash.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp/spp_memory.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp/spp_utils.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp/spp_smartptr.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp/spp.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp/spp_stdint.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp/spp_timer.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp/spp_config.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp/spp_traits.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp/spp_dlalloc.h
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/utf8.h
-- Up-to-date: /usr/local/include/tokenizer
-- Up-to-date: /usr/local/include/tokenizer/token.hpp
-- Up-to-date: /usr/local/include/tokenizer/helper.hpp
-- Up-to-date: /usr/local/include/tokenizer/tokenizer.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/vn_lang_tool.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/file_serializer.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/utf8
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/tsl
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/buffered_reader.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/hash_trie_node.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/multiterm_hash_trie_node.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/hash_trie.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/syllable_hash_trie.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/string_set_trie.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/syllable_hash_trie_node.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/multiterm_da_trie.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/da_trie.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/syllable_da_trie.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/syllable_da_trie_node.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/multiterm_hash_trie.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/multiterm_da_trie_node.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie/da_trie_node.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie.hpp
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp
-- Up-to-date: /usr/local/include/tokenizer
-- Up-to-date: /usr/local/include/tokenizer/auxiliary
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/utf8
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/tsl
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/trie
-- Up-to-date: /usr/local/include/tokenizer/auxiliary/sparsepp
-- Installing: /usr/local/include/tokenizer/config.h
-- Up-to-date: /usr/local/share/tokenizer/dicts_text
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/tokenizer
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/tokenizer/vndic_multiterm
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/tokenizer/nontone_pair_freq
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/tokenizer/special_token.weak
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/tokenizer/Freq2NontoneUniFile
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/tokenizer/acronyms
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/tokenizer/keyword.freq
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/tokenizer/special_token.strong
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/tokenizer/chemical_comp
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/vn_lang_tool
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/vn_lang_tool/alphabetic
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/vn_lang_tool/d_and_gi.txt
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/vn_lang_tool/numeric
-- Up-to-date: /usr/local/share/tokenizer/dicts_text/vn_lang_tool/i_and_y.txt
-- Up-to-date: /usr/local/share/tokenizer/dicts
-- Up-to-date: /usr/local/share/tokenizer/dicts/alphabetic
-- Up-to-date: /usr/local/share/tokenizer/dicts/d_and_gi.txt
-- Up-to-date: /usr/local/share/tokenizer/dicts/numeric
-- Up-to-date: /usr/local/share/tokenizer/dicts/i_and_y.txt
-- Installing: /usr/local/share/tokenizer/dicts/multiterm_trie.dump
-- Installing: /usr/local/share/tokenizer/dicts/syllable_trie.dump
-- Installing: /usr/local/share/tokenizer/dicts/nontone_pair_freq_map.dump
-- Installing: /usr/local/share/java/coccoc-tokenizer.jar
CMake Error at cmake_install.cmake:111 (file):
file INSTALL cannot find
"/Users/admin/Desktop/ElasticSearch/coccoc-tokenizer/build/libcoccoc_tokenizer_jni.dylib":
No such file or directory.

make: *** [install] Error 1

@duydo, I looked in C:\Dev Programs\CTokenizer\coccoc-tokenizer\build but there is no coccoc_tokenizer_jni there, and I'm getting: java.lang.UnsatisfiedLinkError: no coccoc_tokenizer_jni in java.library.path: /usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib.
Also, once coccoc-tokenizer.jar has been generated, how do I use it to install the elasticsearch-analysis-vietnamese plugin?

@phat-go2joy Did you install a JDK first?
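
The macOS log above shows the build script resolving Java to the old Apple applet plug-in path (/Library/Internet Plug-Ins/...), which has no JNI headers. A hedged sketch of retrying with a real JDK installed, assuming the build script honors JAVA_HOME:

# Point the build at an installed JDK, then re-run it from the build directory.
export JAVA_HOME="$(/usr/libexec/java_home)"
cd /Users/admin/Desktop/ElasticSearch/coccoc-tokenizer/build
cmake -DBUILD_JAVA=1 ..
make install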

@dinhan92 The C++ tokenizer only supports Linux and macOS; Windows is not supported yet.