Word segmentation
huangheLee opened this issue · comments
Hello, how do you handle word segmentation for different languages?
It depends on each language. Do you mean multiple languages in one sentence?
I know that word segmentation depends on the specific language. However, I could not find any language-specific handling in your program, and I did not adjust any parameters, yet it works with English, Chinese, Vietnamese and other languages. That is my question.
(My English is poor, so this was translated with Baidu.)
This project is independent of word segmentation. You can use whichever segmentation algorithm you like. Once a sentence has been segmented into tokens, you can pass them to the Simhash function.
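To illustrate the separation described above, here is a minimal, self-contained sketch. It does not use this project's API: `whitespace_tokens`, `char_bigrams`, `simhash_of_tokens` and `hamming_distance` are hypothetical names, the MD5-based token hash is an assumption chosen only for portability, and the character-bigram splitter is a crude stand-in for a real Chinese segmenter. The hashing step only ever sees a list of tokens, so it never needs to know which language produced them.

```python
import hashlib
import re
from typing import Iterable, List


def whitespace_tokens(text: str) -> List[str]:
    """Hypothetical tokenizer for space-delimited text such as English."""
    return re.findall(r"\w+", text.lower())


def char_bigrams(text: str) -> List[str]:
    """Hypothetical stand-in for a real segmenter on unspaced scripts."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


def simhash_of_tokens(tokens: Iterable[str], bits: int = 64) -> int:
    """Language-agnostic simhash; bits must be a multiple of 8, at most 128."""
    weights = [0] * bits
    for token in tokens:
        # Assumption: MD5 as the per-token hash, truncated to `bits` bits.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# The same hashing function serves both languages; only the tokenizer changes.
english_fp = simhash_of_tokens(whitespace_tokens("How do you segment words?"))
chinese_fp = simhash_of_tokens(char_bigrams("您是如何分词的"))
```

Swapping in a proper segmenter (for Chinese or Vietnamese, for example) would only change the function that produces the token list; the fingerprinting code stays the same.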
Oh, I get it. Thanks for your explanation!
You're welcome. I'll close this issue. 😄