facebookresearch / stopes

A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.

Home Page: https://facebookresearch.github.io/stopes/


Bug in tokenizer for Tibetan language

asusdisciple opened this issue · comments

At the moment you can't use the Tibetan language tokenizer. It fails with the error message:

TypeError: 'module' object is not callable

The error is thrown here in sentence_split.py:

    elif split_algo == "bodnlp":
        logger.info(f" - Tibetan NLTK sentence splitter applied to '{lang}'")
        from botok.tokenizers import sentencetokenizer as bod_sent_tok
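The import above binds `bod_sent_tok` to the `sentencetokenizer` *module*, not to a function inside it, so calling `bod_sent_tok(...)` later raises the error. A minimal, self-contained sketch of this failure mode (using the stdlib `json` module as a stand-in for botok's `sentencetokenizer`, since botok itself is not needed to show it):

```python
# A module object is imported and later called as if it were a function.
# `json` stands in here for botok's `sentencetokenizer` module.
import json as bod_sent_tok  # hypothetical stand-in, NOT botok

try:
    bod_sent_tok("some text")  # calling a module raises TypeError
except TypeError as e:
    print(e)  # 'module' object is not callable
```

The likely fix is to import the callable from inside the module rather than the module itself (the exact name depends on botok's API, e.g. something like `from botok.tokenizers.sentencetokenizer import sentence_tokenizer`).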

We released the new version 2.1.0 last week; could you check again?