polm / fugashi

A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.

Windows DLL Weirdness

polm opened this issue

Via email I have a report of a Windows user who installed fugashi via pip without errors, but didn't get libmecab.dll in their site-packages/fugashi directory, which led to errors at import time like this:

ImportError: DLL load failed while importing fugashi: the specified module could not be found

For what it's worth, the DLL is definitely in the wheel file, and when I install it on Windows the DLL ends up in the site-packages/fugashi directory as expected.
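For anyone who wants to confirm this for a wheel they downloaded, here is a minimal sketch. It relies only on the fact that a wheel is a zip archive; the function name `dlls_in_wheel` is made up for illustration and is not part of fugashi.

```python
import zipfile


def dlls_in_wheel(wheel_path):
    """Return the names of all .dll members inside a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # Compare case-insensitively, since Windows file names may vary in case.
        return [name for name in wheel.namelist() if name.lower().endswith(".dll")]
```

Calling it with the path to a downloaded Windows wheel should list an entry ending in libmecab.dll if the wheel is intact.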

This thread has some info on DLLs and Python on Windows:

Toblerity/Fiona#851

One thing that we could potentially do is check for ImportErrors, and if the code is being executed on Windows, check if libmecab.dll is present and give a very specific error if not. On the other hand, since it's not clear how this happened in the first place, maybe just having an FAQ entry (or this issue) is enough for now.
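A rough sketch of what such a check might look like follows. This is not what fugashi currently does; the helper names `missing_dll_hint` and `import_with_dll_hint` and the wording of the message are assumptions made for illustration.

```python
import importlib
import importlib.util
import sys
from pathlib import Path


def missing_dll_hint(package_dir, platform=sys.platform):
    """Return a specific message if libmecab.dll is absent on Windows, else None."""
    if not platform.startswith("win"):
        return None  # the DLL is only expected on Windows
    dll_path = Path(package_dir) / "libmecab.dll"
    if dll_path.exists():
        return None  # DLL is present, so the failure has some other cause
    return (
        f"libmecab.dll was not found in {package_dir}. "
        "The install may be incomplete; try reinstalling, ideally in a venv."
    )


def import_with_dll_hint(name="fugashi"):
    """Import a package, replacing a vague ImportError with a specific one."""
    try:
        return importlib.import_module(name)
    except ImportError as err:
        # Locate the package directory without importing the package again.
        spec = importlib.util.find_spec(name)
        locations = list(spec.submodule_search_locations or []) if spec else []
        hint = missing_dll_hint(locations[0]) if locations else None
        if hint is None:
            raise  # nothing more specific to say
        raise ImportError(hint) from err
```

The `platform` parameter exists only so the check can be exercised on non-Windows machines; in real use the default would be left alone.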

Tested on Windows 10 Pro with Python 3.9 (from the Windows Store) in a venv. It worked with no issues, both with and without the wheel package installed 👍

Closing because no action is required for the time being.

I also encountered this error while using Python 3.8 (from Windows Store). Where can I find the fugashi DLL to manually download it?

@alinacoding You can download it from PyPI.

https://pypi.org/project/fugashi/#files

Can you explain how you installed fugashi? Did you use pip? Did you run it in PowerShell or something else? This really shouldn't happen, so any hints are helpful.

Note that maybe installing in a venv works, but installing globally does not. Not really sure though. (In general I would recommend always using venvs.)
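To tell which case applies, a small diagnostic along these lines may help. It is only a sketch: `environment_report` is a hypothetical helper, and the `sys.prefix` comparison covers environments created with the standard venv module.

```python
import sys


def environment_report():
    """Summarize which interpreter is running and whether it is inside a venv."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # In a venv, sys.prefix points at the venv while sys.base_prefix
        # still points at the base installation.
        "in_venv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Including this output in a bug report shows whether pip installed into a global or Store-managed location or into a venv.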

I used pip to install it on a 64-bit Windows machine, after which I got the following error trace:

Traceback (most recent call last):
  File "chatbot.py", line 341, in <module>
    main()
  File "chatbot.py", line 292, in main
    bjsa = BertJapaneseSentimentAnalyzer()
  File "chatbot.py", line 48, in __init__
    self.tokenizer = BertJapaneseTokenizer.from_pretrained('cl-tohoku/bert-base-japanese-whole-word-masking')
  File "C:\Users\avoic\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\tokenization_utils_base.py", line 1719, in from_pretrained
    return cls._from_pretrained(
  File "C:\Users\avoic\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\tokenization_utils_base.py", line 1792, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "C:\Users\avoic\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 151, in __init__
    self.word_tokenizer = MecabTokenizer(
  File "C:\Users\avoic\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 231, in __init__
    import fugashi
  File "C:\Users\avoic\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fugashi\__init__.py", line 1, in <module>
    from .fugashi import *
ImportError: DLL load failed while importing fugashi: The specified module could not be found.

On an Ubuntu 16.04 64-bit machine the installation via pip went smoothly.
On Windows I am also having trouble installing the mecab-python-windows package, since it complains about a missing mecab.h file.

Thank you for your suggestion, I will try using a venv.

Thanks for the extra info! It definitely looks like pip is just being weird, no idea why that would happen.

> On Windows I am also having trouble installing the mecab-python-windows package, since it complains about missing mecab.h file.

That package is not maintained - the author moved development to mecab on PyPI. They mention the error you saw in their announcement post here. In general there is not much difference between their package and mecab-python3 except that mecab-python3 includes the MeCab binary on all platforms so it shouldn't require an extra install, though that might not help if you keep having this pip issue.