jalan / pdftotext

Simple PDF text extraction

Can't import pdftotext in my Mac Apple Silicon M1

anprieto opened this issue · comments

Hi,

I switched to a new Mac with an Apple M1 Pro, and now I can't import pdftotext in Python, so my existing code no longer works.

Are you experiencing the same problem?

I posted a question on Stack Overflow with details:

https://stackoverflow.com/questions/71371871/cant-import-pdftotext-in-python-in-my-mac-m1

How are you installing python, poppler, and pdftotext? I think most people use brew these days, in which case I believe everything works on M1 Macs, thanks to #86. But I don't have an M1 Mac to do any testing.

This is what I do:

  1. Install python 3.10
  2. Install command line developer tools
  3. pip3 install pdftotext from terminal
  4. Open IDLE, type import pdftotext
  5. I get this error:
    Traceback (most recent call last):
      File "<pyshell#9>", line 1, in <module>
        import pdftotext
    ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pdftotext.cpython-310-darwin.so, 0x0002): symbol not found in flat namespace '_ZN7poppler24set_debug_error_functionEPFvRKNSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEEPvES9'
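
The import failure above can also be captured from a script instead of IDLE, which makes the full message easier to copy into a report. This is only a minimal sketch using the standard library; `try_import` is a hypothetical helper name and not part of pdftotext.

```python
import importlib


def try_import(module_name):
    """Try to import a module and return (ok, message).

    `try_import` is a hypothetical helper for illustration only.
    """
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # A dlopen "symbol not found" message on macOS usually indicates that
        # the dynamic loader could not find a library providing a symbol that
        # the compiled extension references.
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"{module_name} imported successfully"


if __name__ == "__main__":
    ok, message = try_import("pdftotext")
    print(message)
```

Running it with the same interpreter that IDLE uses should print either a success line or the same ImportError text shown in the traceback.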

I'm not an expert, and I don't understand the difference between using brew and not using it. On my previous Mac I ended up with multiple versions of packages. I'll try it, though.

I have the same issue on an Intel MacBook as well.

Folks, if you're going to post here at all, please post details that might help someone to help you. Or at least answer my questions when I ask for details about how you have installed each component.

As an example, if the question is "How are you installing python, poppler, and pdftotext?", then "I install python 3.10" is not really answering the question, is it?
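
For anyone unsure what details to include, something like the following could gather the basics about how Python and the build tools are installed. It is a rough sketch that assumes only the Python standard library; `environment_report` is a hypothetical name, and the list of commands it looks for (brew, pkg-config, pip3) is just a guess at what is relevant here.

```python
import platform
import shutil
import sys


def environment_report():
    """Collect basic install details as a dict of strings (hypothetical helper)."""
    return {
        "python_version": platform.python_version(),
        "python_executable": sys.executable or "unknown",
        # Typically "arm64" on Apple Silicon and "x86_64" on Intel Macs
        "machine": platform.machine(),
        "system": platform.system(),
        # shutil.which returns None when the command is not on PATH
        "brew": shutil.which("brew") or "not found",
        "pkg-config": shutil.which("pkg-config") or "not found",
        "pip3": shutil.which("pip3") or "not found",
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Pasting the output alongside the exact install commands used would answer most of the questions above.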

Hi jalan,

Let me be more specific:
I installed Python 3.10 by downloading it from the official page (https://www.python.org/downloads/) and running the installer. Nothing more. I mostly use IDLE to develop.
I installed the Command Line Tools from the Apple store.
I don't use brew; that's why I said I install pdftotext with "pip3 install pdftotext" from the terminal.
There is nothing more to it. It just does not work. Maybe it works with brew, but I don't like that option.
I finally ended up using an alternative, the Tika parser. Its output is similar to pdftotext's.
I'm not an advanced developer, so I can't say anything more unless the question is more specific. Maybe other M1 users with more experience have the same problem and can help.

I was able to get it working by upgrading Python to 3.10. I downloaded it from https://www.python.org/downloads/, then installed pdftotext using pip3 install pdftotext, and it worked! I was previously using Python 3.8, and it would not work.

I don't think there's anything to fix here. See #100 for an example of everything working on an M1 Mac.