atlanhq / camelot

Camelot: PDF Table Extraction for Humans

Home Page: https://camelot-py.readthedocs.io

Getting PyPDF2 error while using the camelot library

shivambaldha1 opened this issue · comments

PyPDF2 was recently updated, and now I get an error when I call Camelot's read_pdf function:

Please fix this issue.

pip uninstall PyPDF2===1.26.0. Camelot requires PyPDF2>=1.26.0, as mentioned in requirements.txt.

Yes, I know, but PyPDF2 has been updated, and the update changed some of its functions.

I just installed camelot using pip, following the installation instructions, and I get the same error when I use the camelot CLI:

$ camelot --output test.out -f json lattice cs1.pdf
Traceback (most recent call last):
  File "/home/ian/.local/bin/camelot", line 8, in <module>
    sys.exit(cli())
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/ian/.local/lib/python3.9/site-packages/click/decorators.py", line 84, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/ian/.local/lib/python3.9/site-packages/camelot/cli.py", line 204, in lattice
    tables = read_pdf(
  File "/home/ian/.local/lib/python3.9/site-packages/camelot/io.py", line 113, in read_pdf
    tables = p.parse(
  File "/home/ian/.local/lib/python3.9/site-packages/camelot/handlers.py", line 172, in parse
    self._save_page(self.filepath, p, tempdir)
  File "/home/ian/.local/lib/python3.9/site-packages/camelot/handlers.py", line 111, in _save_page
    infile = PdfFileReader(fileobj, strict=False)
  File "/home/ian/.local/lib/python3.9/site-packages/PyPDF2/_reader.py", line 1974, in __init__
    deprecation_with_replacement("PdfFileReader", "PdfReader", "3.0.0")
  File "/home/ian/.local/lib/python3.9/site-packages/PyPDF2/_utils.py", line 369, in deprecation_with_replacement
    deprecation(DEPR_MSG_HAPPENED.format(old_name, removed_in, new_name))
  File "/home/ian/.local/lib/python3.9/site-packages/PyPDF2/_utils.py", line 351, in deprecation
    raise DeprecationError(msg)
PyPDF2.errors.DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.

I solved the problem with:

$ pip uninstall PyPDF2
$ pip install PyPDF2==2.12.1

This rolls PyPDF2 back to version 2.12.1, the last release before 3.0.0. According to the PyPDF2 change log, 3.0.0 removed many features that had previously been deprecated, including PdfFileReader.

It doesn't look like camelot has had any commits in 5 years, so I wouldn't count on this being fixed unless someone does it in a fork. The current version's requirements should be pinned to:
PyPDF2>=1.26.0,<=2.12.1
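
For anyone who wants to check their environment against that range before running camelot, here is a rough sketch using only the standard library. The helper names (`parse_version`, `in_supported_range`, `check_installed`) are made up for illustration and are not part of camelot or PyPDF2. The parser only handles plain numeric versions such as `2.12.1`, not pre-release tags.

```python
from importlib import metadata

# Bounds taken from the range suggested above: PyPDF2>=1.26.0,<=2.12.1
LOWER = (1, 26, 0)
UPPER = (2, 12, 1)


def parse_version(text):
    """Turn '2.12.1' into (2, 12, 1). Assumes purely numeric components."""
    return tuple(int(part) for part in text.split(".")[:3])


def in_supported_range(version):
    """Return True if the version string falls inside the suggested range."""
    return LOWER <= parse_version(version) <= UPPER


def check_installed(package="PyPDF2"):
    """Return True/False for the installed version, or None if not installed."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return in_supported_range(version)
```

Calling `check_installed()` returns `False` on an environment with PyPDF2 3.0.0, which is the situation that produces the DeprecationError in the traceback above.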

Same issue, I'll try pip uninstall PyPDF2===1.26.0

You can always work with a virtual environment with PyPDF2==2.12.1 for example.

No need to uninstall the current version from your system.
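
A minimal sketch of that approach, using Python's built-in `venv` module. The directory name, the helper names, and the choice to install the `camelot-py` package alongside the pin are all assumptions for illustration, so adjust them to your setup.

```python
import subprocess
import sys
import venv
from pathlib import Path

# Version suggested earlier in this thread; the system-wide install is untouched.
PINNED = "PyPDF2==2.12.1"


def env_python(env_dir):
    """Return the path of the Python interpreter inside the virtual environment."""
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def install_command(env_dir):
    """Build the pip command that installs the pinned PyPDF2 plus camelot."""
    # "camelot-py" is assumed to be the package name you installed from PyPI.
    return [str(env_python(env_dir)), "-m", "pip", "install", PINNED, "camelot-py"]


def build_env(env_dir):
    """Create the environment and install the pinned packages (needs network)."""
    venv.create(env_dir, with_pip=True)
    subprocess.run(install_command(env_dir), check=True)
```

Running `build_env("camelot-env")` once and then using the interpreter returned by `env_python("camelot-env")` keeps the pinned PyPDF2 isolated from whatever version is installed globally.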