find_tables doesn't recognize any table in scanned document

Question

RodrigoTomeES opened this issue a month ago

Description of the bug

Hi,

We are trying to extract the tables from a scanned document like this one, but find_tables doesn't recognize any of them. We also tried page.find_tables(horizontal_strategy="text", vertical_strategy="text"), as mentioned here.

How to reproduce the bug

import fitz  # PyMuPDF

# Open the scanned document (a PDF here; could also be EPUB, XPS, etc.)
doc = fitz.open("RE2.pdf")

for page in doc:
    # Look for tables on this page
    tabs = page.find_tables()

    # Print each detected table as a pandas DataFrame
    for table in tabs.tables:
        print(table.to_pandas())

    # Display the table count for this page
    print(f"{len(tabs.tables)} table(s) on {page}")

# Expected a message like "1 table(s) on page 0 of RE2.pdf",
# but every page reports 0 tables.

PyMuPDF version

1.24.2

Operating system

Linux

Python version

3.10

Jorj X. McKie · Answer 1 · Mon Apr 22 2024 01:28:16 GMT+0800 (China Standard Time)

As answered on Discord: this is not possible.
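
For anyone landing here with the same symptom, a quick check of what the page actually contains may help. As far as I understand, find_tables works from the page's extractable text and vector graphics, and a pure scan usually has neither, only a full-page image. The sketch below is a rough heuristic under that assumption, not an official PyMuPDF diagnostic; the helper names looks_scanned and report_scanned_pages are made up for illustration.

```python
def looks_scanned(word_count: int, drawing_count: int, image_count: int) -> bool:
    """Heuristic (an assumption, not a PyMuPDF rule): a page with no
    extractable words and no vector drawings, but at least one image,
    is probably a pure scan."""
    return word_count == 0 and drawing_count == 0 and image_count > 0


def report_scanned_pages(path: str) -> list[int]:
    """Return the 0-based numbers of pages that look like pure scans."""
    # Imported here so the heuristic above can be used without PyMuPDF.
    import fitz  # PyMuPDF

    scanned = []
    with fitz.open(path) as doc:
        for page in doc:
            words = page.get_text("words")   # extractable words on the page
            drawings = page.get_drawings()   # vector graphics (lines, rects, ...)
            images = page.get_images()       # images referenced by the page
            if looks_scanned(len(words), len(drawings), len(images)):
                scanned.append(page.number)
    return scanned
```

Calling report_scanned_pages("RE2.pdf") would list the pages that have nothing for the table finder to work with. If a page is listed there, the text would first have to be recovered by some other means (OCR, for example) before any table detection could be attempted.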