easonlai / chat_with_pdf_table

The contents of this repository showcase how to extract table data from a PDF file and preprocess it to facilitate word embedding. This preprocessing step enhances the readability of table data for language models and enables us to extract more contextual information from the tables.

Chat with PDF with the Tables

This repository shows how to extract table data from a PDF file and preprocess it for word embedding. The preprocessing step makes table data more readable to language models and lets them draw more contextual information from the tables. The PyMuPDF library is used to identify and extract the tables from the PDF document.
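As a rough illustration of the idea above, here is a minimal sketch of one possible preprocessing approach. It is not taken from the repository's notebook. It assumes PyMuPDF 1.23.0 or later (which provides `Page.find_tables()`), assumes the first extracted row of each table is its header, and uses two hypothetical helper names, `rows_to_sentences` and `extract_table_sentences`. Each data row is rewritten as a "header: value" sentence so that every cell carries its column context into the embedding step.

```python
from typing import List, Optional

Row = List[Optional[str]]


def rows_to_sentences(rows: List[Row]) -> List[str]:
    """Turn a table into one 'header: value' sentence per data row.

    Assumption: rows[0] is the header row. Empty cells are skipped.
    """
    if len(rows) < 2:
        return []
    # Fall back to a positional name when a header cell is empty or missing.
    header = [(h or "").strip() or f"column {i + 1}" for i, h in enumerate(rows[0])]
    sentences = []
    for row in rows[1:]:
        pairs = [
            f"{name}: {str(cell).strip()}"
            for name, cell in zip(header, row)
            if cell is not None and str(cell).strip()
        ]
        if pairs:
            sentences.append("; ".join(pairs) + ".")
    return sentences


def extract_table_sentences(pdf_path: str) -> List[str]:
    """Collect sentences for every table PyMuPDF detects in the PDF."""
    import fitz  # PyMuPDF; imported here so rows_to_sentences works without it

    sentences: List[str] = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # find_tables() returns a TableFinder; .tables lists detected tables.
            for table in page.find_tables().tables:
                # extract() returns the table as a list of rows of cell strings.
                sentences.extend(rows_to_sentences(table.extract()))
    return sentences
```

Calling `extract_table_sentences("report.pdf")` (the file name is a placeholder) would return a flat list of strings that can be passed to whichever embedding model is in use. Whether the first row is really a header depends on the document, so that assumption should be checked against the actual tables.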

Enjoy!

Languages

Language: Jupyter Notebook 100.0%