jaywyawhare / PDFIntellect

Streamline PDF data retrieval with PDFIntellect, harnessing the intelligence of LLMs via an intuitive Streamlit interface.

PDFIntellect: Smart PDF Data Retrieval

Introduction

PDFIntellect is a Streamlit app for smart PDF data retrieval. It uses large language models (LLMs) to extract information from PDF documents and answer questions about their contents.

Features

  • Advanced PDF parsing.
  • Integration with pre-trained Language Models.
  • Customizable cascading LLMs.
  • Intelligent short answer generation.
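The README does not define "cascading LLMs", so the following is only a minimal sketch of one common reading: try models in order, cheapest first, and fall back to the next one when the confidence score is below a threshold. The names `Answer`, `cascade_answer`, and `threshold` are hypothetical and are not taken from the PDFIntellect code. Each model is a plain callable; in practice it might wrap a Hugging Face `transformers` question-answering pipeline, which returns an answer string together with a score.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical types for illustration; not part of PDFIntellect itself.
# A model is any callable taking (question, context) and returning (text, score).
QAModel = Callable[[str, str], Tuple[str, float]]


@dataclass
class Answer:
    text: str     # the short answer produced by the model
    score: float  # the model's confidence, assumed to lie in [0, 1]
    model: str    # name of the model that produced the answer


def cascade_answer(
    question: str,
    context: str,
    models: Sequence[Tuple[str, QAModel]],
    threshold: float = 0.5,
) -> Optional[Answer]:
    """Try each model in order and stop at the first confident answer.

    If no model reaches the threshold, the highest-scoring answer seen
    so far is returned. Returns None when no models are supplied.
    """
    best: Optional[Answer] = None
    for name, model in models:
        text, score = model(question, context)
        candidate = Answer(text=text, score=score, model=name)
        if best is None or candidate.score > best.score:
            best = candidate
        if candidate.score >= threshold:
            return candidate  # confident enough; skip the remaining models
    return best


if __name__ == "__main__":
    # Stub models standing in for real LLMs.
    stubs = [
        ("small", lambda q, c: ("not sure", 0.2)),
        ("large", lambda q, c: ("GPL v3.0", 0.9)),
    ]
    print(cascade_answer("Which license applies?", "some PDF text", stubs))
```

Ordering the models from cheapest to most expensive means the larger model only runs when the smaller one is unsure, which is the usual motivation for a cascade.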

Requirements

  • Python 3.7 or higher.
  • The Streamlit, transformers, torch, and pdfplumber libraries (pdfplumber handles the PDF parsing).
  • Access to pre-trained language models.
  • PDF documents for extraction.
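To show how the listed libraries might fit together, here is a hedged sketch of the parsing step: pdfplumber reads the text of each page, and a helper splits it into overlapping word chunks so that each piece fits within a model's context window. The function names, the chunk sizes, and the file name `sample.pdf` are assumptions for illustration; the README does not describe how the app itself parses or chunks documents.

```python
from typing import List


def extract_text(pdf_path: str) -> str:
    """Return the text of every page in the PDF, joined by newlines."""
    # Imported here so the chunking helper below works without pdfplumber.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can return None for pages without a text layer.
        pages = [page.extract_text() or "" for page in pdf.pages]
    return "\n".join(pages)


def chunk_words(text: str, size: int = 200, overlap: int = 20) -> List[str]:
    """Split text into chunks of up to `size` words, sharing `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    # "sample.pdf" is a placeholder path; substitute a real document.
    text = extract_text("sample.pdf")
    for i, chunk in enumerate(chunk_words(text)):
        print(i, len(chunk.split()), "words")
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, at the cost of processing a few words twice.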

Usage

  1. Clone the repository and change into its directory.

    git clone https://github.com/jaywyawhare/PDFIntellect
    cd PDFIntellect

  2. Install the required libraries.

    pip install -r requirements.txt

  3. Run the Streamlit app.

    streamlit run app.py

The app will be available at http://localhost:8501, Streamlit's default port.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the GNU General Public License v3.0.

License:GNU General Public License v3.0

