PDFIntellect: Smart PDF Data Retrieval

Introduction

PDFIntellect is a Streamlit app for smart PDF data retrieval. It uses large language models (LLMs) to extract information from PDF documents.

Features

Advanced PDF parsing.
Integration with pre-trained Language Models.
Customizable cascading LLMs.
Intelligent short answer generation.
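
To illustrate what "cascading LLMs" could mean in practice, here is a minimal sketch. It assumes each model is wrapped as a callable that returns an answer and a confidence score. The `Model` interface, `cascade_answer`, the threshold, and the two stand-in models are all hypothetical and are not taken from the PDFIntellect code.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical model interface: (question, context) -> (answer, confidence in [0, 1]).
Model = Callable[[str, str], Tuple[str, float]]


def cascade_answer(
    models: List[Model],
    question: str,
    context: str,
    threshold: float = 0.5,
) -> Optional[str]:
    """Try each model in order and stop at the first confident answer.

    If no model reaches the threshold, fall back to the highest-scoring
    answer seen. Returns None when the model list is empty.
    """
    best_answer: Optional[str] = None
    best_score = -1.0
    for model in models:
        answer, score = model(question, context)
        if score >= threshold:
            return answer  # confident enough, skip the remaining models
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer


# Stand-in models for illustration only; real ones would wrap LLM calls.
def small_model(question: str, context: str) -> Tuple[str, float]:
    return "unsure", 0.2


def large_model(question: str, context: str) -> Tuple[str, float]:
    return "42 pages", 0.9


print(cascade_answer([small_model, large_model], "How long is the report?", "..."))
```

Ordering the list from cheapest to most expensive model means the larger model is only called when the smaller one is not confident, which is one common reason to cascade.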

Requirements

Python 3.7 or higher.
Streamlit, transformers, torch, and pdfplumber libraries.
Access to pre-trained Language Models.
PDF parsing libraries.
PDF documents for extraction.
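
Before running the app, it can help to confirm that the environment meets these requirements. The following is a small sketch using only the standard library. The helper names `missing_packages` and `python_ok` are made up for this example, and the package list assumes the import names match the library names given above.

```python
import importlib.util
import sys
from typing import Iterable, List, Tuple

# Import names of the libraries listed above (assumed to match the package names).
REQUIRED = ["streamlit", "transformers", "torch", "pdfplumber"]


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(minimum: Tuple[int, int] = (3, 7)) -> bool:
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.7 or higher is required.")
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries were found.")
```

The check only looks up whether each package can be located; it does not import the libraries, so it runs quickly even when torch is installed.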

Usage

Clone the repository.

git clone https://github.com/jaywyawhare/PDFIntellect

Install the required libraries.
```
pip install -r requirements.txt
```
Run the Streamlit app.
```
streamlit run app.py
```

By default, the app will be available at http://localhost:8501.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the GNU General Public License v3.0.