maciejd / gpt-langchain-askpdf

A simple, containerized GPT-powered web application that lets you query your own PDF file. Uses Streamlit for the UI, ChromaDB to store embeddings, and LangChain.

gpt-langchain-askpdf

How to run it?

  1. Create a .env file in the root directory of the project with the following contents, replacing the placeholder with your own OpenAI API key:
OPENAI_API_KEY="YOUR_API_KEY"
  2. Run Docker Compose in detached mode: docker-compose up -d
  3. Open http://localhost:8000 in your browser
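
Before starting the containers, it can help to confirm that the .env file really contains a key and not the placeholder. The following is a minimal sketch using only the Python standard library; the helper names read_env and has_real_key are hypothetical and not part of this project, and the parser assumes simple KEY=VALUE lines like the one shown above.

```python
from pathlib import Path


def read_env(path=".env"):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes, as in OPENAI_API_KEY="..."
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_real_key(env):
    """Return True when OPENAI_API_KEY is set and is not the placeholder."""
    key = env.get("OPENAI_API_KEY", "")
    return bool(key) and key != "YOUR_API_KEY"


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists() and has_real_key(read_env(env_file)):
        print("OPENAI_API_KEY found")
    else:
        print("OPENAI_API_KEY missing or still set to the placeholder")
```

Run it from the project root. It only checks that a value is present; it does not verify that the key is valid with OpenAI.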

How does it work?

  1. Loads the file using Streamlit
  2. Splits the PDF into chunks using a LangChain text splitter
  3. Generates embeddings using text-embedding-ada-002
  4. Stores the embeddings in an in-memory instance of the ChromaDB vector database
  5. Runs a RAG chain that retrieves the relevant splits and adds them to the context of the final prompt
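
The steps above can be illustrated with a toy, standard-library-only sketch. It is not the project's code: word-count vectors stand in for text-embedding-ada-002, a plain Python list stands in for ChromaDB, and a fixed-size character window stands in for the LangChain splitter. All names here (split_text, embed, InMemoryStore, build_prompt) are hypothetical. The sketch stops at the assembled prompt, which the real app would send to a GPT model.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping character windows (stand-in for step 2)."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for step 3)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Minimal in-memory vector store (stand-in for ChromaDB, step 4)."""

    def __init__(self):
        self.items = []  # list of (chunk, embedding) pairs

    def add(self, chunks):
        self.items.extend((chunk, embed(chunk)) for chunk in chunks)

    def query(self, question, k=2):
        """Return the k chunks most similar to the question."""
        q = embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Add the retrieved chunks to the context of the final prompt (step 5)."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    pages = [
        "Invoices are due within 30 days.",
        "The warranty covers two years.",
        "Support is available by email.",
    ]
    store = InMemoryStore()
    for page in pages:
        store.add(split_text(page))
    question = "How long is the warranty?"
    print(build_prompt(question, store.query(question, k=1)))
```

Word-count similarity only matches exact words, whereas a learned embedding model can match paraphrases. The overall flow of split, embed, store, retrieve, and prompt is the same.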

More info

The app leverages Retrieval-augmented generation (RAG). More info can be found here

Screenshot

app_screenshot

Languages

Python 86.9%, Dockerfile 13.1%