doc-insights

How it works

  • Create a chat engine with LlamaIndex to answer questions based on a set of pre-selected documents (see the sketch below).
  • Leverage Streamlit for file uploads and interactive communication with the engine.
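
Under the hood, the LlamaIndex side boils down to something like the following sketch (illustrative only, not the repo's exact code; the ./docs folder and the chat mode are assumptions):

from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Load the pre-selected documents from a local folder (hypothetical path).
documents = SimpleDirectoryReader("./docs").load_data()

# Build a vector index over the documents and wrap it in a chat engine
# that condenses follow-up questions using the chat history.
index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine(chat_mode="condense_question")

print(chat_engine.chat("What are these documents about?"))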

Deployment

  1. Clone the repo.

  2. Run the docker-compose command below to build and launch the app in Docker containers, then type a question in the chat interface.

docker-compose up --build

Integration with Xinference

  1. Start the Xinference cluster:
xinference --log-level debug
  2. Launch an embedding model and an LLM, and note their model_uids. For example, to launch bge-large-zh (embedding) and chatglm3 (LLM):
from xinference.client import Client

# Connect to the local Xinference endpoint.
client = Client("http://127.0.0.1:9997")

# Launch the embedding model and the LLM; each call returns a model_uid.
model_uid = client.launch_model(model_name="bge-large-zh", model_type="embedding")
model_uid2 = client.launch_model(model_name="chatglm3", quantization=None, model_format="pytorch", model_size_in_billions=6)
print(model_uid, model_uid2)
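
Optionally, you can sanity-check both launched models from the same client before moving on (a sketch; the exact client method signatures vary across Xinference versions):

# Fetch handles for the launched models and exercise each one once.
embedding_model = client.get_model(model_uid)
print(embedding_model.create_embedding("hello"))  # should return an embedding vector

llm = client.get_model(model_uid2)
print(llm.chat("Hi!"))  # a short chat round-trip against chatglm3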
  3. Modify docker-compose.yml using the above model_uids, for example:
version: "2"

services:
  app:
    build: .
    network_mode: "host"
    ports:
      - "8501:8501"
    volumes:
      - ./app:/app/app
    environment:
      - LLM=xinference
      - EMBEDDING=xinference
      - XINFERENCE_SERVER_ENDPOINT=http://127.0.0.1:9997
      - XINFERENCE_EMBEDDING_MODEL_UID=<model_uid>
      - XINFERENCE_LLM_MODEL_UID=<model_uid2>
      - HISTORY_KEEP_CNT=10
  4. Deploy this application:
docker-compose up --build

Run the app

If you want to run a local dev environment, the following commands let you test the application against the OpenAI API.

poetry install
LLM=openai EMBEDDING=openai streamlit run app/main.py
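
Assuming the app reads the same environment variables locally as it does in docker-compose.yml above (an assumption, not documented behavior), you can also point a local run at a running Xinference cluster:

LLM=xinference EMBEDDING=xinference \
XINFERENCE_SERVER_ENDPOINT=http://127.0.0.1:9997 \
XINFERENCE_EMBEDDING_MODEL_UID=<model_uid> \
XINFERENCE_LLM_MODEL_UID=<model_uid2> \
streamlit run app/main.py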

Troubleshooting

  • If you want to use OpenAI, check that you've created a .env file containing a valid API key, for example:
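
A minimal .env might look like this (OPENAI_API_KEY is the standard OpenAI variable name; whether the app expects exactly this name is an assumption):

OPENAI_API_KEY=sk-...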
