# Humanized Conversation API (using LLM)

Holds conversations in a human way without exposing that it's an LLM answering.
To use this project, you need a `.csv` file with the knowledge base and a `.toml` file with your prompt configuration. We recommend creating a folder called `data` inside this project and putting your CSV and TOML files there.
The CSV must contain the following fields:

- `category`
- `subcategory`: used to customize the prompt for specific questions
- `question`
- `content`: used to generate the embedding
Example:

```csv
category,subcategory,question,content
faq,promotions,loyalty-program,"The company XYZ has a loyalty program when you refer new customers you get a discount on your next purchase, ..."
```
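Before loading a knowledge base, it can be useful to sanity-check that the file has all four columns. A minimal sketch using Python's standard `csv` module (the function name and validation logic are ours, not part of the project):

```python
import csv
import io

REQUIRED_FIELDS = {"category", "subcategory", "question", "content"}

def validate_knowledge_base(csv_text: str) -> list[dict]:
    """Parse the CSV and fail fast if a mandatory column is missing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_FIELDS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)

sample = (
    "category,subcategory,question,content\n"
    'faq,promotions,loyalty-program,"The company XYZ has a loyalty program..."\n'
)
rows = validate_knowledge_base(sample)
print(rows[0]["subcategory"])  # -> promotions
```

`csv.DictReader` handles the quoted `content` field (commas inside quotes) correctly, so no custom parsing is needed.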
To load the knowledge base into the database, make sure the database is up, then run the following from inside the `src` folder (or pass another path to your `.csv`):

```shell
make load-data path="../data/know.csv"
```
The `[prompt.header]`, `[prompt.suggested]`, and `[fallback.prompt]` fields are mandatory; they are used for processing the conversation and connecting to the LLM.

The `[fallback.prompt]` field is used when the LLM does not find a compatible embedding in the database; without it, the model would hallucinate possible answers to questions outside the scope of the embeddings.
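Conceptually, the fallback selection works as sketched below (an illustration only; the threshold value and function name are ours, not the project's API):

```python
FALLBACK_PROMPT = "I'm sorry, I didn't understand your question. Could you rephrase it?"
SIMILARITY_THRESHOLD = 0.5  # illustrative cutoff, not the project's actual value

def choose_prompt(similarities: list[float]) -> str:
    """Return the fallback when no stored embedding is close enough to the question."""
    if not similarities or max(similarities) < SIMILARITY_THRESHOLD:
        return FALLBACK_PROMPT
    return "answer built from the matched knowledge-base entry"

print(choose_prompt([0.12, 0.31]))  # no good match -> fallback text
```

Without this guard, every question would be forwarded to the LLM even when nothing in the knowledge base matches, which is exactly the hallucination scenario described above.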
It is also possible to add information to the prompt for subcategories and to choose optional LLM parameters such as `temperature` (defaults to 0.2) or `model_name`. See below for an example of a complete configuration:
```toml
[llm]
temperature = 0.2
model_name = "gpt-3.5-turbo"

[prompt]
header = """You are a service operator called Avelino from XYZ, you are an expert in providing
qualified service to high-end customers. Be brief in your answers, without being long-winded
and objective in your responses. Never say that you are a model (AI), always answer as Avelino.
Be polite and friendly!"""
suggested = "Here is some possible content that could help the user in a better way."
memory = true # default is true; if true, the LLM will use conversation memory to generate the answer
memory_size = 5 # default is 5; the number of past messages kept in memory

[prompt.subcategory.loyalty-program]
header = """The client is interested in the loyalty program, and needs to be responded to in a
salesy way; the loyalty program is our growth strategy."""

[fallback]
prompt = """I'm sorry, I didn't understand your question. Could you rephrase it?"""
```
Look at the `.env.sample` file to see the environment variables needed to run the project. We assume you are familiar with Docker.

```shell
cp .env.sample .env # edit the .env file: add the OPENAI token and the paths to the .csv and .toml files
docker compose up
```
After the project is up, go to http://localhost:8000/docs to see the API documentation.
The dialog Docker image is distributed on GitHub Container Registry with the tag `latest`:

```shell
docker pull ghcr.io/talkdai/dialog:latest
```
If you are using VSCode, you can use the devcontainer to run the project. When the devcontainer environment starts, the following containers are brought up:

- `db`: container with the Postgres database with the pgvector extension
- `dialog`: container with the API (the project)

The application is not started automatically when the container comes up. To start it, run the `make run` command inside the container console (bash).
Remember to generate the embedding vectors and to create the `.env` file based on the `.env.sample` file before starting the application:

```shell
make load-data path="know-base-path.csv"
make run
```
We've used Python and bundled packages with poetry; from here it's up to you, and the `Makefile` may help.
If you need to create new tables or columns, run the following command:

```shell
docker compose exec web alembic revision --autogenerate
```

Then, after modifying the generated file with the operations you would like to perform, run:

```shell
docker compose exec web alembic upgrade head
```
For the newly created table to become available in SQLAlchemy, add the following lines to the file `src/models/__init__.py`. Note that `autoload_with=engine` reflects the columns from the existing table, so they don't need to be declared, and `__tablename__` is unnecessary when `__table__` is given:

```python
class TableNameInSingular(Base):
    __table__ = Table(
        "your_db_table_name",
        Base.metadata,
        autoload_with=engine,  # load column definitions from the existing table
        extend_existing=True,
    )
```
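You can try the same reflection pattern outside the project with an in-memory SQLite database (a self-contained sketch; the real project points `engine` at Postgres, and the table and class names here are placeholders):

```python
from sqlalchemy import Table, create_engine, text
from sqlalchemy.orm import declarative_base

# In-memory SQLite stands in for the project's Postgres database.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE your_db_table_name (id INTEGER PRIMARY KEY, name TEXT)"
    ))

Base = declarative_base()

class TableNameInSingular(Base):
    # Columns are reflected from the existing table, not declared here.
    __table__ = Table(
        "your_db_table_name",
        Base.metadata,
        autoload_with=engine,
        extend_existing=True,
    )

print([c.name for c in TableNameInSingular.__table__.columns])  # -> ['id', 'name']
```

Reflection requires the table to already exist (hence the alembic migration step above) and to have a primary key, which SQLAlchemy's ORM needs in order to map the class.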