chroma-core / chroma

the AI-native open-source embedding database

Home Page:https://www.trychroma.com/

[Bug]: Error when loading chromadb collection from a docker container

SHUCHISMIT12 opened this issue

What happened?

I've created a Docker container that holds my chromadb folder. The function works when I test the container locally, but when deployed on AWS Lambda it throws this error:
"errorMessage": "attempt to write a readonly database",
"errorType": "OperationalError",
"requestId": "43522161-5c70-4a9c-bc71-b8b828d1bb5d",

This is the line where I'm getting the error:

client = chromadb.PersistentClient(path=os.getcwd() + "/chroma_db")

In the Dockerfile I have copied the folder to the cwd:

COPY chroma_db ${LAMBDA_TASK_ROOT}/chroma_db
Please suggest a fix.

Versions

Chromadb 0.5.0
Python 3.11

Relevant log output

"errorMessage": "attempt to write a readonly database",
"errorType": "OperationalError",
"requestId": "43522161-5c70-4a9c-bc71-b8b828d1bb5d",

@SHUCHISMIT12, I suspect a read-only filesystem is the source of your troubles. We generally do not recommend running the core Chroma package in serverless functions because of cold starts and other issues such as filesystem restrictions (as you've encountered).
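
For anyone landing here with the same error, one possible workaround is to copy the bundled folder to a writable location before opening it. This is a minimal sketch rather than a fix confirmed in this thread. It assumes that `/tmp` is the writable path inside the Lambda environment and that the collection is only read, since anything written to the copy is lost when the container is recycled. The helper names `stage_writable_copy` and `open_client` are hypothetical.

```python
import os
import shutil
import stat


def stage_writable_copy(src_dir: str, dst_dir: str) -> str:
    """Copy a read-only Chroma folder to a writable location (hypothetical helper).

    The copy happens only if dst_dir does not exist yet, so a warm
    container reuses the folder staged by an earlier invocation.
    """
    if not os.path.isdir(dst_dir):
        shutil.copytree(src_dir, dst_dir)
        # copytree preserves permission bits, so restore owner write access
        # in case the source files were marked read-only.
        for root, dirs, files in os.walk(dst_dir):
            for name in dirs + files:
                path = os.path.join(root, name)
                os.chmod(path, os.stat(path).st_mode | stat.S_IWUSR)
    return dst_dir


def open_client(src_dir: str, dst_dir: str = "/tmp/chroma_db"):
    """Open a PersistentClient on the writable copy (hypothetical helper).

    Assumption: /tmp is writable in the deployment environment.
    """
    # Imported here so the copy helper works without chromadb installed.
    import chromadb

    return chromadb.PersistentClient(path=stage_writable_copy(src_dir, dst_dir))
```

In the handler this would replace the original call, for example `client = open_client(os.getcwd() + "/chroma_db")`. The copy adds to cold-start time in proportion to the size of the folder.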

@tazarov, thank you for your response. Unfortunately, I only have a serverless option for deployment this time. Do you suggest reading the chromadb collection from an S3 bucket into the Lambda function?

If your use case is a read-only Chroma, mounting an S3 bucket and using that might work.

When you say "only serverless option", do you exclude the possibility of having an EC2 that runs the actual Chroma server?

@tazarov, I'm currently working on a pilot project within my organisation. Given the project's limited initial scale, it's hard for me to justify a separate instance solely for hosting the index. Also, hibernating the instance after each query would hurt the user experience.