zylon-ai / private-gpt

Interact with your documents using the power of GPT, 100% privately, no data leaks

Home Page: https://privategpt.dev


Running private-gpt inside Docker with the same definitions as non-Docker behaves super slowly - unusable

BenBatsir opened this issue · comments

I'm trying to dockerize private-gpt.

I've been successfully able to run it locally and it works just fine on my MacBook M1.

I've also been able to dockerize it and run it inside a container as a pre-step for my next steps (deploying on different hosts), but this time, when trying to get a response, it hangs and eventually times out:

private-gpt-1  | 11:51:39.191 [WARNING ] llama_index.core.chat_engine.types - Encountered exception writing response to history: timed out

I increased Docker resources such as CPU/memory/swap to the maximum level, but sadly it didn't solve the issue.

Any idea how I can overcome this?

Addressing the Apple Silicon GPU via the Metal API is currently not possible from inside Docker.
But you can use the CUDA Toolkit to address an NVIDIA CUDA GPU from a Docker container. See #1655 (comment)
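For reference, a minimal sketch of exposing an NVIDIA GPU to a container. This assumes the NVIDIA Container Toolkit is already installed on the host; the CUDA image tag and the private-gpt image name are illustrative, not taken from this thread:

```
# Sanity check: confirm the container runtime can see the GPU
docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi

# Then run your private-gpt image with GPU access
# (image name and port are illustrative)
docker run --rm --gpus all -p 8001:8001 private-gpt:latest
```

If `nvidia-smi` fails inside the container, the GPU passthrough is misconfigured on the host and no application-level tuning will help.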

As @dpedwards said, you can't use Metal with Docker. The easiest and simplest way to use it on a Mac is to run ollama on the host and have PrivateGPT use any profile (running on the host or in Docker, as you like).
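A rough sketch of that setup, assuming Ollama's default port (11434) and Docker Desktop's `host.docker.internal` alias for reaching the host from a container. The image name, port mapping, and environment variable names are illustrative assumptions, not confirmed from this thread:

```
# On the macOS host: start Ollama (uses Metal natively) and pull a model
ollama serve &
ollama pull mistral

# In the container: point PrivateGPT's Ollama profile at the host
# (host.docker.internal resolves to the host on Docker Desktop for Mac;
#  the PGPT_* variable names here are illustrative overrides)
docker run --rm -p 8001:8001 \
  -e PGPT_PROFILES=ollama \
  -e PGPT_OLLAMA_API_BASE=http://host.docker.internal:11434 \
  private-gpt:latest
```

The key point is that inference runs on the host where Metal is available, while the container only makes HTTP calls to Ollama, so it no longer needs GPU access itself.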