Running private-gpt inside Docker with the same definitions as non-Docker is extremely slow and unusable
BenBatsir opened this issue · comments
I'm trying to dockerize private-gpt
I've been successfully able to run it locally and it works just fine on my MacBook M1.
I've also been able to dockerize it and run it inside a container as a pre-step for my next steps (deploying on different hosts), but this time, when trying to get a response, it hangs and finally times out:
private-gpt-1 | 11:51:39.191 [WARNING ] llama_index.core.chat_engine.types - Encountered exception writing response to history: timed out
I increased the Docker resource limits (CPU, memory, and swap) to their maximums, but sadly that didn't solve the issue.
Any idea how I can overcome this?
Addressing the Apple Silicon GPU via the Metal API is currently not possible from inside Docker.
But you can use the CUDA Toolkit to address an NVIDIA CUDA GPU from a Docker container. See #1655 (comment)
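For reference, a minimal sketch of exposing an NVIDIA GPU to a container. This assumes the NVIDIA Container Toolkit is already installed on the host; the private-gpt image name and port are illustrative, not the project's official ones:

```shell
# Sanity check: confirm the container runtime can see the GPU.
# The CUDA base image tag here is just an example.
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi

# Then run the private-gpt image (hypothetical name/tag) with GPU access:
docker run --rm --gpus all -p 8001:8001 my-private-gpt:latest
```

The key part is the `--gpus all` flag, which only works on hosts with an NVIDIA GPU and the toolkit configured, i.e. not on an M1 Mac.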
As @dpedwards said, you can't use Metal with Docker. The easiest and simplest way to use it on a Mac is Ollama running on top of the host, with PrivateGPT using any profile (running on top of the host or in Docker, as you like).
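To sketch that suggested setup (Ollama natively on the Mac so it can use Metal, PrivateGPT in a container): a hedged example, assuming Ollama's default port 11434 and Docker Desktop's `host.docker.internal` host alias. The image name and the exact environment variable for overriding the Ollama endpoint are assumptions, not confirmed from the project's docs:

```yaml
# docker-compose.yaml (sketch): PrivateGPT in a container,
# talking to Ollama running natively on the macOS host.
services:
  private-gpt:
    image: private-gpt:latest          # illustrative image name
    ports:
      - "8001:8001"
    environment:
      PGPT_PROFILES: ollama            # select the ollama settings profile
      # host.docker.internal resolves to the host machine from inside
      # Docker Desktop on macOS; 11434 is Ollama's default port.
      # Variable name is a guess at the setting override; adjust to
      # match your settings-ollama.yaml.
      PGPT_OLLAMA_API_BASE: http://host.docker.internal:11434
```

The point is that the model inference (the GPU-bound part) stays on the host where Metal is available, and only the PrivateGPT application layer runs in the container.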