itlackey / ipex-arc-fastchat


localhost:7860 not appearing in browser

ElliottDyson opened this issue · comments

All of the ports were opened appropriately when launching the Docker container:

sudo docker run -d --device /dev/dri \
  -p 7860:7860 -p 21001:21001 -p 8000:8000 \
  -v ~/Downloads/Models:/root/.cache/huggingface \
  itlackey/ipex-arc-fastchat:latest

Also the worker seems to be responding just fine:
[screenshot]

I was hoping you might be able to help solve this issue? Also, thank you for all your great work here!

Also, on a rather different note: have you considered integrating with llama.cpp, for the usefulness of GGUF files and their k-means-based quantization of model weights? (Not sure if you've seen the perplexity numbers; it's pretty remarkable what those methods achieve at a given model size.) I'd use something like llama-cpp-python, but unfortunately only llama.cpp itself has the Intel Arc support, and I have no idea how these APIs and websockets work. Anyway, sorry for the digression.

Would you mind posting the entire log, or at least the beginning of it? It looks like you are missing args, but it's not obvious to me which. Also, do you have the default model already downloaded?

As for llama.cpp, right now it only supports OpenCL, which barely uses the Arc at all. At least, that is my understanding. The reason FastChat is... well, fast... is that it uses the Intel Extension for PyTorch (IPEX) code. Supposedly there is work being done to support Vulkan drivers instead of OpenCL, but until that is done it is kind of worthless on Intel, in my experience.


Yep, that makes sense in terms of speed. However, last I heard CLBlast now has a proper implementation for making use of Intel Arc, though I couldn't get it compiled properly anyway. Luckily this project seems to support offloading to CPU, which was one of my worries due to memory constraints; looking at the original project, it does appear to support that.

How would I go about getting the logs for you? I'm new to these Docker environments; I usually set everything up manually in a conda environment, but something like this was a bit beyond me, so I've been venturing into getting it set up from what you've provided. Again, thank you for helping.

So you will want to map a volume for the /apps folder on the Docker container. That will allow the logs to be written to your filesystem instead of staying inside the container.
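For example, something like this should do it; ~/fastchat-logs is just an example host path, so use whatever local folder you like:

sudo docker run -d --device /dev/dri \
  -p 7860:7860 -p 21001:21001 -p 8000:8000 \
  -v ~/Downloads/Models:/root/.cache/huggingface \
  -v ~/fastchat-logs:/apps \
  itlackey/ipex-arc-fastchat:latest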

Otherwise, you can just copy the terminal output if that is easier. Just make sure to get the start of the log too
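If you go the terminal-output route, the standard Docker CLI can replay everything from container start:

sudo docker ps                   # find the container ID
sudo docker logs <container_id>  # prints all output since the container started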

Following up on this: I just pushed a new version of the image. Go ahead and pull it from Docker Hub and include -v /local/folder:/logs in the docker run command. This will dump the logs to a local folder you specify, and then we can take a look at what's happening.
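Putting that together with the original command (again, ~/fastchat-logs is just an example local folder):

sudo docker pull itlackey/ipex-arc-fastchat:latest
sudo docker run -d --device /dev/dri \
  -p 7860:7860 -p 21001:21001 -p 8000:8000 \
  -v ~/Downloads/Models:/root/.cache/huggingface \
  -v ~/fastchat-logs:/logs \
  itlackey/ipex-arc-fastchat:latest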

I am having a similar issue running with WSL 2 / Docker Desktop. I used the recommended command, except I used a volume rather than a bind mount.

I can curl google.com from inside the container, but I cannot get a response on either 7860 or 8000.
[screenshot]
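One way to narrow this down (a sketch; the container ID comes from docker ps, and this assumes the web UI really does listen on 7860 inside the container) is to check whether the server answers locally before suspecting the port mapping:

sudo docker exec <container_id> curl -s http://localhost:7860   # from inside the container
curl -v http://localhost:7860                                   # then from the host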

volume looks like:
[screenshot]

logs:
logs.txt

device:
Processor 11th Gen Intel(R) Core(TM) i5-11600K @ 3.90GHz 3.91 GHz
Installed RAM 64.0 GB (63.8 GB usable)
System type 64-bit operating system, x64-based processor
GPU Intel Arc A770 Limited Edition 16GB

windows:
Edition Windows 11 Home
Version 22H2
OS build 22621.3296
Experience Windows Feature Experience Pack 1000.22687.1000.0

I'm sure I'm doing something stupid; just hoping you can point me in the right direction.

Super hyped for this, and I appreciate your hard work.