AndrewVeee / nucleo-ai

An AI assistant beyond the chat box.


Documenting Setup on Linux (almost made it)

whoabuddy opened this issue

Stumbled across the repo and was interested in trying out the assistant concept with some beefed-up local model settings.

My env is quite different from what's used here, and my goal was to document my journey to help with the docs.

Unfortunately, I couldn't get the UI to communicate correctly with my local LLM, and ran out of the extra time I had to play with it.

Everything below is an artifact of that process, in case it helps someone else!


Stumbled across this and wanted to share my manual setup, as my local env is a little different:

  • using Linux Mint (Ubuntu-based)
  • using https instead of ssh for git
  • using conda instead of venv
  • using torch w/ GPU support
  • using Miqu-70b w/ 32k context through Textgen Web UI
  • using npm instead of yarn (managed through nvm)

Backend setup:

git clone https://github.com/AndrewVeee/nucleo-ai.git
cd nucleo-ai
conda create -n nucleo python=3.11
conda activate nucleo
pip install torch
pip install -r backend/requirements.txt 
cp sample/config-sample.toml data/config.toml
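
Note: depending on your platform, a plain pip install torch may not match your CUDA setup; PyTorch's wheel index lets you pin a CUDA build explicitly (the cu121 tag here is an assumption, swap in your CUDA version):

pip install torch --index-url https://download.pytorch.org/whl/cu121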

Modified config:

server_host = "127.0.0.1"
server_port = 4742
log_level = 3

[llm]
name = 'local-model'
default = true
type = 'openai'

# Set this to the port of your local instance or update to your API service and key.
openai_base_url = 'http://localhost:5000/v1'
openai_api_key = 'none'
openai_model = 'gpt-3.5-turbo'

# NOTE: Since a proper tokenizer isn't used, you should set this to about 2/3 of your
# actual max context size.
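# e.g. Miqu's 32k window: 32768 * 2/3 ≈ 21845, rounded down to 21000 here.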
context_size = 21000

# Maximum number of completions at a time.
# For local servers (llama.cpp, oobabooga, etc.), this should be set to 1; otherwise
# it might cut off a response to start a new one.
# If you're using an API/serving infrastructure, you can set this higher.
max_concurrent = 1

[embed]
# If you change the embedding model, change this name so Chroma will keep working.
db_name = "bge-large-en-v1.5"
embed_model = "BAAI/bge-large-en-v1.5"
rank_model = "BAAI/bge-reranker-large"
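
Before starting the backend, it may be worth sanity-checking that the OpenAI-compatible endpoint is reachable (assuming text-generation-webui's API is listening on port 5000, matching openai_base_url above):

curl http://localhost:5000/v1/models

Getting a JSON model list back means the base URL should work.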

Frontend setup:

Note: Just running npm install failed due to a dependency conflict between vite and vite-plugin-singlefile. I could've forced it, but first explored updating the packages to their latest minor versions with npm-check-updates:

$ cd frontend
$ npx npm-check-updates --target minor -u
Upgrading /home/ash/futuristic/nucleo-ai/frontend/package.json
[====================] 7/7 100%

 sass                   ^1.69.5  →  ^1.71.1
 vite                    ^2.0.5  →  ^2.9.17
 vite-plugin-vue2        ^2.0.1  →   ^2.0.3
 vue-template-compiler  ^2.7.14  →  ^2.7.16

After the minor-version upgrades, the commands below got it running and accessible on my local network.

Warning

Using the code below will expose the UI to any network connection. Do not use with internet-facing services unless you know what you're doing.

npm install --force
npx vite serve --host 0.0.0.0 --port 4743
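
If you don't need access from other machines, omitting --host keeps the dev server bound to localhost (vite's default):

npx vite serve --port 4743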

The UI was working but had a white background with white text, so it was really hard to read.


Switched to dark mode and everything looked good, but it was showing "Connection Error" in the top right, and the settings didn't pick up the correct IP.


Narrowed it down to setting my local IP plus port 5000 for the OpenAI API. The connection error went away.

Chats still weren't working though, and after deleting something I was stuck with an "undefined" list that I couldn't add to.


I also saw this error in the console running vite; it looks like it's using port 4742 for something?

  vite v2.9.17 dev server running at:

  > Local:    http://localhost:4743/
  > Network:  http://192.168.0.178:4743/

  ready in 164ms.

3:47:46 AM [vite] http proxy error:
Error: connect ECONNREFUSED 127.0.0.1:4742
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1555:16) (x5)
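
Port 4742 matches server_port in data/config.toml, so the dev server is presumably proxying API calls through to the backend, and ECONNREFUSED would just mean nothing is listening there. A quick way to check (assuming the backend should be up):

curl -i http://127.0.0.1:4742/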

Maybe next time!

Linking #8 since I also saw the white-on-white issue.

Sorry for the trouble and thanks for the detailed feedback!

Once you run start.sh, it should show you the host/port it's running on, and that server includes the pre-built frontend. The frontend directory is only needed for development.

I think the readme doesn't make that clear and I'll update it.

And thanks for pointing out the white text issue. I'll try to figure that out today!

Updated Step 3 in the setup section to:

Step 3
Run ./start.sh to start the app. The first run will take a bit to download the SentenceTransformers models for RAG support.

Once the app has started, you will see a line like:

 * Running on http://127.0.0.1:4742

Open the link in your browser to start using Nucleo!
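
If the first-run download is a concern, the embedding model from the config can be pre-fetched ahead of time (a sketch, assuming the sentence-transformers package that the backend uses):

python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('BAAI/bge-large-en-v1.5')"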

Hope this helps future users. I also think I fixed the white text issue.