daveshap / ACE_Framework

ACE (Autonomous Cognitive Entities) - 100% local and open source autonomous agents

How can you use Stacey with a local model instead of OpenAI?

CRCODE22 opened this issue · comments

I saw a comment from Dave on YouTube saying that ACE_Framework can be used with local models, but it appears the Stacey demo is not yet capable of running on a local LLM.

How can you use Stacey with a local model instead of OpenAI?

I have already set up https://github.com/Josh-XT/Local-LLM/
Local-LLM is a llama.cpp server in Docker with OpenAI-style endpoints.

Instead of it going online to OpenAI, how can you make it go to http://localhost:8091/v1?

Or is there an easier way of making it work with local models?
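
For illustration, pointing the OpenAI Python client at a local OpenAI-compatible server could look roughly like the sketch below. It assumes the pre-1.0 `openai` package and that Local-LLM is serving on port 8091; the model name is just a placeholder.

```python
# Minimal sketch, assuming the pre-1.0 `openai` Python package is in use.
import openai

openai.api_base = "http://localhost:8091/v1"  # instead of https://api.openai.com/v1
openai.api_key = "sk-local"  # placeholder; most local servers ignore the key

response = openai.ChatCompletion.create(
    model="local-model",  # whatever model name your local server exposes
    messages=[{"role": "user", "content": "Hello from a local endpoint"}],
)
print(response["choices"][0]["message"]["content"])
```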

+1
Unfortunately, the following is not yet true :(

Exclusively Local Hardware

No cloud or SaaS providers. This constrains the project to run locally on servers, desktops, laptops, smart home, and portable devices.

I think a good starting compromise is to use open-source model SaaS endpoints, like https://www.anyscale.com/endpoints

The question is more about how to configure Stacey to not use api.openai.com and instead use local or other API endpoints.
It is possible to set WEAVIATE_URL=http://localhost:8080 in .env, but there is no equivalent setting for the API base URL.
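
If such a setting existed, wiring it up might look like the sketch below. OPENAI_API_BASE is an assumed variable name for illustration, not an existing Stacey setting.

```python
# Hypothetical sketch: if Stacey read a base-URL setting from .env the way it
# reads WEAVIATE_URL, switching endpoints could look like this.
# OPENAI_API_BASE is an assumed name, not an existing Stacey setting.
import os

import openai
from dotenv import load_dotenv

load_dotenv()  # loads .env, e.g. WEAVIATE_URL=http://localhost:8080

openai.api_base = os.getenv("OPENAI_API_BASE", "https://api.openai.com/v1")
openai.api_key = os.getenv("OPENAI_API_KEY", "sk-local")
```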

@thehunmonkgroup, thanks for the tip about anyscale.com, I didn't know about them yet.

This looks to me like a very good starting point: https://localai.io/features/openai-functions/

LocalAI supports running OpenAI functions with llama.cpp compatible models.

OpenAI functions are available only with ggml or gguf models compatible with llama.cpp.

You don’t need to do anything specific - just use ggml or gguf models.
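
As a rough sketch, calling LocalAI's OpenAI-style function calling from Python might look like this. The port, model name, and function schema below are assumptions about a typical LocalAI setup, not something taken from the Stacey code.

```python
# Sketch of OpenAI-style function calling against a LocalAI server.
# Assumes LocalAI is serving on http://localhost:8080 with a llama.cpp-compatible
# gguf model exposed as "gpt-3.5-turbo"; adjust names to your own setup.
import openai

openai.api_base = "http://localhost:8080/v1"
openai.api_key = "not-needed-for-localai"

functions = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    functions=functions,
    function_call="auto",
)
print(response["choices"][0]["message"])
```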

I think a good starting compromise is to use open-source model SaaS endpoints, like https://www.anyscale.com/endpoints

No, because then it is still not running fully locally, and it costs money. I just posted a suggestion that can run locally with ggml and gguf models. The whole idea behind the ACE Framework is that you can run it completely locally (exclusively local hardware).

"Look at what Dave wrote about ACE-Framework
Project Principles
Exclusively Open Source

We will be committed to using 100% open source software (OSS) for this project. This is to ensure maximum accessibility and democratic access.
Exclusively Local Hardware

No cloud or SaaS providers. This constrains the project to run locally on servers, desktops, laptops, smart home, and portable devices.
"
https://localai.io/features/openai-functions/

If you have not already, I recommend you check out Ollama's framework for serving local models from their library or from pre-trained local models (Ollama.ai, https://github.com/jmorganca/ollama). I have a Linux/WSL2 CLI app repo that I just published here (https://github.com/3JulietAI/chat3j) with some pretty cool features and methods. Happy to submit my project to contribute here; hopefully there is something in there that would be useful to someone.
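
For reference, querying a locally served Ollama model over its HTTP API could look roughly like the sketch below; the port is Ollama's default and "llama2" is just an example model that would need to be pulled first.

```python
# Sketch of querying a locally served Ollama model over its HTTP API.
# Assumes Ollama is running on its default port 11434 and that the example
# model ("llama2") has already been pulled with `ollama pull llama2`.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2", "prompt": "Say hello", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```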

Hello @thehunmonkgroup,
is this project still alive?
And is there a working answer for this issue?

It's an open source project, so you're welcome to submit a PR to add the functionality you want.