cpacker / MemGPT

Create LLM agents with long-term memory and custom tools πŸ“šπŸ¦™

Home Page: https://memgpt.readme.io

[Feature Request] Support for local LLMs like Ollama

mmmeff opened this issue Β· comments

The ability to use local LLMs would be great.

added to the roadmap!

Using it with LM Studio's local API server would be great. I spent half the day trying to get it to connect, but alas, no good results. It should be possible, since the server is supposed to be a drop-in replacement for the OpenAI API: https://lmstudio.ai/

LM Studio is interesting but in keeping with the spirit of open-source, a better solution would be https://github.com/go-skynet/LocalAI, a fully open drop-in OpenAI API replacement that includes support for functions. I am cloning memGPT now and have a localAI installation so perhaps I can see this weekend what would be required.

Support for local LLMs would be a game changer, in particular being able to use Mistral 7B

Using it with LM Studio's local API server would be great. I spent half the day trying to get it to connect, but alas, no good results. It should be possible, since the server is supposed to be a drop-in replacement for the OpenAI API: https://lmstudio.ai/

Any luck on running it with LM Studio?

I have LM Studio and I'm trying to figure this out, but it's so confusing. If anyone out there has been able to get any Llama version to run with MemGPT, that would be helpful.

added to the roadmap!

Thank you!

We are actively working on this (allowing pointing MemGPT at your own hosted LLM backend that supports function calling), more updates to come soon.

Sorry for asking obvious questions, but isn't it possible to just start a local OpenAI API following the llama.cpp Python bindings documentation, or bring it up with LM Studio, and override the OPENAI_API_ENDPOINT environment variable or something like that?

We are actively working on this (allowing pointing MemGPT at your own hosted LLM backend that supports function calling), more updates to come soon.

Can't wait to see what comes of your proposed Mistral 7B fine-tune. I hope its intended use is to allow for system AI interdependence and release the constraint of external OpenAI processing... I imagine a model that could call a subject-matter-expert model into VRAM for specific questioning, or just be able to conduct web research and put together reports of its own accord. It could organize the data into its own fine-tune safetensor or LoRA, depending on your AI core update interval... the future is coming.

Ok, so the idea is not to fine-tune some model to be more aligned with calling MemGPT functions? Where did you get that info? Please share.

@d0rc

We are doing both:

  1. Adding official support for using your own LLM backend that supports function calling (this can be as simple as setting the openai.api_base property to point towards your server if the backend is configured properly, but we want to add better support for this with examples and some reference models; see the sketch below). This will also make it easier for the community to try new function-calling LLMs with MemGPT (since new ones are getting released quite frequently) to see which work best.
  2. Working on our own finetuned models that are finetuned specifically for MemGPT functions (with the idea that these should hopefully perform better than open models finetuned on general function call data, and thus help approach the performance of MemGPT + GPT-4).

This issue is for tracking (1), and discussion for (2) is here: #67 (though the content of the two threads is overlapping).
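To make (1) concrete, here is a minimal sketch of what pointing openai.api_base at your own server looks like. It assumes the pre-1.0 openai Python package and an OpenAI-compatible backend already listening on localhost:8080; the URL, key, and model name are placeholders, not MemGPT defaults:

import openai

# Point the client at a local OpenAI-compatible server instead of api.openai.com.
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sk-local"  # most local servers ignore the key, but one must be set

response = openai.ChatCompletion.create(
    model="local-model",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response["choices"][0]["message"]["content"])

MemGPT additionally relies on function calling, so for the agent loop to work the backend also has to accept a functions parameter and return function_call responses.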

made a PR for this: #86

OPENAI_API_BASE=http://localhost:8080/v1 python main.py --persona syn.txt --model wizardcoder-python-34b.gguf
Running... [exit by typing '/exit']
Warning - you are running MemGPT with wizardcoder-python-34b.gguf, which is not officially supported (yet). Expect bugs!
πŸ’­ Bootup sequence complete. Persona activated. Testing messaging functionality.
Hit enter to begin (will request first MemGPT message)hello!
πŸ’­ None
πŸ€– Hello, Chad! I'm Synthia. How can I assist you today?
Hi Syn, I am Matt.
and so on...

Hahaha, fantastic! Yeah, I'm using LocalAI (a single Docker command, and I happen to have models lying all over the place, but if that weren't the case, the LocalAI project can pull them automagically at runtime from HuggingFace or from the Model Gallery they have set up).

I started off a little rocky, as I spent the majority of my time getting MemGPT going on FreeBSD (I will file a PR if I can't get it working), but I moved to a Linux box to see some forward motion and to check whether one can indeed just change the endpoint on a properly configured backend and sail away. Yes, you sure can! On my first try or two I didn't have a large enough context window, typo'd my model template, etc., but once I stopped spazzing out, it fired right up and started working straight away. Yay! Nice project, kudos to you guys, and great paper btw. Congrats!

Here's a horrible first proof of life video before I chop it into an actual success video later:
http://demonix.io:9000/index.php?p=&view=memgpt-localai.mp4

Testing with LM Studio.

OPENAI_API_BASE=http://localhost:1234/v1 python3 main.py

[2023-10-22 11:50:02.528] [ERROR] Error: 'messages' array must only contain objects with a 'role' field that is either 'user', 'assistant', or 'system'.
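(A guess at the cause rather than anything confirmed here: MemGPT relies on OpenAI function calling, so its conversation history can include messages with role "function" holding function results, and an endpoint that only accepts user/assistant/system will reject the whole request. A hypothetical payload of the kind that would trip that check, written out in Python purely for illustration:

payload = {
    "model": "local-model",
    "messages": [
        {"role": "system", "content": "You are MemGPT."},
        {"role": "user", "content": "hello!"},
        # An assistant turn that invoked a function (the function name is illustrative).
        {"role": "assistant", "content": None,
         "function_call": {"name": "send_message", "arguments": "{\"message\": \"Hi!\"}"}},
        # This 'function' role is what LM Studio's validator appears to reject.
        {"role": "function", "name": "send_message", "content": "None"},
    ],
}

If that is the issue, LM Studio would need to accept, or at least ignore, the extra roles before MemGPT can talk to it directly.)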

I tried too, but I was unable to get it right:
'OPENAI_API_BASE' is not recognized as an internal or external command,
operable program or batch file.

Any assistance on getting this working with LM Studio would be great.
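(On Windows, the VAR=value prefix used in the examples above is bash syntax, which is why cmd.exe reports it as an unknown command. Setting the variable first should be equivalent, although untested here, and the URL below is just LM Studio's default port from the earlier comment: run set OPENAI_API_BASE=http://localhost:1234/v1 and then python main.py in the same window, or in PowerShell use $env:OPENAI_API_BASE = "http://localhost:1234/v1".)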

I'm on Mac OS 14.0 Sonoma with an M2.

I was able to get the llama.cpp server working with

  • the llama.cpp/examples/server/api_like_OAI.py file
  • the llama.cpp/server file

The problem I ran into was that I didn't find a model that supported function calling yet.

Some of the steps I took are:

  1. export OPENAI_API_KEY=123456
  2. export OPENAI_REVERSE_PROXY=http://127.0.0.1:8081/v1/chat/completions (maybe?)
  3. python api_like_OAI.py --api-key 123456 --host 127.0.0.1 --user-name "user" --system-name "assistant"
  4. ./server -c 4000 --host 0.0.0.0 -t 12 -ngl 1 -m models/airoboros-l2-13b-3.1.1.Q4_K_M.gguf --embedding --alias gpt-3.5-turbo -v
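(Judging from the PR example earlier in the thread, MemGPT reads OPENAI_API_BASE rather than OPENAI_REVERSE_PROXY, so the equivalent invocation against the api_like_OAI.py proxy from step 2 would presumably be OPENAI_API_BASE=http://127.0.0.1:8081/v1 python main.py, i.e. the base URL only, without the /chat/completions suffix.)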

Just a note to say that the OPENAI_API_BASE=host:port prefix is just a way to set an environment variable when you run the python command. MemGPT must check for it and swap the API base URL.
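(As far as I can tell, the pre-1.0 openai Python package itself reads OPENAI_API_BASE at import time, so the check can be as small as the following; this is a sketch, not MemGPT's actual code:

import os
import openai

# Honor OPENAI_API_BASE if it is set; otherwise keep the library default
# (the official OpenAI endpoint).
api_base = os.environ.get("OPENAI_API_BASE")
if api_base:
    openai.api_base = api_base
)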