Significant-Gravitas / AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Home Page: https://agpt.co

Make it easier to use local LLMs by decoupling AutoGPT from dependence on OpenAI function calling

k8si opened this issue

Duplicates

  • I have searched the existing issues

Summary 💡

Currently the AutoGPT app assumes the underlying LLM supports OpenAI-style function calling. Even though there is a config variable OPENAI_FUNCTIONS (which defaults to false), turning it on or off is a no-op; I don't think the value of this variable is used by any part of the system. This bug is hidden by the fact that all of the supported OPEN_AI_CHAT_MODELS have has_function_call_api=True.

So even when OPENAI_FUNCTIONS is turned off, during e.g. agent creation, the system still expects to be interacting with a model that supports OpenAI-style function calling. This usually isn't a problem since all of the supported models have function calling enabled, so errors never get raised.

The errors only arise when you try to use a non-OpenAI model (e.g. a local model via Ollama, llamafile, etc.) by setting OPENAI_API_BASE_URL=http://localhost:8080/v1. If the model doesn't support function calling (i.e. the tool_calls field of the model response is empty), you get ValueError: LLM did not call create_agent function; agent profile creation failed from the agent profile generation step.
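To make the failure mode concrete, here is a minimal sketch (not AutoGPT code) of what happens when a tool definition is sent to a local OpenAI-compatible server. The base URL, API key, model name, and the create_agent schema below are placeholders:

```python
from openai import OpenAI

# Point the standard OpenAI client at a local OpenAI-compatible server.
# The URL, API key, and model name are placeholders for a local setup.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-needed")

create_agent_tool = {
    "type": "function",
    "function": {
        "name": "create_agent",
        "description": "Create an agent profile for the given task.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "role": {"type": "string"},
            },
            "required": ["name", "role"],
        },
    },
}

response = client.chat.completions.create(
    model="local-model",  # whatever name the local server exposes
    messages=[{"role": "user", "content": "Create an agent to research solar panels."}],
    tools=[create_agent_tool],
)

message = response.choices[0].message
if not message.tool_calls:
    # A model without function-calling support answers in plain text here,
    # which is the condition that AutoGPT turns into the ValueError above.
    print("No tool_calls returned; raw content was:", message.content)
```

If the model answers in plain text instead of populating tool_calls, the agent has nothing to execute, which is exactly the situation that surfaces as the ValueError above.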

It seems like there should be a happy path to delegating function calling to customizable/pluggable components instead of assuming the underlying LLM will take care of everything end-to-end. I think this would make it easier for people to use local LLMs, as well as mix local LLMs with expensive APIs. Maybe this happy path already exists -- if so, I'd be happy to write docs for this.
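One way to picture the "pluggable" approach: the agent asks a strategy object for the function calls rather than assuming the provider parsed them natively. A minimal sketch, with hypothetical class and method names that are not AutoGPT's actual API:

```python
import json
from abc import ABC, abstractmethod


class ToolCallParser(ABC):
    """Turns a raw chat-completion message into zero or more function calls."""

    @abstractmethod
    def extract_calls(self, message: dict) -> list[dict]:
        """Return [{"name": ..., "arguments": {...}}, ...] or [] if none."""


class NativeToolCallParser(ToolCallParser):
    """For providers that populate message["tool_calls"] (OpenAI-style)."""

    def extract_calls(self, message: dict) -> list[dict]:
        return [
            {
                "name": call["function"]["name"],
                # OpenAI-style tool calls carry arguments as a JSON string
                "arguments": json.loads(call["function"]["arguments"]),
            }
            for call in (message.get("tool_calls") or [])
        ]


class PlainTextToolCallParser(ToolCallParser):
    """For local models without a tool-call API: expect a bare JSON object
    in the text content, e.g. {"name": "create_agent", "arguments": {...}}."""

    def extract_calls(self, message: dict) -> list[dict]:
        content = message.get("content") or ""
        start, end = content.find("{"), content.rfind("}")
        if start == -1 or end <= start:
            return []
        call = json.loads(content[start:end + 1])
        return [{"name": call["name"], "arguments": call.get("arguments", {})}]
```

A provider wrapping a local model could then be configured with something like PlainTextToolCallParser while the OpenAI provider keeps the native parser, without the agent loop caring which one is in use.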

Related to this issue: #6336

Let me know what you think @ntindle

Examples 🌈

No response

Motivation 🔦

No response

responded on discord :)

@ntindle can you give a link to the Discord discussion here for those of us arriving from the web at large? Or maybe a summary here? I think this is a very important topic going forward; this project should not be shackled to OpenAI.

Absolutely. Good point. I'll try to be more diligent about that in the future too.

Link to start of discussion: https://discord.com/channels/1092243196446249134/1095817829405704305/1212845507060437033

Summary:

Do you need function calls?

Yes, we need function calls. No, function calling doesn't have to be natively supported by the model. They can look at the OpenAIProvider with _functions_compat_fix_kwargs and _tool_calls_compat_extract_calls for a very rough idea of how it could be done.
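For illustration, here is a rough approximation of the request-side half of that compat approach: when the model has no native tool-call API, the function specs are folded into the prompt and the model is asked to reply with a bare JSON object (which something like the PlainTextToolCallParser sketch above could then parse). This is a hedged sketch, not the actual OpenAIProvider code, and inject_function_specs is a hypothetical helper name:

```python
import json


def inject_function_specs(messages: list[dict], tools: list[dict]) -> list[dict]:
    """Append a system message that describes the available functions and the
    JSON reply format, for models that have no native tool-call API."""
    specs = json.dumps([t["function"] for t in tools], indent=2)
    instruction = (
        "You must call exactly one of the functions below. Reply with only a "
        'JSON object of the form {"name": <function name>, "arguments": {...}}.\n'
        + specs
    )
    return messages + [{"role": "system", "content": instruction}]
```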

I think the "middleware" idea does make sense.

fwiw yesterday I did manage to get the generate_agent_profile_for_task call working with a locally-running llava-v1.5-7b-q4 model after some hacking. Gonna try to see how far I get with making a full session work today and then I'd be happy to brainstorm about how to make it easier for others to do this with their own local models via middleware or more docs or both.

However one question is, is it worth investing refactoring time in the AutoGPT bot itself or is this functionality that should actually just land in the Forge part?

It makes sense to put the work into AutoGPT right now. We are planning on moving library code, including autogpt.core.resource.model_providers, to Forge in the near future. The cleaner the module, the easier the move.

@Pwuts then put this on the roadmap as #7001

I wanted to add on to @Wladastic's comment. I like the idea of using multiple models; perhaps eventually a model could be trained specifically for the functions themselves.

Thank you :)
I am already working on another AI project because of that.
I figured out a very good way to avoid the downsides. Multi-step works amazingly well with Neural Chat 3.1, Capybara Mistral 7B, Gemma 2B, and Mixtral 8x7B. Funnily enough, GGUF models work best.
It requires a shit ton of optimization, but it's doable. You just have to figure out the params needed for each call.

I just went and ran it myself again, and I can confirm the LLM doesn't even respond in the correct format.
What does work, though, is the one-shot prompt. I think for local LLMs we could at least use these as a first step?
It only works for a few steps, though, as anything above 3000 tokens is already too confusing for Mistral.

@Wladastic are you working out of a branch? Or are you able to get all this working on main? And/Or do you have a link to this other project?

@joshuacox
I wrote a smaller version; working on AutoGPT directly was too complicated and painful for that, haha.

@Wladastic I completely understand, sometimes you need to simplify things to isolate the parts you are working with. I encourage you to put up a branch or repo, it might be easier for some of us to contribute to as well.

I could try to, but my project is now merged with my own AI, which works differently from AutoGPT.
I can try to make a simplified version.

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.