[Bug]: Non-multimodal LLMs fall back on GPT-4o for vision, but will continue to use 4o for future responses (including text-only outputs)
returnofblank opened this issue
What happened?
Chats using non-multimodal models switch to GPT-4o after an image has been sent, and remain on GPT-4o for the rest of the conversation.
Steps to Reproduce
- Create a chat with an LLM without vision capabilities, such as GPT-3.5
- Ask what model it is - it will likely state GPT-3
- Paste an image - you will see a response from GPT-4o (this appears intentional from the source code, specifically the vision-fallback comment around line 238 of OpenAIClient.js, although that comment specifies gpt-4-vision-preview, not gpt-4o)
- Ask again what model it is - it will state that it is part of the GPT-4 family
- The model will continue to be GPT-4o
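The steps above suggest the following fallback behavior. This is a minimal hypothetical sketch, not the actual OpenAIClient.js implementation: the model list, the `VISION_FALLBACK_MODEL` constant, and the `resolveModel` helper are all assumptions made for illustration.

```javascript
// Hypothetical sketch of the reported behavior: when a request carries image
// attachments and the selected model lacks vision support, the client swaps
// in a vision-capable model for that request (and, per this bug, keeps it).
const VISION_FALLBACK_MODEL = 'openai/chatgpt-4o-latest'; // assumed default
const VISION_CAPABLE = new Set([
  'openai/chatgpt-4o-latest',
  'openai/gpt-4-vision-preview',
]);

function resolveModel(requestedModel, hasImageAttachments) {
  if (hasImageAttachments && !VISION_CAPABLE.has(requestedModel)) {
    return VISION_FALLBACK_MODEL;
  }
  return requestedModel;
}
```

This matches the logs: the first text-only request goes out as `openai/gpt-3.5-turbo`, while every request after the image upload goes out as `openai/chatgpt-4o-latest`.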
What browsers are you seeing the problem on?
No response
Relevant log output
2024-10-02T00:27:55.247Z debug: [spendTokens] No transactions incurred against balance
2024-10-02T00:27:55.249Z debug: [saveConvo] api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo
2024-10-02T00:27:55.253Z debug: [AskController] Request closed
2024-10-02T00:30:38.201Z debug: [AskController]
{
text: "What model are you?",
conversationId: null,
endpoint: "OpenRouter",
endpointType: "custom",
resendFiles: true,
modelOptions.model: "openai/gpt-3.5-turbo",
modelsConfig: "exists",
}
2024-10-02T00:30:38.202Z debug: [BaseClient] Loading history:
{
conversationId: "df763a09-eda1-4d89-912f-938862891369",
parentMessageId: "00000000-0000-0000-0000-000000000000",
}
2024-10-02T00:30:38.237Z debug: [BaseClient] Context Count (1/2)
{
remainingContextTokens: 16373,
maxContextTokens: 16385,
}
2024-10-02T00:30:38.237Z debug: [BaseClient] Context Count (2/2)
{
remainingContextTokens: 16373,
maxContextTokens: 16385,
}
2024-10-02T00:30:38.237Z debug: [BaseClient] tokenCountMap:
{
3692e51a-17b5-42cc-bb1b-e8ea0cbadb24: 9,
}
2024-10-02T00:30:38.237Z debug: [BaseClient]
{
promptTokens: 12,
remainingContextTokens: 16373,
payloadSize: 1,
maxContextTokens: 16385,
}
2024-10-02T00:30:38.238Z debug: [BaseClient] tokenCountMap
{
3692e51a-17b5-42cc-bb1b-e8ea0cbadb24: 9,
instructions: undefined,
}
2024-10-02T00:30:38.238Z debug: [BaseClient] userMessage
{
messageId: "3692e51a-17b5-42cc-bb1b-e8ea0cbadb24",
parentMessageId: "00000000-0000-0000-0000-000000000000",
conversationId: "df763a09-eda1-4d89-912f-938862891369",
sender: "User",
text: "What model are you?",
isCreatedByUser: true,
tokenCount: 9,
}
2024-10-02T00:30:38.238Z debug: [OpenAIClient] chatCompletion
{
baseURL: "https://openrouter.ai/api/v1",
modelOptions.model: "openai/gpt-3.5-turbo",
modelOptions.user: "66fa1b65c36b8be1d7f719f6",
modelOptions.stream: true,
// 1 message(s)
modelOptions.messages: [{"role":"user","content":"What model are you?"}],
}
2024-10-02T00:30:38.238Z debug: [OpenAIClient] chatCompletion: dropped params
{
// 1 dropParam(s)
dropParams: ["stop"],
modelOptions.model: "openai/gpt-3.5-turbo",
modelOptions.user: "66fa1b65c36b8be1d7f719f6",
modelOptions.stream: true,
// 1 message(s)
modelOptions.messages: [{"role":"user","content":"What model are you?"}],
}
2024-10-02T00:30:38.239Z debug: Making request to https://openrouter.ai/api/v1/chat/completions
2024-10-02T00:30:38.242Z debug: [saveConvo] api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo
2024-10-02T00:30:39.140Z debug: [OpenAIClient] chatCompletion response
{
provider: "OpenAI",
object: "chat.completion",
usage.prompt_tokens: 12,
usage.completion_tokens: 15,
usage.total_tokens: 27,
id: "gen-1727829038-kLx0NMl6f04pt23YaMJA",
// 1 choice(s)
choices: [{"message":{"role":"assistant","content":"I am a language model trained by OpenAI called GPT-3."},"f... [truncated]],
created: 1727829038,
model: "openai/gpt-3.5-turbo",
}
2024-10-02T00:30:39.143Z debug: [spendTokens] conversationId: df763a09-eda1-4d89-912f-938862891369 | Context: message | Token usage:
{
promptTokens: 12,
completionTokens: 15,
}
2024-10-02T00:30:39.145Z debug: [spendTokens] No transactions incurred against balance
2024-10-02T00:30:39.147Z debug: [saveConvo] api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo
2024-10-02T00:30:39.151Z debug: [AskController] Request closed
2024-10-02T00:30:39.154Z debug: [OpenAIClient] chatCompletion
{
baseURL: "https://openrouter.ai/api/v1",
modelOptions.model: "meta-llama/llama-3-70b-instruct",
modelOptions.user: "66fa1b65c36b8be1d7f719f6",
modelOptions.temperature: 0.2,
modelOptions.presence_penalty: 0,
modelOptions.frequency_penalty: 0,
modelOptions.max_tokens: 16,
// 1 message(s)
modelOptions.messages: [{"role":"system","content":"Please generate a concise, 5-word-or-less title for the conversation, us... [truncated]],
}
2024-10-02T00:30:39.155Z debug: [OpenAIClient] chatCompletion: dropped params
{
// 1 dropParam(s)
dropParams: ["stop"],
modelOptions.model: "meta-llama/llama-3-70b-instruct",
modelOptions.user: "66fa1b65c36b8be1d7f719f6",
modelOptions.temperature: 0.2,
modelOptions.presence_penalty: 0,
modelOptions.frequency_penalty: 0,
modelOptions.max_tokens: 16,
// 1 message(s)
modelOptions.messages: [{"role":"system","content":"Please generate a concise, 5-word-or-less title for the conversation, us... [truncated]],
}
2024-10-02T00:30:39.155Z debug: Making request to https://openrouter.ai/api/v1/chat/completions
2024-10-02T00:30:40.903Z debug: [OpenAIClient] chatCompletion response
{
id: "gen-1727829039-OiHVxt93pYJDrfUlV20t",
provider: "DeepInfra",
model: "meta-llama/llama-3-70b-instruct",
object: "chat.completion",
created: 1727829039,
// 1 choice(s)
choices: [{"logprobs":null,"finish_reason":"stop","index":0,"message":{"role":"assistant","content":"GPT-3 Lan... [truncated]],
usage.prompt_tokens: 89,
usage.completion_tokens: 7,
usage.total_tokens: 96,
}
2024-10-02T00:30:40.903Z debug: [spendTokens] conversationId: df763a09-eda1-4d89-912f-938862891369 | Context: title | Token usage:
{
promptTokens: 83,
completionTokens: 7,
}
2024-10-02T00:30:40.903Z debug: [OpenAIClient] Convo Title: GPT-3 Language Model Introduction
2024-10-02T00:30:40.904Z debug: [saveConvo] api/server/services/Endpoints/openAI/addTitle.js
2024-10-02T00:30:40.906Z debug: [spendTokens] No transactions incurred against balance
2024-10-02T00:31:23.553Z debug: [AskController]
{
text: "What model are you?",
conversationId: "df763a09-eda1-4d89-912f-938862891369",
endpoint: "OpenRouter",
endpointType: "custom",
resendFiles: true,
modelOptions.model: "openai/gpt-3.5-turbo",
modelsConfig: "exists",
attachments: [object Promise],
}
2024-10-02T00:31:23.554Z debug: [BaseClient] Loading history:
{
conversationId: "df763a09-eda1-4d89-912f-938862891369",
parentMessageId: "f87e5ba3-e155-48ab-b81b-ff5128ee3e76",
}
2024-10-02T00:31:23.557Z debug: [BaseClient] Context Count (1/2)
{
remainingContextTokens: 15581,
maxContextTokens: 16385,
}
2024-10-02T00:31:23.557Z debug: [BaseClient] Context Count (2/2)
{
remainingContextTokens: 15581,
maxContextTokens: 16385,
}
2024-10-02T00:31:23.557Z debug: [BaseClient] tokenCountMap:
{
3692e51a-17b5-42cc-bb1b-e8ea0cbadb24: 12,
f87e5ba3-e155-48ab-b81b-ff5128ee3e76: 15,
891d2dc4-8e16-42f6-86ae-88aa19b097a3: 774,
}
2024-10-02T00:31:23.557Z debug: [BaseClient]
{
promptTokens: 804,
remainingContextTokens: 15581,
payloadSize: 3,
maxContextTokens: 16385,
}
2024-10-02T00:31:23.557Z debug: [BaseClient] tokenCountMap
{
3692e51a-17b5-42cc-bb1b-e8ea0cbadb24: 12,
f87e5ba3-e155-48ab-b81b-ff5128ee3e76: 15,
891d2dc4-8e16-42f6-86ae-88aa19b097a3: 774,
instructions: undefined,
}
2024-10-02T00:31:23.559Z debug: [BaseClient] userMessage
{
messageId: "891d2dc4-8e16-42f6-86ae-88aa19b097a3",
parentMessageId: "f87e5ba3-e155-48ab-b81b-ff5128ee3e76",
conversationId: "df763a09-eda1-4d89-912f-938862891369",
sender: "User",
text: "What model are you?",
isCreatedByUser: true,
// 1 image_url(s)
image_urls: [{"type":"image_url","image_url":{"url":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAoAAAAKACAIAAA... [truncated]],
tokenCount: 774,
}
2024-10-02T00:31:23.561Z debug: [BaseClient] Skipping 3692e51a-17b5-42cc-bb1b-e8ea0cbadb24: already had a token count.
2024-10-02T00:31:23.561Z debug: [BaseClient] Skipping f87e5ba3-e155-48ab-b81b-ff5128ee3e76: already had a token count.
2024-10-02T00:31:23.562Z debug: [OpenAIClient] chatCompletion
{
baseURL: "https://openrouter.ai/api/v1",
modelOptions.model: "openai/chatgpt-4o-latest",
modelOptions.user: "66fa1b65c36b8be1d7f719f6",
modelOptions.stream: true,
// 3 message(s)
modelOptions.messages: [{"role":"user","content":"What model are you?"},{"role":"assistant","content":"I am a language model trained by OpenAI called GPT-3."},{"role":"user","content":[{"type":"text","text":"What model are you?"},{"type":"image_url","image_ur... [truncated]],
}
2024-10-02T00:31:23.565Z debug: [OpenAIClient] chatCompletion: dropped params
{
// 1 dropParam(s)
dropParams: ["stop"],
modelOptions.model: "openai/chatgpt-4o-latest",
modelOptions.user: "66fa1b65c36b8be1d7f719f6",
modelOptions.stream: true,
// 3 message(s)
modelOptions.messages: [{"role":"user","content":"What model are you?"},{"role":"assistant","content":"I am a language model trained by OpenAI called GPT-3."},{"role":"user","content":[{"type":"text","text":"What model are you?"},{"type":"image_url","image_ur... [truncated]],
modelOptions.max_tokens: 4000,
}
2024-10-02T00:31:23.569Z debug: Making request to https://openrouter.ai/api/v1/chat/completions
2024-10-02T00:31:23.572Z debug: [saveConvo] api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo
2024-10-02T00:31:28.645Z debug: [OpenAIClient] chatCompletion response
{
provider: "OpenAI",
object: "chat.completion",
usage.prompt_tokens: 804,
usage.completion_tokens: 63,
usage.total_tokens: 867,
id: "gen-1727829085-DKCd950OIfY1r45jvuVA",
// 1 choice(s)
choices: [{"message":{"role":"assistant","content":"I cannot determine the exact breed or model of the cat in ... [truncated]],
created: 1727829085,
model: "openai/chatgpt-4o-latest",
system_fingerprint: "fp_bd428e1c3b",
}
2024-10-02T00:31:28.647Z debug: [spendTokens] conversationId: df763a09-eda1-4d89-912f-938862891369 | Context: message | Token usage:
{
promptTokens: 804,
completionTokens: 63,
}
2024-10-02T00:31:28.650Z debug: [spendTokens] No transactions incurred against balance
2024-10-02T00:31:28.652Z debug: [saveConvo] api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo
2024-10-02T00:31:28.656Z debug: [AskController] Request closed
2024-10-02T00:31:55.084Z debug: [AskController]
{
text: "Alright, what model are you again?",
conversationId: "df763a09-eda1-4d89-912f-938862891369",
endpoint: "OpenRouter",
endpointType: "custom",
resendFiles: true,
modelOptions.model: "openai/gpt-3.5-turbo",
modelsConfig: "exists",
}
2024-10-02T00:31:55.085Z debug: [BaseClient] Loading history:
{
conversationId: "df763a09-eda1-4d89-912f-938862891369",
parentMessageId: "a454b35c-bd72-4547-8425-b92fae0d7b94",
}
2024-10-02T00:31:55.088Z debug: [BaseClient] Context Count (1/2)
{
remainingContextTokens: 15506,
maxContextTokens: 16385,
}
2024-10-02T00:31:55.088Z debug: [BaseClient] Context Count (2/2)
{
remainingContextTokens: 15506,
maxContextTokens: 16385,
}
2024-10-02T00:31:55.089Z debug: [BaseClient] tokenCountMap:
{
3692e51a-17b5-42cc-bb1b-e8ea0cbadb24: 12,
f87e5ba3-e155-48ab-b81b-ff5128ee3e76: 15,
891d2dc4-8e16-42f6-86ae-88aa19b097a3: 774,
a454b35c-bd72-4547-8425-b92fae0d7b94: 63,
6e008f96-7b44-4770-834a-a0a385c2e31b: 12,
}
2024-10-02T00:31:55.089Z debug: [BaseClient]
{
promptTokens: 879,
remainingContextTokens: 15506,
payloadSize: 5,
maxContextTokens: 16385,
}
2024-10-02T00:31:55.089Z debug: [BaseClient] tokenCountMap
{
3692e51a-17b5-42cc-bb1b-e8ea0cbadb24: 12,
f87e5ba3-e155-48ab-b81b-ff5128ee3e76: 15,
891d2dc4-8e16-42f6-86ae-88aa19b097a3: 774,
a454b35c-bd72-4547-8425-b92fae0d7b94: 63,
6e008f96-7b44-4770-834a-a0a385c2e31b: 12,
instructions: undefined,
}
2024-10-02T00:31:55.089Z debug: [BaseClient] userMessage
{
messageId: "6e008f96-7b44-4770-834a-a0a385c2e31b",
parentMessageId: "a454b35c-bd72-4547-8425-b92fae0d7b94",
conversationId: "df763a09-eda1-4d89-912f-938862891369",
sender: "User",
text: "Alright, what model are you again?",
isCreatedByUser: true,
tokenCount: 12,
}
2024-10-02T00:31:55.089Z debug: [BaseClient] Skipping 3692e51a-17b5-42cc-bb1b-e8ea0cbadb24: already had a token count.
2024-10-02T00:31:55.089Z debug: [BaseClient] Skipping f87e5ba3-e155-48ab-b81b-ff5128ee3e76: already had a token count.
2024-10-02T00:31:55.089Z debug: [BaseClient] Skipping 891d2dc4-8e16-42f6-86ae-88aa19b097a3: already had a token count.
2024-10-02T00:31:55.089Z debug: [BaseClient] Skipping a454b35c-bd72-4547-8425-b92fae0d7b94: already had a token count.
2024-10-02T00:31:55.091Z debug: [OpenAIClient] chatCompletion
{
baseURL: "https://openrouter.ai/api/v1",
modelOptions.model: "openai/chatgpt-4o-latest",
modelOptions.user: "66fa1b65c36b8be1d7f719f6",
modelOptions.stream: true,
// 5 message(s)
modelOptions.messages: [{"role":"user","content":"What model are you?"},{"role":"assistant","content":"I am a language model trained by OpenAI called GPT-3."},{"role":"user","content":[{"type":"text","text":"What model are you?"},{"type":"image_url","image_ur... [truncated],{"role":"assistant","content":"I cannot determine the exact breed or model of the cat in this image,... [truncated],{"role":"user","content":"Alright, what model are you again?"}],
}
2024-10-02T00:31:55.094Z debug: [OpenAIClient] chatCompletion: dropped params
{
// 1 dropParam(s)
dropParams: ["stop"],
modelOptions.model: "openai/chatgpt-4o-latest",
modelOptions.user: "66fa1b65c36b8be1d7f719f6",
modelOptions.stream: true,
// 5 message(s)
modelOptions.messages: [{"role":"user","content":"What model are you?"},{"role":"assistant","content":"I am a language model trained by OpenAI called GPT-3."},{"role":"user","content":[{"type":"text","text":"What model are you?"},{"type":"image_url","image_ur... [truncated],{"role":"assistant","content":"I cannot determine the exact breed or model of the cat in this image,... [truncated],{"role":"user","content":"Alright, what model are you again?"}],
modelOptions.max_tokens: 4000,
}
2024-10-02T00:31:55.099Z debug: Making request to https://openrouter.ai/api/v1/chat/completions
2024-10-02T00:31:55.102Z debug: [saveConvo] api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo
2024-10-02T00:31:58.409Z debug: [OpenAIClient] chatCompletion response
{
provider: "OpenAI",
object: "chat.completion",
usage.prompt_tokens: 883,
usage.completion_tokens: 54,
usage.total_tokens: 937,
id: "gen-1727829116-BbUMEkkDChyKQDDpQCJf",
// 1 choice(s)
choices: [{"message":{"role":"assistant","content":"I am GPT-4, an advanced language model developed by OpenAI... [truncated]],
created: 1727829116,
model: "openai/chatgpt-4o-latest",
system_fingerprint: "fp_bd428e1c3b",
}
2024-10-02T00:31:58.411Z debug: [spendTokens] conversationId: df763a09-eda1-4d89-912f-938862891369 | Context: message | Token usage:
{
promptTokens: 879,
completionTokens: 54,
}
2024-10-02T00:31:58.413Z debug: [spendTokens] No transactions incurred against balance
2024-10-02T00:31:58.414Z debug: [saveConvo] api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo
2024-10-02T00:31:58.417Z debug: [AskController] Request closed
I do want to add, even though it is intentional for image prompts to fall back on a vision model -
I think having a vision fallback for non-multimodal models is a bit misleading. I personally think it's better to disable image uploads for these sorts of models than to silently switch the model when the user uploads an image.
Or better yet, make the fallback an option so users can choose to enable it. Maybe even let them choose which vision model to fall back on?
They remain on gpt-4o because the default behavior is also to resend all previous image files, so once image files are attached to a conversation, it becomes a multimodal conversation throughout.
I do plan to take several steps to make this less confusing and, as you said, not perform fallbacks unless explicitly configured.
On top of this, I plan to make it more obvious which attachments are part of the active conversation.
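The resend behavior described above can be sketched as follows. This is a hedged illustration, not the actual LibreChat code; the `isMultimodalRequest` helper and its message shape are assumptions:

```javascript
// Hypothetical sketch of why a conversation stays multimodal: with
// resendFiles enabled, earlier image attachments are re-added to every
// follow-up request, so the vision fallback keeps firing even when the
// new prompt is text-only.
function isMultimodalRequest(history, newAttachments, resendFiles) {
  const hasPriorImages = history.some((m) => (m.image_urls ?? []).length > 0);
  return newAttachments.length > 0 || (resendFiles && hasPriorImages);
}
```

Under this sketch, disabling resendFiles (or clearing prior attachments) would let a text-only follow-up route back to the originally selected model.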
Closing in favor of #1634