patterns-ai-core / langchainrb

Build LLM-powered applications in Ruby

Home Page: https://rubydoc.info/gems/langchainrb


Cannot specify role: system in LLM::Anthropic

kokuyouwind opened this issue

Description

With OpenAI and Ollama, system, user, and assistant can all be specified as roles.
With Anthropic, however, only user and assistant are accepted, and specifying system raises an error.

> Langchain::LLM::Anthropic.new.chat(messages: [{ role: 'system', content: 'Act as a professional programmer.'}, { role: 'user', content: 'What is LLM?' }])
#<Langchain::LLM::AnthropicResponse:0x00000001295e3768
 @model=nil,
 @raw_response=
  {"type"=>"error",
   "error"=>
    {"type"=>"invalid_request_error",
     "message"=>"messages: Unexpected role \"system\". The Messages API accepts a top-level `system` parameter, not \"system\" as an input message role."}}>

I know that the Anthropic API specification does not allow system as a message role and instead requires the top-level system parameter.
However, as an LLM framework, I believe it is preferable to absorb these differences between services and expose a common interface.

Reference case: Python Library

In the Python library, ChatAnthropic accepts ChatPromptTemplate.from_messages([("system", system), ("human", human)]).
https://python.langchain.com/docs/integrations/chat/anthropic/

Proposal

As a preprocessing step in LLM::Anthropic#chat, how about extracting the role: system messages from messages, combining them with the top-level system argument, and passing them, joined with line breaks, as the API's system parameter?
This would not break the existing behavior and would also handle cases with multiple role: system messages.
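
A minimal sketch of that preprocessing, assuming plain role/content hashes (the merge_system_messages helper is hypothetical, not actual langchainrb code):

# Extract role: system messages and join them, together with the
# top-level `system` argument, into a single system prompt.
def merge_system_messages(messages, system: nil)
  system_msgs, chat_msgs = messages.partition { |m| m[:role] == "system" }
  prompt = [system, *system_msgs.map { |m| m[:content] }].compact.join("\n")
  [prompt.empty? ? nil : prompt, chat_msgs]
end

system_prompt, chat_messages = merge_system_messages(
  [
    { role: "system", content: "Act as a professional programmer." },
    { role: "user", content: "What is LLM?" }
  ]
)
system_prompt # => "Act as a professional programmer."
chat_messages # => [{ role: "user", content: "What is LLM?" }]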

@kokuyouwind Thank you for this proposal! I'm curious: do you have a need for this in your applications? Would this make passing the system role easier for you?

@andreibondarev

Thanks for the reply.

I'm curious: do you have a need for this in your applications? Would this make passing the system role easier for you?

I am building a development tool that lets users freely configure their preferred LLM.
The tool generates messages containing role: system and calls chat, so Anthropic cannot be used.

ref. kokuyouwind/rbs_goose@6609103#diff-439afd54f569f0ca0e76999e2d9819d89e82376de99b31652b1aafc81ec64144R91-R93

To work around this, one of the following is needed:

  • Change how messages are generated depending on whether or not the LLM is Anthropic.
  • Give up on system role messages and fold them all into user role messages (see the sketch after this list).
  • Modify LLM::Anthropic to accept role: system messages (this proposal).
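
A minimal sketch of the second option, again assuming plain role/content hashes (fold_system_into_user is a hypothetical helper):

# Drop role: system messages and prepend their combined content to the
# first user message instead.
def fold_system_into_user(messages)
  system_msgs, rest = messages.partition { |m| m[:role] == "system" }
  return rest if system_msgs.empty?

  preamble = system_msgs.map { |m| m[:content] }.join("\n")
  user = rest.find { |m| m[:role] == "user" }
  user[:content] = "#{preamble}\n\n#{user[:content]}" if user
  rest
end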

If this proposal is accepted, the tool would not need to care about the LLM type and could be implemented simply.
I also think it would be good to make the interfaces compatible so that Agents, etc. can be used with any LLM in the future.

@kokuyouwind Thank you for your PR and for using this library in your gem 😄

I'd like to actually think through this after the Langchain::Assistant Anthropic support is added here: #543.

We would need to add an AnthropicMessage class like this one: https://github.com/patterns-ai-core/langchainrb/pull/513/files#diff-86baf19d3db04ca4b773792c27230e17bb4ba4f9373d17688b8a2f67de6f9c28
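
As a rough sketch, such a class might look like the following (an assumed shape for illustration; the linked PR has the real implementation):

# Assumed shape of an Anthropic-specific message class: it validates the
# roles Anthropic accepts and serializes to the API's hash format.
module Langchain
  module Messages
    class AnthropicMessage
      ROLES = %w[user assistant].freeze

      attr_reader :role, :content

      def initialize(role:, content:)
        raise ArgumentError, "invalid role: #{role}" unless ROLES.include?(role)

        @role = role
        @content = content
      end

      def to_hash
        { role: role, content: content }
      end
    end
  end
end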

@andreibondarev
OK, it's true that a Messages class layer seems better suited than a raw-data layer for absorbing incompatibilities between LLMs.

Personally, I would be happy if it could also be used in cases where Assistant is not involved (cases where LLM::XXXClient#chat is called directly, or where a chain such as RAG or a QA bot is set up).
For that, a new layer such as LLM::Messages or Prompt::ChatPromptTemplate would be needed instead of Assistants::Messages, so that it could be passed to LLM::Client#chat. That would be a major modification and may be difficult to undertake immediately.

You may close this issue and pull request #604, as I have resolved all of my original issues by moving everything into the user message.

If you're open to it, could we start another issue, “Separating implementations not related to tools from Assistant”?
As you say, Assistant can absorb the differences between LLMs, but it seems to include implementations that are not directly related to tools.
By separating these from Assistant, I think we could absorb incompatibilities in how system prompts are passed even when tools are not used, and also handle cases where we want to use Threads directly, as mentioned in #608.

@kokuyouwind I've been thinking that the #chat(messages: []) method could accept Langchain::Messages::* instances directly. For example:

message_1 = Langchain::Messages::AnthropicMessage.new(role: "user", content: "hi!")
message_2 = Langchain::Messages::AnthropicMessage.new(role: "assistant", content: "Hey! How can I help?")
message_3 = Langchain::Messages::AnthropicMessage.new(role: "user", content: "Help me debug my computer")

Langchain::LLM::Anthropic.new(...).chat(messages: [message_1, message_2, message_3])

@andreibondarev
I think that's an excellent idea.
To make it more generic, I think it would be better to use per-role message classes that are LLM-independent, rather than AnthropicMessage.

message_1 = Langchain::Messages::UserMessage.new("hi!")
message_2 = Langchain::Messages::AssistantMessage.new("Hey! How can I help?")
message_3 = Langchain::Messages::UserMessage.new("Help me debug my computer")

Langchain::LLM::Anthropic.new(...).chat(messages: [message_1, message_2, message_3])

The class names above follow the role notation, but they could instead be aligned with Python LangChain's message classes, such as HumanMessage, AIMessage, etc.
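
As a rough sketch of that idea, per-role classes plus an Anthropic-specific conversion step might look like this (all names, including the to_anthropic_payload adapter, are hypothetical):

# Hypothetical LLM-independent message classes.
module Langchain
  module Messages
    SystemMessage    = Struct.new(:content)
    UserMessage      = Struct.new(:content)
    AssistantMessage = Struct.new(:content)
  end
end

# Hypothetical Anthropic adapter: system messages become the top-level
# `system` parameter, everything else becomes a role/content hash.
def to_anthropic_payload(messages)
  system_msgs, chat_msgs = messages.partition { |m| m.is_a?(Langchain::Messages::SystemMessage) }
  chat = chat_msgs.map do |m|
    role = m.is_a?(Langchain::Messages::UserMessage) ? "user" : "assistant"
    { role: role, content: m.content }
  end
  { system: system_msgs.map(&:content).join("\n"), messages: chat }
end

payload = to_anthropic_payload(
  [
    Langchain::Messages::SystemMessage.new("Act as a professional programmer."),
    Langchain::Messages::UserMessage.new("What is LLM?")
  ]
)
payload[:system]   # => "Act as a professional programmer."
payload[:messages] # => [{ role: "user", content: "What is LLM?" }]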

I hadn't noticed that a discussion forum was created in #629.
I will close this issue so we can continue the discussion there.