patterns-ai-core / langchainrb

Build LLM-powered applications in Ruby

Home Page: https://rubydoc.info/gems/langchainrb


Cannot specify role: system in LLM::Anthropic

kokuyouwind opened this issue

Description

With OpenAI and Ollama, system, user, and assistant can all be specified as roles.
With Anthropic, however, only user and assistant are accepted, and specifying system raises an error.

> Langchain::LLM::Anthropic.new.chat(messages: [{ role: 'system', content: 'Act as a professional programmer.'}, { role: 'user', content: 'What is LLM?' }])
#<Langchain::LLM::AnthropicResponse:0x00000001295e3768
 @model=nil,
 @raw_response=
  {"type"=>"error",
   "error"=>
    {"type"=>"invalid_request_error",
     "message"=>"messages: Unexpected role \"system\". The Messages API accepts a top-level `system` parameter, not \"system\" as an input message role."}}>

I know that the Anthropic API specification does not allow system as a message role and instead requires the top-level system parameter.
However, as an LLM framework, I believe it is preferable to absorb these differences between services and expose a common interface.

Reference case: Python Library

In the Python library, ChatAnthropic accepts ChatPromptTemplate.from_messages([("system", system), ("human", human)]).
https://python.langchain.com/docs/integrations/chat/anthropic/

Proposal

As a preprocessing step in LLM::Anthropic#chat, how about extracting the role: system messages from messages, combining them with the top-level system argument, and passing them, joined with line breaks, as the API's system parameter?
This would not break the existing behavior and would also handle cases with multiple role: system messages.
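
A minimal sketch of that preprocessing, assuming plain role/content hashes (the merge_system_messages helper is hypothetical, not actual langchainrb code):

# Extract role: system messages and join them, together with the
# top-level `system` argument, into a single system prompt.
def merge_system_messages(messages, system: nil)
  system_msgs, chat_msgs = messages.partition { |m| m[:role] == "system" }
  prompt = [system, *system_msgs.map { |m| m[:content] }].compact.join("\n")
  [prompt.empty? ? nil : prompt, chat_msgs]
end

system_prompt, chat_messages = merge_system_messages(
  [
    { role: "system", content: "Act as a professional programmer." },
    { role: "user", content: "What is LLM?" }
  ]
)
system_prompt # => "Act as a professional programmer."
chat_messages # => [{ role: "user", content: "What is LLM?" }]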

@kokuyouwind Thank you for this proposal! I'm curious: do you have a need for this in your applications? Would this make passing the system role easier for you?

@andreibondarev

Thanks for the reply.

I'm curious: do you have a need for this in your applications? Would this make passing the system role easier for you?

I am building a development tool that lets users freely configure their preferred LLM.
The tool generates messages containing role: system and calls chat, so Anthropic cannot be used.

ref. kokuyouwind/rbs_goose@6609103#diff-439afd54f569f0ca0e76999e2d9819d89e82376de99b31652b1aafc81ec64144R91-R93

To work around this, one of the following is needed:

  • Change how messages are generated depending on whether or not the LLM is Anthropic.
  • Give up on system role messages and fold them all into user role messages (see the sketch after this list).
  • Modify LLM::Anthropic to accept role: system messages (this proposal).
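
A minimal sketch of the second option, again assuming plain role/content hashes (fold_system_into_user is a hypothetical helper):

# Drop role: system messages and prepend their combined content to the
# first user message instead.
def fold_system_into_user(messages)
  system_msgs, rest = messages.partition { |m| m[:role] == "system" }
  return rest if system_msgs.empty?

  preamble = system_msgs.map { |m| m[:content] }.join("\n")
  user = rest.find { |m| m[:role] == "user" }
  user[:content] = "#{preamble}\n\n#{user[:content]}" if user
  rest
end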

If this proposal is accepted, the tool would not need to care about the LLM type and could be implemented simply.
I also think it would be good to make the interfaces compatible so that Agents, etc. can be used with any LLM in the future.

@kokuyouwind Thank you for your PR and for using this library in your gem 😄

I'd like to actually think through this after the Langchain::Assistant Anthropic support is added here: #543.

We would need to add an AnthropicMessage class like this one: https://github.com/patterns-ai-core/langchainrb/pull/513/files#diff-86baf19d3db04ca4b773792c27230e17bb4ba4f9373d17688b8a2f67de6f9c28
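
As a rough sketch, such a class might look like the following (an assumed shape for illustration; the linked PR has the real implementation):

# Assumed shape of an Anthropic-specific message class: it validates the
# roles Anthropic accepts and serializes to the API's hash format.
module Langchain
  module Messages
    class AnthropicMessage
      ROLES = %w[user assistant].freeze

      attr_reader :role, :content

      def initialize(role:, content:)
        raise ArgumentError, "invalid role: #{role}" unless ROLES.include?(role)

        @role = role
        @content = content
      end

      def to_hash
        { role: role, content: content }
      end
    end
  end
end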

@andreibondarev
OK, it's true that a Messages class layer seems better suited than a raw-data layer for absorbing incompatibilities between LLMs.

Personally, I would be happy if it could also be used in cases where Assistant is not involved (cases where LLM::XXXClient#chat is called directly, or where a chain such as RAG or a QA bot is set up).
For that, a new layer such as LLM::Messages or Prompt::ChatPromptTemplate would be needed instead of Assistants::Messages, so that it could be passed to LLM::Client#chat. That would be a major modification and may be difficult to undertake immediately.

You may close this issue and pull request #604, as I have resolved all of my original issues by moving everything into the user message.

If you're open to it, could we start another issue, “Separating implementations not related to tools from Assistant”?
As you say, Assistant can absorb the differences between LLMs, but it seems to include implementations that are not directly related to tools.
By separating these from Assistant, I think we could absorb incompatibilities in how system prompts are passed even when tools are not used, and also handle cases where we want to use Threads directly, as mentioned in #608.

@kokuyouwind I've been thinking that the #chat(messages: []) method could accept Langchain::Messages::* instances directly. For example:

message_1 = Langchain::Messages::AnthropicMessage.new(role: "user", content: "hi!")
message_2 = Langchain::Messages::AnthropicMessage.new(role: "assistant", content: "Hey! How can I help?")
message_3 = Langchain::Messages::AnthropicMessage.new(role: "user", content: "Help me debug my computer")

Langchain::LLM::Anthropic.new(...).chat(messages: [message_1, message_2, message_3])

@andreibondarev
I think that's an excellent idea.
To make it more generic, I think it would be better to use per-role message classes that are LLM-independent, rather than AnthropicMessage.

message_1 = Langchain::Messages::UserMessage.new("hi!")
message_2 = Langchain::Messages::AssistantMessage.new("Hey! How can I help?")
message_3 = Langchain::Messages::UserMessage.new("Help me debug my computer")

Langchain::LLM::Anthropic.new(...).chat(messages: [message_1, message_2, message_3])

The class names above follow the role notation, but they could instead be aligned with Python LangChain's message classes, such as HumanMessage, AIMessage, etc.
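
As a rough sketch of that idea, per-role classes plus an Anthropic-specific conversion step might look like this (all names, including the to_anthropic_payload adapter, are hypothetical):

# Hypothetical LLM-independent message classes.
module Langchain
  module Messages
    SystemMessage    = Struct.new(:content)
    UserMessage      = Struct.new(:content)
    AssistantMessage = Struct.new(:content)
  end
end

# Hypothetical Anthropic adapter: system messages become the top-level
# `system` parameter, everything else becomes a role/content hash.
def to_anthropic_payload(messages)
  system_msgs, chat_msgs = messages.partition { |m| m.is_a?(Langchain::Messages::SystemMessage) }
  chat = chat_msgs.map do |m|
    role = m.is_a?(Langchain::Messages::UserMessage) ? "user" : "assistant"
    { role: role, content: m.content }
  end
  { system: system_msgs.map(&:content).join("\n"), messages: chat }
end

payload = to_anthropic_payload(
  [
    Langchain::Messages::SystemMessage.new("Act as a professional programmer."),
    Langchain::Messages::UserMessage.new("What is LLM?")
  ]
)
payload[:system]   # => "Act as a professional programmer."
payload[:messages] # => [{ role: "user", content: "What is LLM?" }]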

I hadn't noticed that a discussion forum was created in #629.
I will close this issue so we can continue the discussion there.