huggingface / chat-ui

Open source codebase powering the HuggingChat app

Home Page:https://huggingface.co/chat

Assistants do not work on Self-Hosted install w/Claude or Cohere endpoints

gururise opened this issue · comments

**Seems like any endpoint type that is not `openai` does not work with assistants**

Using any model on the Claude or Cohere endpoints of a self-hosted install, the assistant instructions are essentially ignored. I tried adding a ChatML `chatPromptTemplate` to `.env.local`, but it made no difference:

DOES NOT WORK:

  {
    "name": "CohereForAI/c4ai-command-r-plus",
    "id": "command-r-plus",
    "description" : "Command R+ is an instruction-following conversational model that performs language tasks at a higher quality, more reliably, and with a longer context than previous models. It is best suited for complex RAG workflows and multi-step tool use.",
    "preprompt": "",
    "chatPromptTemplate" : "{{#if @root.preprompt}}<|im_start|>system\n{{@root.preprompt}}<|im_end|>\n{{/if}}{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n{{/ifUser}}{{#ifAssistant}}{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}",
    "parameters": {
      "temperature": 0.4,
      "top_p": 0.9,
      "top_k": 50,
      "truncate": 65000,
      "max_new_tokens": 3128,
    },
    "endpoints" : [{
      "type": "cohere"
    }],
  },

DOES NOT WORK:

  {
      "name": "claude-3-sonnet-20240229",
      "displayName": "Claude 3 Sonnet",
      "description": "Ideal balance of intelligence and speed",
      "preprompt": "",
      "chatPromptTemplate" : "{{#if @root.preprompt}}<|im_start|>system\n{{@root.preprompt}}<|im_end|>\n{{/if}}{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n{{/ifUser}}{{#ifAssistant}}{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}",
      "parameters": {
        "temperature": 0.4,
        "top_p": 0.95,
        "top_k": 50,
        "max_new_tokens": 3128,
        "truncate": 65128,
      },
      "endpoints": [
        {
          "type": "anthropic",
          // optionals
          "baseURL": "https://api.anthropic.com",
          "defaultHeaders": {},
          "defaultQuery": {}
        }
      ]
  },

WORKS:

  {
    "name": "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO",
    "displayName": "Nous-Hermes-2-Mixtral-DPO",
    "description" : "Nous Hermes 2 Mixtral 8x7B DPO is the new flagship Nous Research model trained over the Mixtral 8x7B MoE LLM.",
    "preprompt": "",
    "chatPromptTemplate" : "{{#if @root.preprompt}}<|im_start|>system\n{{@root.preprompt}}<|im_end|>\n{{/if}}{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n{{/ifUser}}{{#ifAssistant}}{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}",
    "parameters": {
      "temperature": 0.4,
      "top_p": 0.7,
      "top_k": 50,
      "truncate": 30720,
      "max_new_tokens": 2048,
      "stop": ["<|im_end|>","<|im_start|>"]
    },
    "endpoints" : [{
      "type": "openai",
      "baseURL": "https://api.together.xyz/v1",
    }],
  },

The assistants work fine on the openai endpoint (tested with gpt-3.5-turbo and Together.ai).

Thanks for the report! Could it be that the instructions are not passed correctly? Do you have similar issues if you pass a system prompt directly? (no assistant, just edit the system prompt in settings)

System prompt: Act like a pirate. Everything you say will be piratey.

*(Screenshot: Screenshot_20240412_055551_Kiwi Browser)*

Seems the custom system prompt is being passed to Claude.

However, if I create a pirate assistant with the same system prompt, it does not work:

*(Screenshots: Screenshot_20240412_055931_Kiwi Browser, Screenshot_20240412_060315_Kiwi Browser)*

A few things I noticed:

  • The assistant would return an error about an invalid int for temperature, top-p, and top-k when trying to respond, unless I manually set temperature, top-p, and top-k in the assistant form.
  • Even after setting them, an error message would still flash briefly on every response; it was too fast for me to capture. The model still responds, just without using the assistant instructions.

I think I made a mistake somewhere. There are two ways of passing system prompts: some endpoints require the system prompt as a dedicated argument, while others expect it as the first message in the conversation, e.g. `{"role": "system", "content": "talk like a pirate"}`.
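To illustrate the distinction, here is a minimal TypeScript sketch (not chat-ui's actual code; the type and function names are made up for this example). OpenAI-style APIs take the system prompt as a `system`-role message inside the messages array, while Anthropic-style APIs take it as a separate top-level `system` parameter, so a conversion step has to pull it out:

```typescript
// Hypothetical sketch of the two system-prompt conventions.
type Message = { role: "system" | "user" | "assistant"; content: string };

// OpenAI-style: the system prompt rides along as the first message.
const openaiStyle: Message[] = [
  { role: "system", content: "talk like a pirate" },
  { role: "user", content: "hello" },
];

// Anthropic-style: the system prompt is a separate top-level field, and
// the messages array must contain only user/assistant turns. If the
// conversion below is skipped, system-role messages are silently dropped
// or rejected, and assistant instructions are effectively ignored.
function toAnthropicStyle(
  messages: Message[]
): { system: string; messages: Message[] } {
  const system = messages
    .filter((m) => m.role === "system")
    .map((m) => m.content)
    .join("\n");
  return { system, messages: messages.filter((m) => m.role !== "system") };
}
```

If an endpoint adapter forgets this conversion, the base model still answers normally, which matches the symptom above: responses arrive, but without the assistant instructions.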

Must have messed up somewhere 😅 will take a look

Closing with #1023