huggingface / chat-ui

Open source codebase powering the HuggingChat app

Home Page:https://huggingface.co/chat

Assistants do not work on Self-Hosted install w/Claude or Cohere endpoints

gururise opened this issue · comments

**Seems like any endpoint type that is not `openai` does not work with assistants**

Using any model on the Claude or Cohere endpoints of a self-hosted install, the assistant instructions are essentially ignored. I tried adding a ChatML `chatPromptTemplate` to `.env.local`, but it made no difference:

DOES NOT WORK:

  {
    "name": "CohereForAI/c4ai-command-r-plus",
    "id": "command-r-plus",
    "description" : "Command R+ is an instruction-following conversational model that performs language tasks at a higher quality, more reliably, and with a longer context than previous models. It is best suited for complex RAG workflows and multi-step tool use.",
    "preprompt": "",
    "chatPromptTemplate" : "{{#if @root.preprompt}}<|im_start|>system\n{{@root.preprompt}}<|im_end|>\n{{/if}}{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n{{/ifUser}}{{#ifAssistant}}{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}",
    "parameters": {
      "temperature": 0.4,
      "top_p": 0.9,
      "top_k": 50,
      "truncate": 65000,
      "max_new_tokens": 3128,
    },
    "endpoints" : [{
      "type": "cohere"
    }],
  },

DOES NOT WORK:

  {
      "name": "claude-3-sonnet-20240229",
      "displayName": "Claude 3 Sonnet",
      "description": "Ideal balance of intelligence and speed",
      "preprompt": "",
      "chatPromptTemplate" : "{{#if @root.preprompt}}<|im_start|>system\n{{@root.preprompt}}<|im_end|>\n{{/if}}{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n{{/ifUser}}{{#ifAssistant}}{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}",
      "parameters": {
        "temperature": 0.4,
        "top_p": 0.95,
        "top_k": 50,
        "max_new_tokens": 3128,
        "truncate": 65128,
      },
      "endpoints": [
        {
          "type": "anthropic",
          // optionals
          "baseURL": "https://api.anthropic.com",
          "defaultHeaders": {},
          "defaultQuery": {}
        }
      ]
  },

WORKS:

  {
    "name": "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO",
    "displayName": "Nous-Hermes-2-Mixtral-DPO",
    "description" : "Nous Hermes 2 Mixtral 8x7B DPO is the new flagship Nous Research model trained over the Mixtral 8x7B MoE LLM.",
    "preprompt": "",
    "chatPromptTemplate" : "{{#if @root.preprompt}}<|im_start|>system\n{{@root.preprompt}}<|im_end|>\n{{/if}}{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n{{/ifUser}}{{#ifAssistant}}{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}",
    "parameters": {
      "temperature": 0.4,
      "top_p": 0.7,
      "top_k": 50,
      "truncate": 30720,
      "max_new_tokens": 2048,
      "stop": ["<|im_end|>","<|im_start|>"]
    },
    "endpoints" : [{
      "type": "openai",
      "baseURL": "https://api.together.xyz/v1",
    }],
  },

The assistants work fine on the openai endpoint (tested with gpt-3.5-turbo and Together.ai).

Thanks for the report! Could it be that the instructions are not passed correctly? Do you have similar issues if you pass a system prompt directly? (no assistant, just edit the system prompt in settings)

System prompt: Act like a pirate. Everything you say will be piratey.

*(Screenshot: Screenshot_20240412_055551_Kiwi Browser)*

Seems the custom system prompt is being passed to Claude.

However, if I create a pirate assistant with the same system prompt, it does not work:

*(Screenshots: Screenshot_20240412_055931_Kiwi Browser, Screenshot_20240412_060315_Kiwi Browser)*

A few things I noticed:

  • The assistant would return an error about an invalid int for temperature, top-p, and top-k when trying to respond, unless I manually set temperature, top-p, and top-k in the assistant form.
  • Even after setting them, an error message would still flash briefly on every response; it was too fast for me to capture. The model still responds, just without using the assistant instructions.

I think I made a mistake somewhere. There are two ways of passing system prompts: some endpoints require the system prompt as a dedicated argument, while others expect it as the first message in the conversation, e.g. `{"role": "system", "content": "talk like a pirate"}`.
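To illustrate the distinction, here is a minimal TypeScript sketch (not chat-ui's actual code; the type and function names are made up for this example). OpenAI-style APIs take the system prompt as a `system`-role message inside the messages array, while Anthropic-style APIs take it as a separate top-level `system` parameter, so a conversion step has to pull it out:

```typescript
// Hypothetical sketch of the two system-prompt conventions.
type Message = { role: "system" | "user" | "assistant"; content: string };

// OpenAI-style: the system prompt rides along as the first message.
const openaiStyle: Message[] = [
  { role: "system", content: "talk like a pirate" },
  { role: "user", content: "hello" },
];

// Anthropic-style: the system prompt is a separate top-level field, and
// the messages array must contain only user/assistant turns. If the
// conversion below is skipped, system-role messages are silently dropped
// or rejected, and assistant instructions are effectively ignored.
function toAnthropicStyle(
  messages: Message[]
): { system: string; messages: Message[] } {
  const system = messages
    .filter((m) => m.role === "system")
    .map((m) => m.content)
    .join("\n");
  return { system, messages: messages.filter((m) => m.role !== "system") };
}
```

If an endpoint adapter forgets this conversion, the base model still answers normally, which matches the symptom above: responses arrive, but without the assistant instructions.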

Must have messed up somewhere 😅 will take a look

Closing with #1023