"[FAIL] Error: Invalid provider definition for output type 'text'" when using model-graded metrics

Question

jkng10 opened this issue 3 months ago

jkng10 commented 3 months ago

Hey all,

I'm using promptfoo 0.50.1 and would like to use model-graded metrics. I'm wondering if something is wrong with my config.

Following #323 (comment), I am using a text-embedding-ada-002 deployment, deployed via Azure OpenAI.

Below is my config:

...
- type: llm-rubric
  value: is not apologetic
  provider:
    - id: azureopenai:embeddings:text-embedding-ada-002
      config:
        temperature: 0
        max_tokens: 128
        ...
...

When running promptfoo eval against the above, I get the error from the title: [FAIL] Error: Invalid provider definition for output type 'text'

Ian Webster · Answer 1 · Thu Apr 04 2024 17:33:11 GMT+0800 (China Standard Time)

It should be a single object, not an array:

- type: llm-rubric
  value: is not apologetic
  provider:
    id: azureopenai:embeddings:text-embedding-ada-002
    config:
      temperature: 0
      max_tokens: 128
      #...

jkng10 · Answer 2 · Thu Apr 04 2024 17:49:47 GMT+0800 (China Standard Time)

Thanks @typpo, I appreciate the quick feedback. That error is gone! However, I'm now getting the one below:

Error: Not implemented
    at AzureOpenAiEmbeddingProvider.callApi (...\npm\node_modules\promptfoo\dist\src\providers\azureopenai.js:76:15)
    at matchesLlmRubric (...\npm\node_modules\promptfoo\dist\src\matchers.js:227:38)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async runAssertion (...\npm\node_modules\promptfoo\dist\src\assertions.js:667:17)
    at async ...\npm\node_modules\promptfoo\dist\src\assertions.js:82:24

Is there any difference between using

id: azureopenai:embedding:text-embedding-ada-002
as per https://www.promptfoo.dev/docs/configuration/expected-outputs/similar/

or

id: azureopenai:embeddings:text-embeddings-ada-002
as per #323 (comment)

Ian Webster · Answer 3 · Thu Apr 04 2024 18:13:08 GMT+0800 (China Standard Time)

Ah, I didn't notice you were using llm-rubric. That requires a text generation provider (like gpt-4), not an embedding provider. More details here
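
As a rough illustration of the distinction (a sketch, not copied from the docs): llm-rubric is graded by a chat model, while an embedding provider fits an embedding-based assertion such as similar. The provider IDs follow the forms quoted in this thread. The deployment names, the expected text, and the threshold are placeholders and assumptions, so adjust them to your own Azure setup.

```yaml
assert:
  # llm-rubric asks a text generation (chat) model to grade the output
  - type: llm-rubric
    value: is not apologetic
    provider:
      id: azureopenai:chat:gpt-4   # assumed deployment name
  # similar compares embeddings, so an embedding provider belongs here
  - type: similar
    value: <expected text>   # placeholder
    threshold: 0.85          # placeholder
    provider:
      id: azureopenai:embedding:text-embedding-ada-002   # assumed deployment name
```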

jkng10 · Answer 4 · Thu Apr 04 2024 23:00:15 GMT+0800 (China Standard Time)

Thanks @typpo ! It works now!

One last quick question: can provider details/config also be defined more globally, at the "assert" level?

I'd like to use the gpt-4 provider for all subsequent llm-rubric and context-* types. Currently I specify the same provider details/config (e.g. apiKey and apiHost) again for each "type", which looks redundant.

Currently:

assert:
  - type: llm-rubric
    value: does not describe self as an AI or chat assistant
    threshold: 0.85
    provider:
      id: azureopenai:chat:gpt-4
      config:
        apiKey: <key>
        apiHost: <url>
  - type: llm-rubric
    value: is not apologetic
    threshold: 0.85
    provider:
      id: azureopenai:chat:gpt-4
      config:
        apiKey: <key>
        apiHost: <url>
  - type: llm-rubric
    value: is confident and does not guess or speculate
    threshold: 0.85
    provider:
      id: azureopenai:chat:gpt-4
      config:
        apiKey: <key>
        apiHost: <url>

Ian Webster · Answer 5 · Fri Apr 05 2024 02:43:31 GMT+0800 (China Standard Time)

Yep, you can put it on the defaultTest object. See here: https://promptfoo.dev/docs/configuration/expected-outputs/model-graded/#multiple-graders
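
A minimal sketch of what that could look like for the config above, assuming the grader goes under defaultTest.options.provider as described on the linked page. The key and host are placeholders as before, and the single test case under tests is an assumed layout rather than the poster's actual file.

```yaml
# Grader shared by every model-graded assertion that has no provider of its own
defaultTest:
  options:
    provider:
      id: azureopenai:chat:gpt-4
      config:
        apiKey: <key>
        apiHost: <url>

tests:
  - assert:
      - type: llm-rubric
        value: does not describe self as an AI or chat assistant
        threshold: 0.85
      - type: llm-rubric
        value: is not apologetic
        threshold: 0.85
      - type: llm-rubric
        value: is confident and does not guess or speculate
        threshold: 0.85
```

With this layout the per-assertion provider blocks can be dropped. An assertion that needs a different grader can still set its own provider.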

jkng10 · Answer 6 · Fri Apr 05 2024 14:46:55 GMT+0800 (China Standard Time)

Works like a charm, thanks @typpo!