promptfoo / promptfoo

Test your prompts, agents, and RAGs. Use LLM evals to improve your app's quality and catch problems. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

Home Page: https://www.promptfoo.dev/

"[FAIL] Error: Invalid provider definition for output type 'text'" when using model-graded metrics

jkng10 opened this issue

Hey all,

I'm using promptfoo 0.50.1 and would like to use model-graded metrics, but I'm wondering if something is wrong with my config.

Referring to #323 (comment) - I am using a text-embedding-ada-002 deployment, deployed via Azure OpenAI.

Below is my config:

...
- type: llm-rubric
  value: is not apologetic
  provider:
    - id: azureopenai:embeddings:text-embedding-ada-002
      config:
        temperature: 0
        max_tokens: 128
        ...
...

When running promptfoo eval against the above, I get the following error:

[image: screenshot of the error]

It should be a single object, not an array:

- type: llm-rubric
  value: is not apologetic
  provider:
    id: azureopenai:embeddings:text-embedding-ada-002
    config:
      temperature: 0
      max_tokens: 128
      #...

Thanks @typpo, I appreciate the quick feedback. That error is gone! However, I'm now getting the error below:

Error: Not implemented
    at AzureOpenAiEmbeddingProvider.callApi (...\npm\node_modules\promptfoo\dist\src\providers\azureopenai.js:76:15)
    at matchesLlmRubric (...\npm\node_modules\promptfoo\dist\src\matchers.js:227:38)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async runAssertion (...\npm\node_modules\promptfoo\dist\src\assertions.js:667:17)
    at async ...\npm\node_modules\promptfoo\dist\src\assertions.js:82:24

Is there any difference between using

id: azureopenai:embedding:text-embedding-ada-002
as per https://www.promptfoo.dev/docs/configuration/expected-outputs/similar/

or

id: azureopenai:embeddings:text-embeddings-ada-002
as per #323 (comment)

Ah, I didn't notice you were using llm-rubric. That assertion requires a text generation provider (like gpt-4), not an embedding provider. More details here
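
As a minimal sketch of that fix, the earlier assertion could point at a chat deployment instead of the embedding one. The deployment name gpt-4 is an assumption here (it matches the id used later in this thread); substitute the name of your own Azure OpenAI chat deployment.

```yaml
- type: llm-rubric
  value: is not apologetic
  provider:
    # llm-rubric grades output with a text generation (chat) model,
    # so an embeddings deployment cannot be used here.
    # "gpt-4" is an assumed Azure deployment name; replace it with your own.
    id: azureopenai:chat:gpt-4
    config:
      temperature: 0
      max_tokens: 128
```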

Thanks @typpo ! It works now!

One last quick question: can provider details/config also be defined more globally, at the "assert" level?

I'd like to use the gpt-4 provider for all subsequent llm-rubric and context-* assertion types. Currently I specify the same provider details/config (e.g. apiKey and apiHost) again for each "type", which looks redundant.

currently:

assert:
  - type: llm-rubric
    value: does not describe self as an AI or chat assistant
    threshold: 0.85
    provider:
      id: azureopenai:chat:gpt-4
      config:
        apiKey: <key>
        apiHost: <url>
  - type: llm-rubric
    value: is not apologetic
    threshold: 0.85
    provider:
      id: azureopenai:chat:gpt-4
      config:
        apiKey: <key>
        apiHost: <url>
  - type: llm-rubric
    value: is confident and does not guess or speculate
    threshold: 0.85
    provider:
      id: azureopenai:chat:gpt-4
      config:
        apiKey: <key>
        apiHost: <url>
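
One way to avoid this repetition is to set the grading provider once for all model-graded assertions. The sketch below assumes that promptfoo's defaultTest.options.provider setting is available in the version in use. The `<key>` and `<url>` placeholders are the same as above, and the surrounding tests layout is illustrative only.

```yaml
# Assumption: defaultTest.options.provider is supported by your promptfoo version.
defaultTest:
  options:
    # Grading provider shared by the model-graded assertions below.
    provider:
      id: azureopenai:chat:gpt-4
      config:
        apiKey: <key>
        apiHost: <url>

tests:
  - assert:
      - type: llm-rubric
        value: does not describe self as an AI or chat assistant
        threshold: 0.85
      - type: llm-rubric
        value: is not apologetic
        threshold: 0.85
      - type: llm-rubric
        value: is confident and does not guess or speculate
        threshold: 0.85
```

A provider block on an individual assertion should still take precedence over this default, so one-off overrides should remain possible.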

Works like a charm, thanks @typpo!