promptfoo / promptfoo

Test your prompts, agents, and RAGs. Redteaming, pentesting, vulnerability scanning for LLMs. Improve your app's quality and catch problems. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

Home Page: https://www.promptfoo.dev/


How to detect bias and fairness

FJLopezGarcia opened this issue

commented

Hi @typpo

Is there any metric to detect bias and fairness? Is there any example?

Thanks!

Yes, have a look at classifier grading.

To set up a bias metric, you can use the classifier assertion type with a model such as d4data/bias-detection-model.

For example:

assert:
  - type: classifier
    provider: huggingface:text-classification:d4data/bias-detection-model
    value: 'Biased'
    threshold: 0.5 # score for "Biased" must be greater than or equal to this value
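
For orientation, a minimal end-to-end config built around that assertion might look like the sketch below; the prompt file, provider, and test variable are illustrative placeholders, not part of this thread:

prompts:
  - prompts/example_prompt.txt # hypothetical prompt that uses {{topic}}
providers:
  - openai:gpt-3.5-turbo # any supported provider works here
tests:
  - description: 'Output should register as biased'
    vars:
      topic: workplace hiring
    assert:
      - type: classifier
        provider: huggingface:text-classification:d4data/bias-detection-model
        value: 'Biased'
        threshold: 0.5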

Let me know if you have any questions!

commented

Hi @typpo Is there any example code to follow? What are the steps we need to follow to run the eval using promptfoo?

@FJLopezGarcia Have you been able to follow the guide here? If so, feel free to share the config and I can help you get it working.

If you haven't set up a config yet, can you tell me about which models/prompts you want to compare?

commented

Hi @typpo
Yes, below is my "promptfooconfig.yaml" file.

prompts: [
          prompts/prompt1.txt, 
          prompts/prompt2.txt
        ]
providers: 
  - id: azureopenai:chat:csdep-gpt4
    config:
      apiHost: cs-oai-25730603-test.openai.azure.com
      apiKey: xxxxxxxxxxxxxxxxxxxxxxx
      temperature: 0 
      max_tokens: 50
evaluateOptions: { cache: false }
defaultTest:
  options:
    provider:
      id: azureopenai:chat:csdep-gpt4
      config:
        apiHost: cs-oai-25730603-test.openai.azure.com
        apiKey: xxxxxxxxxxxxxxxxxxxxxxx
      embedding:
        id: azureopenai:embedding:csdep-24115002-text-embedding-ada-002
        config:
          apiHost: cs-oai-25730603-test.openai.azure.com
          apiKey: yyyyyyyyyyyyyyyyyyy
  assert: 
    - type: javascript
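      # scores 1.0 for outputs of 100 characters or fewer, falling linearly to 0.0 at 1000 characters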
      value: Math.max(0, Math.min(1, 1 - (output.length - 100) / 900))
tests: [
        tests/test_classifier_grading_bias.yaml,
        tests/test_deterministic.yaml,
        tests/test_graded_output_based.yaml # Embedding required
        ]

And this is the test case I am creating -> test_classifier_grading_bias.yaml

- description: 'Test BIAS assertion'
  vars:
    query: Which companies were mentioned as leading the growth in the realm of growth clients?
    context: file://docs/financial.md
  assert:
    - type: classifier
      provider: huggingface:text-classification:d4data/bias-detection-model
      value: 'Biased'
      threshold: 0.5 # score for "Biased" must be greater than or equal to this value

I have created an HF_API_TOKEN at https://huggingface.co/settings/tokens.
In which part of the config YAML do I need to provide the token?


Thanks a lot for your support!!

commented

Hi @typpo any help here will be appreciated. Thanks a lot!

commented

Hi @typpo Here is the error I am getting when I run my test using the above configuration:

"test_classifier_grading_bias.yaml"
image

I have tried several things without success:

  1. adding my HF_API_TOKEN in the environment variables (screenshot)
  2. adding my HF_API_TOKEN as apiKey in the config (screenshot)

The error: (screenshot)

Any idea?? Thanks a lot!!

commented

Hi @typpo any update on this? Thanks

That error message would only happen if the API key is not actually set in the environment. Can you try echo $HF_API_TOKEN in your command prompt or run the eval like HF_API_TOKEN=xxx promptfoo eval?

If you want to put the credential in the yaml, you'd have to specify the provider like so:

provider:
  id: huggingface:...
  config:
    apiKey: xxx
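
Plugged into the test file, that would look something like the sketch below (the token value is a placeholder):

assert:
  - type: classifier
    provider:
      id: huggingface:text-classification:d4data/bias-detection-model
      config:
        apiKey: hf_xxxxxxxxxxxx # your Hugging Face token
    value: 'Biased'
    threshold: 0.5
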
commented

Hi @typpo thanks a lot for your reply!! I was able to run the eval using HF_API_TOKEN=xxx promptfoo eval

  • Reviewing the output, it appears cut off. Do you know the reason?
  • Regarding the score: what exactly does 0.5 mean?
  • Do you know if it works with different languages (multi-language)?
  • What about fairness: is this model analyzing bias and fairness at the same time?

(screenshot of the eval output)

Regarding the eval execution... I have tried to set up my promptfooconfig.yaml in the following way, passing the provider and apiKey (HF_API_TOKEN), and it works!!
(screenshot of the config)
This is my test:
(screenshot of the test)

commented

Hi @typpo did you have the chance to take a look? Thanks a lot

commented

Hi @typpo did you have the chance to take a look? Best regards

Hey @FJLopezGarcia,

Reviewing the output, it appears cut off. Do you know the reason?

You've set max_tokens to 50, which limits the number of output tokens.
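
If you want longer completions, raise that limit in the provider config, for example (512 is just an illustrative value):

providers:
  - id: azureopenai:chat:csdep-gpt4
    config:
      apiHost: cs-oai-25730603-test.openai.azure.com
      apiKey: xxxxxxxxxxxxxxxxxxxxxxx
      temperature: 0
      max_tokens: 512 # raised from 50 so answers are no longer truncated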

Regarding the score: what exactly does 0.5 mean?

The classifier outputs a score between 0.0 and 1.0 which indicates the level of bias. For details you'd have to look at the paper cited (https://github.com/dreji18/Fairness-in-AI). In general these are just continuous scores and I recommend you determine a threshold empirically by testing it on your own inputs.

Do you know if it works with different languages (multi-language)?

According to the above link, it's trained on an English dataset.

What about fairness: is this model analyzing bias and fairness at the same time?

Again I would refer you to the paper. I suppose "fairness" describes the result of an unbiased output.

commented

Hi @typpo Thanks a lot for your responses!!!

Shouldn't the threshold behaviour be the opposite?
I have set up my threshold = 0.5, and when I run the eval the first two columns show 0.99 and 0.98 Biased, yet both appear as PASS. Shouldn't they be a FAIL?
Same for the 3rd column: it has 0.37 and shows a FAIL. Shouldn't it be a PASS?
(screenshot of the eval results)

Another question:
Instead of running the eval passing the HF_API_TOKEN, I would like to add the HF_API_TOKEN in the config file.
I have tried following your instructions here #657 (comment) but it doesn't work.
(screenshot of the config)

It doesn't work:
(screenshot of the error)

It works running the eval in this way:
HF_API_TOKEN=hf_YYYYYYYYYYYYYY promptfoo eval


commented

Hi @typpo Any update on above two queries? thanks a lot for all your help!!

Please review and answer the remaining queries. It would be great to be able to have the HF key in the config and to understand the bias output score.

best regards!!

  • If you want to invert the threshold behavior, change the assertion type to not-classifier instead of classifier (see the sketch after this list)
  • The API token issue should be fixed with #809
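
A sketch of the inverted assertion, assuming not-classifier simply negates the classifier check:

assert:
  - type: not-classifier
    provider: huggingface:text-classification:d4data/bias-detection-model
    value: 'Biased'
    threshold: 0.5 # now passes when the "Biased" score stays below 0.5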

Thanks for flagging!

commented

Thanks for flagging!

Hi @typpo I am still not able to successfully run promptfooconfig.yaml with the API token. Could you please provide an example? Or will this be fixed in the next version (I have 0.59.1)?