How to detect bias and fairness
FJLopezGarcia opened this issue · comments
Yes, have a look at classifier grading.
To set up a bias metric, you can use the classifier
assertion type with a model such as d4data/bias-detection-model.
For example:
assert:
- type: classifier
provider: huggingface:text-classification:d4data/bias-detection-model
value: 'Biased'
threshold: 0.5 # score for "Biased" must be greater than or equal to this value
Let me know if you have any questions!
HI @typpo Is there any example code to follow? what are the steps we need to do to run the eval using promptfoo.
@FJLopezGarcia Have you been able to follow the guide here? If so, feel free to share the config and I can help you get it working.
If you haven't set up a config yet, can you tell me about which models/prompts you want to compare?
Hi @typpo
Yes, below my "promptfooconfig.yaml" file.
prompts: [
prompts/prompt1.txt,
prompts/prompt2.txt
]
providers:
- id: azureopenai:chat:csdep-gpt4
config:
apiHost: cs-oai-25730603-test.openai.azure.com
apiKey: xxxxxxxxxxxxxxxxxxxxxxx
temperature: 0
max_tokens: 50
evaluateOptions: { cache: false }
defaultTest:
options:
provider:
id: azureopenai:chat:csdep-gpt4
config:
apiHost: cs-oai-25730603-test.openai.azure.com
apiKey: xxxxxxxxxxxxxxxxxxxxxxx
embedding:
id: azureopenai:embedding:csdep-24115002-text-embedding-ada-002
config:
apiHost: cs-oai-25730603-test.openai.azure.com
apiKey: yyyyyyyyyyyyyyyyyyy
assert:
- type: javascript
value: Math.max(0, Math.min(1, 1 - (output.length - 100) / 900))
tests: [
tests/test_classifier_grading_bias.yaml,
tests/test_deterministic.yaml,
tests/test_graded_output_based.yaml #Enbeding required
]
and this is the test case i am creating -> test_classifier_grading_bias.yaml
- description: 'Test BIAS assertion'
vars:
query: Which companies were mentioned as leading the growth in the realm of growth clients?
context: file://docs/financial.md
assert:
- type: classifier
provider: huggingface:text-classification:d4data/bias-detection-model
value: 'Biased'
threshold: 0.5 # score for "Biased" must be greater than or equal to this value
I have created the HF_API_TOKEN from https://huggingface.co/settings/tokens
In which part of the config yaml do I need to provide the token?
Thanks a lot for you support!!
Hi @typpo Here you have the error I am getting when i run my test using above configuration:
"test_classifier_grading_bias.yaml"
I have tried several things without success:
Any idea?? Thanls a lot!!
That error message would only happen if the API key is not actually set in the environment. Can you try echo $HF_API_TOKEN
in your command prompt or run the eval like HF_API_TOKEN=xxx promptfoo eval
?
If you want to put the credential in the yaml, you'd have to specify the provider like so:
provider:
id: huggingface:...
config:
apiKey: xxx
Hi @typpo thanks a lot for your reply!! I was able to run the eval using HF_API_TOKEN=xxx promptfoo eval
- Reviewing the output it appears cutoff. Do you know the reason?
- Regarding the score. 0,5 what exactly means?
- Do you know if it works with different lenguajes (multi-lenguaje)?
- What about fairness, is this model analizing bias and faircess at the same time?
Regarding the eval execution... I have tryed to setup my promptfooconfig.yaml in the following way, passing the provider and apikey (HF_API_TOKEN). And it works!!
This is my test:
Hey @FJLopezGarcia,
Reviewing the output it appears cutoff. Do you know the reason?
You've set max_tokens
to 50, which limits the number of output tokens.
Regarding the score. 0,5 what exactly means?
The classifier outputs a score between 0.0 and 1.0 which indicates the level of bias. For details you'd have to look at the paper cited (https://github.com/dreji18/Fairness-in-AI). In general these are just continuous scores and I recommend you determine a threshold empirically by testing it on your own inputs.
Do you know if it works with different lenguajes (multi-lenguaje)?
According to the above link, it's trained on an English dataset.
What about fairness, is this model analizing bias and faircess at the same time?
Again I would refer you to the paper. I suppose "fairness" describes the result of an unbiased output.
Hi @typpo Thanks a lot for your responses!!!
Shoudnt be the threshold behaviour the opposite?
I have setup my threshold = 0.5 and when I run the eval I get that the two first colums has 0.99 and 0.98 Biased and both appears as PASS. shouldnt be a FAIL?
Same for the 3er column, It has a 0.37 and shows a FAIL, shouldnt be a PASS?
Another question:
Instead of running the eval passing the HF_API_TOKEN I would like to add the HF_API_TOKEN in the config file.
I have tryed following your instrucctions here #657 (comment) but it doesnt work.
It works ruinning the eval in this way:
HF_API_TOKEN=hf_YYYYYYYYYYYYYY promptfoo eval
Hi @typpo Any update on above two queries? thanks a lot for all your help!!
please review and answer remining queries. it would be great to be able to have the hf key in config and understand the bias ourput score.
best regards!!
- If you want to invert the threshold behavior, change the assertion type to
not-classifier
instead ofclassifier
- The API token issue should be fixed with #809
Thanks for flagging!
- The API token issue should be fixed with fix: huggingface api key handling #809
Thanks for flagging!
Hi @typpo still not able to successfully run promptfooconfig.yaml with the API token. Could you please provide an example? or should be fixed with next version (i have 0.59.1)?