promptfoo / promptfoo

Test your prompts, agents, and RAGs. Redteaming, pentesting, vulnerability scanning for LLMs. Improve your app's quality and catch problems. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

Home Page: https://www.promptfoo.dev/


Python assertions not working with external.py

romaintoub opened this issue

Hello!
Very sorry to hammer you with these issues. The config with Vertex isn't really working because of... VertexAI, so I'd rather use my own custom grader for now.

Initially I was using llm-rubric with a custom prompt, which worked pretty well, but now I want to reuse this prompt in a Python assertion, something like a defaultTest I can define as a provider, e.g.:

def main(args):
    llm = VertexAI()
    prompt = "you are an evaluator..."
    chain = prompt | llm
    res = chain.invoke({"question": args[0], "target": args[1]})
    # res should be something like: {"pass": True, "score": 0.9, "reason": "blabla"}
    processed_res = process(res)  # make sure to return a dictionary with pass, score, and reason
    return processed_res
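
For reference, I'm pointing the assertion at this script from the config along these lines (just a sketch based on the docs; external.py is the file name from this issue's title):

defaultTest:
  assert:
    - type: python
      value: file://external.py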

I tried the function you provided in the docs, e.g.:

import json
import sys

def main():
    if len(sys.argv) >= 3:
        output = sys.argv[1]
        context = json.loads(sys.argv[2])
    else:
        raise ValueError("Model output and context are expected from promptfoo.")
    success = {'pass': True, 'score': 1.0, 'reason': 'this is btf'}
    return success

print(main())

and got this error:
[error screenshot]

Do you know where this comes from? Is there a workaround here to test the first function?

No need to apologize, sorry you've had trouble getting things up and running.

For the custom Python assert, try wrapping it in print(json.dumps(main())). It's on my to-do list to convert this to a nicer API, such as implementing a function call_assert that can return a native Python object.
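
Applied to the docs snippet above, that looks like this (the fix is just the last line: promptfoo parses the script's stdout as JSON, and print(main()) emits a Python dict repr with single quotes and True, which isn't valid JSON):

import json
import sys

def main():
    if len(sys.argv) >= 3:
        output = sys.argv[1]
        context = json.loads(sys.argv[2])
    else:
        raise ValueError("Model output and context are expected from promptfoo.")
    return {'pass': True, 'score': 1.0, 'reason': 'this is btf'}

# json.dumps emits valid JSON (double quotes, lowercase true) on stdout,
# which is what promptfoo parses.
print(json.dumps(main()))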

Side note, is 0.49.2 still causing problems for you with Vertex AI?

Yes, I am still getting the same weird results from llm-rubric, but I suspect the VertexAI models, and more specifically gemini-pro, are the reason I have issues. This is why I'd like to bring my own model and run the pre- and post-processing to make sure I get what I want.

Thanks for the feedback, will try the json.dumps().

@typpo I'm facing the same issue when using a custom Python assertion for my test cases as described in the docs: https://www.promptfoo.dev/docs/configuration/expected-outputs/python

I have confirmed using debug statements that the JSON formed is correct, but I still get the same error as @romaintoub.
I am on the latest version of promptfoo, i.e. v0.50.0, and am using json.dumps() to dump the dictionary.

Attaching the screenshot.

[Screenshot 2024-04-02 at 1:12:29 PM]

@reallyinvincible Hard to tell without more debug info. It looks like what's happening is that your assertion is outputting something other than JSON. Perhaps there are some other print statements/stdout elsewhere in the code?
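
In the meantime, one general Python pattern (not promptfoo-specific) is to route any debug output to stderr, so stdout stays pure JSON:

import json
import sys

def main():
    result = {'pass': True, 'score': 1.0, 'reason': 'ok'}
    # Debug output goes to stderr; stdout is reserved for the JSON payload.
    print(f"debug: computed result {result}", file=sys.stderr)
    return result

print(json.dumps(main()))  # stdout now contains only valid JSON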

Two next steps for me:

  1. This PR will improve debugging issues like this in the future: #620
  2. I am going to switch python assertions to a native integration (similar to python providers) so that you can just return the value and not worry about print statements and stdout (see the sketch below).
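
A rough sketch of what that could look like, modeled on the python provider convention (the function name get_assert is an assumption here, not a confirmed API):

# my_assert.py
def get_assert(output, context):
    # Return the grading result as a native Python object; no
    # print()/json.dumps() round-trip through stdout needed.
    passed = 'expected phrase' in output
    return {
        'pass': passed,
        'score': 1.0 if passed else 0.0,
        'reason': 'checked for expected phrase in model output',
    }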

@typpo That really sounds like a more elegant solution. Is it possible to get an ETA on the next release?

@reallyinvincible PR here #638, so likely today or tomorrow

The change is released in 0.51.0.