TOOLS mode fails on retries
samgregson opened this issue
- This is actually a bug report.
- I am not getting good LLM results.
- I have tried asking for help in the community on Discord or in Discussions and have not received a response.
- I have tried searching the documentation and have not found an answer.
What Model are you using?
- gpt-3.5-turbo
- gpt-4-turbo
- gpt-4
- Other (please specify)
Describe the bug
Retries in the default TOOLS mode return null.
To Reproduce
I have used an example from the docs but updated it to Pydantic v2:
import instructor
from openai import OpenAI
from pydantic import BaseModel, field_validator

client = instructor.from_openai(OpenAI(), mode=instructor.Mode.TOOLS)

class UserDetail(BaseModel):
    name: str
    age: int

    @field_validator("name")
    @classmethod
    def name_must_be_uppercase(cls, name: str) -> str:
        if name.islower():
            raise ValueError("name must be uppercase")
        return name

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserDetail,
    messages=[
        {"role": "user", "content": "Extract `jason is 12`"},
    ],
    max_retries=2,
)
print(response.model_dump_json(indent=2))
Instead, on the retry OpenAI returns null as the tool-call arguments, and this raises an "Invalid JSON" error.
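(A quick stdlib check of what a literal null payload parses to; the exact error message instructor surfaces may differ:)

```python
import json

# per the trace, the retry turn puts the literal string "null" in the
# tool-call arguments; it parses as JSON, but to None, not to an object
# the response model can be built from
arguments = "null"
print(json.loads(arguments))  # None
```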
Expected behavior
When using JSON mode, things work as expected and I get
{
"name": "JASON",
"age": 12
}
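For reference, the casing check the validator applies can be exercised on its own (plain-Python sketch of the same logic, no API call needed):

```python
def name_must_be_uppercase(name):
    # same check the pydantic field_validator performs
    if name.islower():
        raise ValueError("name must be uppercase")
    return name

print(name_must_be_uppercase("JASON"))  # JASON
# name_must_be_uppercase("jason") would raise ValueError
```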
Screenshots
Taken from LangSmith tracing.
Interesting, I wonder if GPT-3.5 is regressing.
can you try
@field_validator("name")
@classmethod
def name_must_be_uppercase(cls, name: str) -> str:
    if name.islower():
        raise ValueError("name must be uppercase, please correct this")
    return name
i.e., just making the error message a little more explicit about how to fix the response.
Nope, sorry, I tried it with a bunch of prompts and a bunch of models, even this:
class UserDetail(BaseModel):
    name: str
    age: int

    @field_validator("name")
    @classmethod
    def name_must_be_uppercase(cls, name: str) -> str:
        if name.islower():
            raise ValueError(
                "name must be UPPERCASE. Use the tool again but modify "
                "the name argument to 'JASON'."
            )
        return name

response = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    response_model=UserDetail,
    temperature=0.01,
    messages=[
        {"role": "user", "content": "Extract `jason is 12`"},
    ],
    max_retries=2,
)
I might try making the instruction come from a "user" rather than a "tool" role, while still using TOOLS mode.
What are the downsides to JSON mode, btw, if it works better for me?
if it works it works!
ok maybe there's something deeper going on right now
OK, a small experiment suggests that passing the validation error back manually in a "user" role works better, while still using TOOLS mode.
# note: `await` implies an async client, e.g.
# client = instructor.from_openai(AsyncOpenAI(), mode=instructor.Mode.TOOLS)
response = None
max_retries = 2
for i in range(max_retries):
    try:
        print(messages)
        response = await client.chat.completions.create(
            model=model,
            temperature=temperature,
            messages=messages.copy(),
            response_model=response_model,
            max_retries=0,
        )
        break
    except InstructorRetryException as e:
        completion: ChatCompletion = e.last_completion
        if client.mode == instructor.Mode.TOOLS:
            response_json = completion.choices[0].message.tool_calls[0].function.arguments
        else:
            response_json = completion.choices[0].message.content
        # feed the failed output back as an assistant turn ...
        messages.append({
            "role": "assistant",
            "content": response_json,
        })
        # ... and the validation error as a plain user turn
        messages.append({
            "role": "user",
            "content": str(e),
        })
if response is None:
    response = response_model.model_validate_json(response_json)
return response
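The assistant/user append pattern in the loop above can be exercised without an API call (helper name is hypothetical):

```python
def build_retry_messages(messages, failed_arguments, error_text):
    """Hypothetical helper mirroring the loop above: append the failed
    tool output as an assistant turn and the validation error as a
    user turn."""
    retry = list(messages)  # copy so the original list is untouched
    retry.append({"role": "assistant", "content": failed_arguments})
    retry.append({"role": "user", "content": error_text})
    return retry

msgs = [{"role": "user", "content": "Extract `jason is 12`"}]
retry = build_retry_messages(
    msgs, '{"name": "jason", "age": 12}', "name must be uppercase"
)
print([m["role"] for m in retry])  # ['user', 'assistant', 'user']
```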
if you have a moment, can you check whether our message in retry.py uses the user or assistant role? I'm surprised; maybe we need to improve the prompts.
For mode.TOOLS you use the tool role to pass back the validation error. This is the obvious choice, I think.
I wonder if the models are fine-tuned to just summarise the function output rather than expecting errors.
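For contrast, the two retry shapes discussed in this thread look roughly like this (sketch only; `call_abc123` is a made-up tool-call id, and the exact fields instructor emits may differ):

```python
def tool_role_retry(tool_call_id, error_text):
    # tool-role turn: the error is attached to the failed call via
    # tool_call_id, so it reads like function output to the model
    return {"role": "tool", "tool_call_id": tool_call_id, "content": error_text}

def user_role_retry(error_text):
    # the alternative tried above: surface the error as a fresh user turn
    return {"role": "user", "content": error_text}

tool_turn = tool_role_retry("call_abc123", "name must be uppercase")
user_turn = user_role_retry("name must be uppercase")
```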