llama_cpp.server -- Input should be [...] and 500 Internal Server Error

Question

billstei opened this issue a month ago

I am using Extended OpenAI Conversation in HA 2024.4.3, and running a llama-cpp-python server 0.2.63 in a Python venv with:

(LCP-Blas) bill@AVS3:~$ python3 -m llama_cpp.server --model LLMs/functionary-7b-v2.1-GGUF/functionary-7b-v2.1.q4_0.gguf --chat_format functionary-v2 --hf_pretrained_model_name_or_path LLMs/functionary-7b-v2.1-GGUF --host 0.0.0.0 --port 3928

The HA Voice Assist pipeline that I configured in HA works fine with this server for basic non-function-call questions like "What is the capital of Florida?" (using the HA text Assist interface only). But when I ask "Turn off entity input_boolean.mytoggle", it does turn off that (helper) entity successfully, and llama_cpp.server then produces the following errors (I added some whitespace formatting below to make them easier to read):

Exception: [
{'type': 'literal_error', 'loc': ('body', 'messages', 2, 'typed-dict', 'role'), 'msg': "Input should be 'system'", 'input': 'assistant', 'ctx': {'expected': "'system'"}},
{'type': 'missing', 'loc': ('body', 'messages', 2, 'typed-dict', 'content'), 'msg': 'Field required', 'input': {'role': 'assistant', 'function_call': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'tool_calls': [{'id': 'call_WK6Ek9a9FsctmoYppvvOJeeD', 'function': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'type': 'function'}]}},

{'type': 'literal_error', 'loc': ('body', 'messages', 2, 'typed-dict', 'role'), 'msg': "Input should be 'user'", 'input': 'assistant', 'ctx': {'expected': "'user'"}},
{'type': 'missing', 'loc': ('body', 'messages', 2, 'typed-dict', 'content'), 'msg': 'Field required', 'input': {'role': 'assistant', 'function_call': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'tool_calls': [{'id': 'call_WK6Ek9a9FsctmoYppvvOJeeD', 'function': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'type': 'function'}]}},
{'type': 'missing', 'loc': ('body', 'messages', 2, 'typed-dict', 'content'), 'msg': 'Field required', 'input': {'role': 'assistant', 'function_call': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'tool_calls': [{'id': 'call_WK6Ek9a9FsctmoYppvvOJeeD', 'function': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'type': 'function'}]}},

{'type': 'literal_error', 'loc': ('body', 'messages', 2, 'typed-dict', 'role'), 'msg': "Input should be 'tool'", 'input': 'assistant', 'ctx': {'expected': "'tool'"}},
{'type': 'missing', 'loc': ('body', 'messages', 2, 'typed-dict', 'content'), 'msg': 'Field required', 'input': {'role': 'assistant', 'function_call': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'tool_calls': [{'id': 'call_WK6Ek9a9FsctmoYppvvOJeeD', 'function': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'type': 'function'}]}},
{'type': 'missing', 'loc': ('body', 'messages', 2, 'typed-dict', 'tool_call_id'), 'msg': 'Field required', 'input': {'role': 'assistant', 'function_call': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'tool_calls': [{'id': 'call_WK6Ek9a9FsctmoYppvvOJeeD', 'function': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'type': 'function'}]}},

{'type': 'literal_error', 'loc': ('body', 'messages', 2, 'typed-dict', 'role'), 'msg': "Input should be 'function'", 'input': 'assistant', 'ctx': {'expected': "'function'"}},
{'type': 'missing', 'loc': ('body', 'messages', 2, 'typed-dict', 'content'), 'msg': 'Field required', 'input': {'role': 'assistant', 'function_call': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'tool_calls': [{'id': 'call_WK6Ek9a9FsctmoYppvvOJeeD', 'function': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'type': 'function'}]}},
{'type': 'missing', 'loc': ('body', 'messages', 2, 'typed-dict', 'name'), 'msg': 'Field required', 'input': {'role': 'assistant', 'function_call': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'tool_calls': [{'id': 'call_WK6Ek9a9FsctmoYppvvOJeeD', 'function': {'arguments': '{"list": [{"domain": "input_boolean", "service": "turn_off", "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'type': 'function'}]}}
]

Traceback (most recent call last):
  File "/home/bill/LCP-Blas/lib/python3.11/site-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
    response = await original_route_handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bill/LCP-Blas/lib/python3.11/site-packages/fastapi/routing.py", line 315, in app
    raise validation_error

fastapi.exceptions.RequestValidationError: [{'type': 'literal_error', 'loc': ('body', 'messages', 2, 'typed-dict', 'role'), 'msg': "Input should be 'system'", 'input': 'assistant', 'ctx': {'expected': "'system'"}}
[... REPEATS INFO ABOVE, REMOVED FOR THIS GITHUB POST ...]
, "service_data": {"entity_id": "input_boolean.mytoggle"}}]}', 'name': 'execute_services'}, 'type': 'function'}]}}]
INFO: 192.168.1.98:34900 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error

This entire error message body repeats 3 times. I assume that is because there were 3 "Llama.generate: prefix-match hit"s in llama_cpp.server for a request that needs a function call.

Let me know if the "from_string grammar:" from the 1st prefix-match hit in the server is useful for debugging, and I can include it here.

Jens Leinenbach · Answer 1 · Wed Apr 24 2024 03:57:58 GMT+0800 (China Standard Time)

This is what ChatGPT 4 says about this error:

The error message you provided indicates several key issues related to how the API expects data to be structured and how it's being provided in the request. Let’s break down the primary issues as identified in the exception messages:

1. **Role Field Mismatch**: The `'role'` in the JSON payload is set to `'assistant'`, while the errors report the expected roles `'system'`, `'user'`, `'tool'`, and `'function'` in turn. Each expected role is listed with its own set of missing fields, so these look like alternative message formats being tried one after another, not a single requirement that `'assistant'` violates.

2. **Missing Content Field**: There is a consistent issue with a missing `'content'` field in parts of your payload. This suggests that the structure of the data being passed is incomplete according to the API's requirements.

3. **Missing Tool Call ID and Name**: The exceptions also point out that the `'tool_call_id'` and `'name'` fields are missing in some parts of your JSON. These fields are likely necessary for the API to correctly process the request.

The overall error stems from the discrepancies between what your request includes (or omits) and what the API expects. The `"POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error` status code further implies that these validation errors are leading to a failure in processing the request.
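To check this breakdown against the log, the entries can be tallied. The sketch below is only an aid for reading the error list: it re-creates the eleven entries from the question in abbreviated form (keeping just `type`, `loc` and `ctx`), and the helper names `literal`, `missing` and `tally` are made up for this illustration.

```python
from collections import Counter

# Common prefix of every 'loc' in the error list from the question.
LOC = ("body", "messages", 2, "typed-dict")


def literal(expected):
    """Abbreviated 'literal_error' entry (hypothetical helper)."""
    return {"type": "literal_error", "loc": LOC + ("role",), "ctx": {"expected": expected}}


def missing(field):
    """Abbreviated 'missing' entry (hypothetical helper)."""
    return {"type": "missing", "loc": LOC + (field,)}


# The eleven entries, in the order the server reported them.
errors = [
    literal("'system'"), missing("content"),
    literal("'user'"), missing("content"),
    missing("content"),
    literal("'tool'"), missing("content"), missing("tool_call_id"),
    literal("'function'"), missing("content"), missing("name"),
]


def tally(errors):
    """Return the roles named in role mismatches and a count of missing fields."""
    rejected_roles = [
        e["ctx"]["expected"].strip("'") for e in errors if e["type"] == "literal_error"
    ]
    missing_fields = Counter(e["loc"][-1] for e in errors if e["type"] == "missing")
    return rejected_roles, missing_fields


rejected_roles, missing_fields = tally(errors)
print(rejected_roles)
print(dict(missing_fields))
```

There are four role mismatches but five missing-`content` entries. That suggests one alternative accepted the `'assistant'` role and failed only because `content` was absent, though this is an inference from the log and not something confirmed in this thread.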

billstei · Answer 2 · Sat Apr 27 2024 11:27:04 GMT+0800 (China Standard Time)

After some testing with OpenAI's example code "Example invoking multiple function calls in one response" from https://platform.openai.com/docs/guides/function-calling, I have concluded that this bug is not in the Extended OpenAI Conversation integration: the example code, running completely independently of HA or this integration, has the same problem (after modifying it to use llama_cpp.server, as above). In that example code, a workaround was to change "role": "tool" (line 70) to "role": "assistant", but that is not a real fix, just an observation for debugging. The problem seems to be in openai/_base_client.py.
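For anyone debugging the same error, here is one more thing that could be tried on the client side, going only by the validation errors in the question (the assistant message is reported as missing `content`). It is an untested sketch, not a confirmed fix. The function name `ensure_content`, the empty-string default and the system prompt text are assumptions made for illustration, and whether the server accepts an empty string here has not been checked against llama-cpp-python 0.2.63.

```python
import copy


def ensure_content(messages, default=""):
    """Return a copy of `messages` in which every assistant message has a 'content' key.

    `default` is an assumption: the thread does not say which value the
    server would accept for an assistant message that only carries tool calls.
    """
    fixed = copy.deepcopy(messages)
    for message in fixed:
        if message.get("role") == "assistant" and "content" not in message:
            message["content"] = default
    return fixed


# Conversation shaped like the failing request. The assistant entry is
# abbreviated from the error log; the system prompt is a placeholder.
messages = [
    {"role": "system", "content": "(placeholder system prompt)"},
    {"role": "user", "content": "Turn off entity input_boolean.mytoggle"},
    {
        "role": "assistant",
        "tool_calls": [
            {
                "id": "call_WK6Ek9a9FsctmoYppvvOJeeD",
                "type": "function",
                "function": {
                    "name": "execute_services",
                    "arguments": '{"list": [{"domain": "input_boolean", "service": "turn_off", '
                                 '"service_data": {"entity_id": "input_boolean.mytoggle"}}]}',
                },
            }
        ],
    },
]

fixed = ensure_content(messages)
print(fixed[2]["content"] == "")
```

The copy keeps the original list untouched, so the two payloads can be compared when checking whether the 500 response goes away.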

Closing this issue #203