jina-ai / dev-gpt

Your Virtual Development Team

bug: OpenAI API Call exceeds maximum token context length for Model

MR-Alex42 opened this issue

I'm running gptdeploy with GPT-3.5, so my maximum context length is limited to 4097 tokens.
The API call currently does not take model-dependent context length limits into account. I suggest adding error handling for this so that gptdeploy continues to run. Counting tokens in advance (see https://platform.openai.com/tokenizer) and splitting or shortening the prompt messages could be possible workarounds for this issue.
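
For illustration, a minimal sketch of the pre-check I have in mind, based on the tiktoken package; the helper name, the per-message overhead of roughly 4 tokens, and the reserved completion budget are my own assumptions, not anything gptdeploy currently does:

```python
# Sketch only: estimate the token count of a chat prompt before sending it,
# so a request that would exceed the model's context window can be handled
# gracefully instead of raising InvalidRequestError.
import tiktoken

MAX_CONTEXT_TOKENS = 4097   # gpt-3.5-turbo context window
RESPONSE_BUDGET = 1024      # tokens reserved for the completion (assumed value)

def count_message_tokens(messages, model="gpt-3.5-turbo"):
    """Approximate token count for a list of {'role': ..., 'content': ...} dicts."""
    encoding = tiktoken.encoding_for_model(model)
    total = 0
    for message in messages:
        total += 4  # rough per-message overhead for role and separators
        total += len(encoding.encode(message["content"]))
    return total + 2  # rough priming overhead for the assistant's reply

def fits_context(messages, model="gpt-3.5-turbo"):
    """True if the prompt plus the reserved completion budget fits the window."""
    return count_message_tokens(messages, model) + RESPONSE_BUDGET <= MAX_CONTEXT_TOKENS
```

gptdeploy could check fits_context() before each request and shorten or split the conversation when it returns False, instead of hitting the error below.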

Traceback (most recent call last):
File "D:\Programme\gptdeploy\gptdeploy.py", line 5, in
main()
File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1130, in call
return self.main(*args, **kwargs)
File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1055, in main
rv = self.invoke(ctx)
File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "D:\Programme\Python310\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "D:\Programme\Python310\lib\site-packages\click\core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "D:\Programme\gptdeploy\src\cli.py", line 38, in wrapper
return func(*args, **kwargs)
File "D:\Programme\gptdeploy\src\cli.py", line 74, in generate
generator.generate(path)
File "D:\Programme\gptdeploy\src\options\generate\generator.py", line 289, in generate
self.generate_microservice(microservice_path, microservice_name, packages, num_approach)
File "D:\Programme\gptdeploy\src\options\generate\generator.py", line 102, in generate_microservice
test_microservice_content = self.generate_and_persist_file(
File "D:\Programme\gptdeploy\src\options\generate\generator.py", line 71, in generate_and_persist_file
content_raw = conversation.chat(f'You must add the content for {file_name}.')
File "D:\Programme\gptdeploy\src\apis\gpt.py", line 121, in chat
response = self._chat([self.system_message] + self.messages)
File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\base.py", line 128, in call
return self._generate(messages, stop=stop).generations[0].message
File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\openai.py", line 252, in _generate
for stream_resp in self.completion_with_retry(
File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\openai.py", line 228, in completion_with_retry
return _completion_with_retry(**kwargs)
File "D:\Programme\Python310\lib\site-packages\tenacity_init
.py", line 289, in wrapped_f
return self(f, *args, **kw)
File "D:\Programme\Python310\lib\site-packages\tenacity_init
.py", line 379, in call
do = self.iter(retry_state=retry_state)
File "D:\Programme\Python310\lib\site-packages\tenacity_init
.py", line 314, in iter
return fut.result()
File "D:\Programme\Python310\lib\concurrent\futures_base.py", line 451, in result
return self.__get_result()
File "D:\Programme\Python310\lib\concurrent\futures_base.py", line 403, in __get_result
raise self._exception
File "D:\Programme\Python310\lib\site-packages\tenacity_init
.py", line 382, in call
result = fn(*args, **kwargs)
File "D:\Programme\Python310\lib\site-packages\langchain\chat_models\openai.py", line 226, in _completion_with_retry
return self.client.create(**kwargs)
File "D:\Programme\Python310\lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
File "D:\Programme\Python310\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
File "D:\Programme\Python310\lib\site-packages\openai\api_requestor.py", line 226, in request
resp, got_stream = self._interpret_response(result, stream)
File "D:\Programme\Python310\lib\site-packages\openai\api_requestor.py", line 620, in _interpret_response
self._interpret_response_line(
File "D:\Programme\Python310\lib\site-packages\openai\api_requestor.py", line 683, in _interpret_response_line
raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4118 tokens. Please reduce the length of the messages.
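
For reference, this is roughly the kind of guard I mean around the API call (a hypothetical wrapper, not gptdeploy code; it only catches the context-length case and re-raises everything else):

```python
# Hypothetical guard: catch the context-length error from the OpenAI client
# (openai<1.0 API) and let the caller shorten the prompt instead of crashing.
import openai

def safe_chat_completion(**kwargs):
    try:
        return openai.ChatCompletion.create(**kwargs)
    except openai.error.InvalidRequestError as e:
        if "maximum context length" in str(e):
            return None  # signal the caller to shorten the messages and retry
        raise
```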

Thanks for reporting. I was mostly using GPT-4.
I wonder how we could solve this.
Even if we determine the token count, what should gptdeploy do when it is too high?
I guess we need to make the prompts shorter in general.
I will think about something.
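
One possible direction, just a sketch that reuses the hypothetical count_message_tokens helper from above: keep the system message and drop the oldest turns until the conversation fits the window again.

```python
# Sketch of a trimming strategy: keep the system message, drop the oldest
# non-system messages until the prompt fits the model's context window.
# Limits and helper names are assumptions, not gptdeploy's current code.
def trim_to_context(system_message, messages, model="gpt-3.5-turbo",
                    limit=4097, reserve=1024):
    trimmed = list(messages)
    while trimmed and count_message_tokens([system_message] + trimmed, model) + reserve > limit:
        trimmed.pop(0)  # discard the oldest turn first
    return [system_message] + trimmed
```

Summarizing the dropped turns instead of discarding them would be a gentler variant, but either way the request would stay under the limit.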

This should become less of an issue thanks to #70.
Will close the ticket for now.