microsoft / kernel-memory

RAG architecture: index and query any data using LLM and natural language, track sources, show citations, asynchronous memory patterns.

Home Page: https://microsoft.github.io/kernel-memory

Too long prompt

rosieks opened this issue

Context / Scenario

I'm asking a question while connected to PostgreSQL.
The prompt is 25,692 characters long.
CompletionOptions.MaxTokens is 300 (if that matters).
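
For context, the setup is roughly like the following sketch. The connection string, endpoint, API key, and deployment names are placeholders, not values from this issue:

using Microsoft.KernelMemory;

// Serverless Kernel Memory with PostgreSQL as the memory store
// (Microsoft.KernelMemory.Postgres package).
var memory = new KernelMemoryBuilder()
    .WithAzureOpenAITextGeneration(new AzureOpenAIConfig
    {
        Auth = AzureOpenAIConfig.AuthTypes.APIKey,
        APIKey = "<api key>",
        Endpoint = "https://<resource>.openai.azure.com/",
        Deployment = "<chat deployment>",
    })
    .WithAzureOpenAITextEmbeddingGeneration(new AzureOpenAIConfig
    {
        Auth = AzureOpenAIConfig.AuthTypes.APIKey,
        APIKey = "<api key>",
        Endpoint = "https://<resource>.openai.azure.com/",
        Deployment = "<embedding deployment>",
    })
    .WithPostgresMemoryDb("Host=<host>;Port=5432;Database=<db>;Username=<user>;Password=<pwd>")
    .Build<MemoryServerless>();

var answer = await memory.AskAsync("the question");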

What happened?

I got the following error:

Content:
{
  "error": {
    "message": "This model's maximum context length is 4096 tokens. However, your messages resulted in 5818 tokens. Please reduce the length of the messages.",
    "type": "invalid_request_error",
    "param": "messages",
    "code": "context_length_exceeded"
  }
}
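
For scale: the 4,096-token window covers the entire request (the RAG facts plus the question and prompt template), not just the completion, and 25,692 characters at roughly 4-5 characters per token matches the 5,818 tokens the API reports. A prompt's size can be checked before sending; here is a sketch using the Microsoft.ML.Tokenizers package (the package choice is this example's assumption, not what Kernel Memory uses internally):

using Microsoft.ML.Tokenizers;

// Count tokens for a gpt-3.5-turbo-style model before sending the request.
string prompt = "..."; // the ~25,692-character RAG prompt
Tokenizer tokenizer = TiktokenTokenizer.CreateForModel("gpt-3.5-turbo");
int tokens = tokenizer.CountTokens(prompt); // the API reported 5818
if (tokens > 4096)
{
    // Over the model's window: trim retrieved facts, raise minRelevance,
    // or switch to a larger-context deployment.
}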

Importance

edge case

Platform, Language, Versions

.NET/C#/0.26.240116.2

Relevant log output

info: SimpleChat[0]
      Function SimpleChat invoking.
info: Microsoft.SemanticKernel.Connectors.OpenAI.AzureOpenAIChatCompletionService[0]
      Prompt tokens: 616. Completion tokens: 21. Total tokens: 637.
info: Ask[0]
      Function Ask invoking.
fail: Ask[0]
      Function failed. Error: This model's maximum context length is 4096 tokens. However, your messages resulted in 5818 tokens. Please reduce the length of the messages.
      Status: 400 (model_error)
      ErrorCode: context_length_exceeded

      Content:
      {
        "error": {
          "message": "This model's maximum context length is 4096 tokens. However, your messages resulted in 5818 tokens. Please reduce the length of the messages.",
          "type": "invalid_request_error",
          "param": "messages",
          "code": "context_length_exceeded"
        }
      }


      Headers:
      Access-Control-Allow-Origin: REDACTED
      X-Content-Type-Options: REDACTED
      x-ratelimit-remaining-requests: REDACTED
      apim-request-id: REDACTED
      x-ratelimit-remaining-tokens: REDACTED
      X-Request-ID: REDACTED
      ms-azureml-model-error-reason: REDACTED
      ms-azureml-model-error-statuscode: REDACTED
      x-ms-client-request-id: dd6bf4cb-0d37-444c-8925-b67e56e5d070
      x-ms-region: REDACTED
      azureml-model-session: REDACTED
      Strict-Transport-Security: REDACTED
      Date: Fri, 26 Jan 2024 10:15:45 GMT
      Content-Length: 281
      Content-Type: application/json

      Azure.RequestFailedException: This model's maximum context length is 4096 tokens. However, your messages resulted in 5818 tokens. Please reduce the length of the messages.
      Status: 400 (model_error)
      ErrorCode: context_length_exceeded

      Content:
      {
        "error": {
          "message": "This model's maximum context length is 4096 tokens. However, your messages resulted in 5818 tokens. Please reduce the length of the messages.",
          "type": "invalid_request_error",
          "param": "messages",
          "code": "context_length_exceeded"
        }
      }


      Headers:
      Access-Control-Allow-Origin: REDACTED
      X-Content-Type-Options: REDACTED
      x-ratelimit-remaining-requests: REDACTED
      apim-request-id: REDACTED
      x-ratelimit-remaining-tokens: REDACTED
      X-Request-ID: REDACTED
      ms-azureml-model-error-reason: REDACTED
      ms-azureml-model-error-statuscode: REDACTED
      x-ms-client-request-id: dd6bf4cb-0d37-444c-8925-b67e56e5d070
      x-ms-region: REDACTED
      azureml-model-session: REDACTED
      Strict-Transport-Security: REDACTED
      Date: Fri, 26 Jan 2024 10:15:45 GMT
      Content-Length: 281
      Content-Type: application/json

         at Azure.Core.HttpPipelineExtensions.ProcessMessageAsync(HttpPipeline pipeline, HttpMessage message, RequestContext requestContext, CancellationToken cancellationToken)
         at Azure.AI.OpenAI.OpenAIClient.GetChatCompletionsStreamingAsync(ChatCompletionsOptions chatCompletionsOptions, CancellationToken cancellationToken)
         at Microsoft.KernelMemory.AI.AzureOpenAI.AzureOpenAITextGenerator.GenerateTextAsync(String prompt, TextGenerationOptions options, CancellationToken cancellationToken)+MoveNext()
         at Microsoft.KernelMemory.AI.AzureOpenAI.AzureOpenAITextGenerator.GenerateTextAsync(String prompt, TextGenerationOptions options, CancellationToken cancellationToken)+System.Threading.Tasks.Sources.IValueTaskSource<System.Boolean>.GetResult()
         at Microsoft.KernelMemory.Search.SearchClient.AskAsync(String index, String question, ICollection`1 filters, Double minRelevance, CancellationToken cancellationToken)
         at Microsoft.KernelMemory.Search.SearchClient.AskAsync(String index, String question, ICollection`1 filters, Double minRelevance, CancellationToken cancellationToken)
         at Microsoft.KernelMemory.MemoryPlugin.AskAsync(String question, String index, Double minRelevance, ILoggerFactory loggerFactory, CancellationToken cancellationToken)
         at Microsoft.SemanticKernel.KernelFunctionFromMethod.<>c.<<GetReturnValueMarshalerDelegate>b__12_4>d.MoveNext()
      --- End of stack trace from previous location ---
         at Microsoft.SemanticKernel.KernelFunction.InvokeAsync(Kernel kernel, KernelArguments arguments, CancellationToken cancellationToken)
info: Ask[0]
      Function completed. Duration: 103.442043s

Hi @rosieks, I'm not sure how you're using the code. Is it via the service or the embedded serverless memory? In the model configuration, did you set MaxTokenTotal to 4096?
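
In serverless mode the limit goes on the model config passed to the builder; a sketch, with the other properties omitted:

// Tell Kernel Memory the model's context window size, so the RAG prompt
// is sized against it instead of failing server-side with a 400 error.
var memory = new KernelMemoryBuilder()
    .WithAzureOpenAITextGeneration(new AzureOpenAIConfig
    {
        MaxTokenTotal = 4096, // context window of the deployed model
        // ...Auth, APIKey, Endpoint, Deployment as usual
    })
    .Build<MemoryServerless>();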

I'm using serverless mode. I've now set MaxTokenTotal to 4096, so instead of the error I get: "INFO NOT FOUND".

Looks like setting MaxTokenTotal fixed the problem. As for "INFO NOT FOUND": that is returned when the search finds no relevant text to answer the question with.
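
If it helps, that case can be detected programmatically and, for instance, retried with a lower relevance cutoff. A sketch (the minRelevance values are examples, not recommendations):

// MemoryAnswer.NoResult is true when no relevant memories were found
// (the default answer text is "INFO NOT FOUND").
var answer = await memory.AskAsync("the question", minRelevance: 0.7);
if (answer.NoResult)
{
    // Nothing passed the 0.7 relevance cutoff; retry with no cutoff.
    answer = await memory.AskAsync("the question", minRelevance: 0);
}
Console.WriteLine(answer.Result);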