timoklimmer / powerproxy-aoai

Monitors and processes traffic to and from Azure OpenAI endpoints.

Error message when calling deployed powerproxy

sterankin opened this issue · comments

I will preface this by saying it's likely an issue with our Azure settings and infrastructure, but I am having trouble locating the problem and knowing where to look for clues.

I pulled the latest version, deployed it using the PowerShell script, and made a request to the endpoint a few times. I received the following:

"message": "Could not find any endpoint or deployment with remaining capacity. Try again later."

This is somewhat misleading when I look at the code, as I don't think it's a genuine 429 (the OpenAI instance has plenty of capacity for the model and is not under heavy use).

powerproxy.py:

    # raise 429 if we could not find any suitable endpoint
    if aoai_response is None:
        raise ImmediateResponseException(
            Response(
                content=json.dumps(
                    {"message": "Could not find any endpoint or deployment with remaining capacity. Try again later."}
                ),
                media_type="application/json",
                status_code=status.HTTP_429_TOO_MANY_REQUESTS,
            )
        )

So does this mean it will throw a 429 on a null or empty response?

However, the reason I think it's an issue with our subscription or RG is that the exact same code and config YAML work locally without issue. It's only when it's deployed to Azure that we get the message. The endpoints and API keys are identical, and I am calling it in Postman the same way.

I tried looking in the container app logs (ContainerAppConsoleLogs_CL and ContainerAppSystemLogs_CL) and metrics, but I can't see any errors.

Any tips or ideas for troubleshooting this one? Could there be a private endpoint on our OpenAI instance which would prevent this, or some sort of IAM permission needed?

Hi @sterankin, the "artificial" 429 you mention is returned when no suitable target was found. I am intentionally sending a 429 to signal to clients that they should retry. Not optimal, but I haven't found a better way yet.
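
Since the 429 is meant as a retry signal, a client could handle it along these lines. This is only a minimal sketch: `call_with_retry`, the `send` callable, and the backoff values are hypothetical and not part of PowerProxy.

```python
import time
from typing import Callable, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable that performs one request against the
    proxy and returns (status_code, body).
    """
    status_code, body = send()
    for attempt in range(1, max_attempts):
        if status_code != 429:
            break
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
        status_code, body = send()
    return status_code, body
```

Note that retrying only helps if the 429 is transient. If the configuration can never produce a target for a given request, every attempt will fail the same way.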

One idea I have is that the last endpoint/standin has a non_streaming_fraction other than 1 configured (1 is the default). In that case, a request might by chance end up with no target, leading to the 429. I will check if I can modify the config validation to reject such non_streaming_fraction configs.
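
To illustrate why a fraction below 1 on the last entry can leave a request without a target, here is a small simulation. It assumes, as a simplification and not as a description of PowerProxy's actual selection code, that targets are tried in order and each one accepts a non-streaming request with probability equal to its non_streaming_fraction. The names `pick_target` and `miss_rate` are hypothetical.

```python
import random
from typing import Optional, Sequence


def pick_target(fractions: Sequence[float], rng: random.Random) -> Optional[int]:
    """Return the index of the first target that accepts, or None.

    Assumed model: each target accepts with probability equal to its
    non_streaming_fraction; targets are tried in list order.
    """
    for index, fraction in enumerate(fractions):
        if rng.random() < fraction:
            return index
    return None


def miss_rate(fractions: Sequence[float], trials: int = 10_000, seed: int = 0) -> float:
    """Estimate how often no target is found (which would surface as a 429)."""
    rng = random.Random(seed)
    misses = sum(pick_target(fractions, rng) is None for _ in range(trials))
    return misses / trials
```

Under this model, `miss_rate([0.5, 1.0])` is 0 because the last entry always accepts, while `miss_rate([0.5, 0.5])` comes out at roughly a quarter of all requests.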

If non_streaming_fraction is not the problem: I have added some more console output for troubleshooting purposes into main. Maybe that will help.

Before that, you could also compare the outputs of PowerProxy when it starts (local vs. Azure) to ensure it's actually the same.

Added an additional config check to ensure that the last endpoint or standin in the list does not have a non_streaming_fraction other than 1, if defined.
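
A check of this kind could look roughly like the sketch below. It is not the code that was added to PowerProxy: the function name and the assumption that endpoints or standins are a list of dicts with an optional `non_streaming_fraction` key are made up for illustration.

```python
from typing import Any, Dict, List


def validate_last_fraction(items: List[Dict[str, Any]], kind: str = "endpoint") -> None:
    """Raise ValueError if the last item has a non_streaming_fraction other than 1.

    items is a hypothetical list of endpoint or standin config dicts.
    A missing key is treated as the default value 1.
    """
    if not items:
        return
    fraction = items[-1].get("non_streaming_fraction", 1)
    if fraction != 1:
        raise ValueError(
            f"The last {kind} must have non_streaming_fraction 1 "
            f"(or omit it), but found {fraction}."
        )
```

Running such a check at startup turns a sporadic runtime 429 into an immediate, explicit configuration error.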

Issue should be solved. If not, please let me know.