microsoft / sample-app-aoai-chatGPT

Sample code for a simple web chat experience through Azure OpenAI, including Azure OpenAI On Your Data.

Stuck in "Generating response..."

marioSiller2712 opened this issue · comments

Describe the bug
Hello,

Currently, we are using the repository to access a predefined index from Azure AI Search. However, the chatbot occasionally gets stuck in the "Generating response..." state. Unfortunately, the logs couldn't provide any further information, such as an indication of a timeout. The error only occurs in the Azure Web App, not locally on my own client.

Has anyone experienced something similar and been able to resolve it? Thank you very much and best regards!

Expected behavior
A response in a relatively short time or at least an error message indicating what happened.

Configuration: Please provide the following

  • Azure OpenAI model name and version: gpt-35-turbo-16k, version 0613
  • Is chat history enabled? No
  • Are you using data? If so, what data source? (e.g. Azure AI Search, Azure CosmosDB Mongo vCore, etc): Azure AI Search Index (Vector)

I am having the same issue, except with gpt-35-turbo. The "chat" playground works fine, but using the "assistants" playground in the Azure Web App will also get stuck on "generating..." until it times out.


I suspect the problem is the streaming option in the Application settings (AZURE_OPENAI_STREAM). We changed it to "False", and now everything works fine.
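One thing worth double-checking when toggling this setting: App Service settings arrive in the app as environment variable strings, so "False" is truthy in Python unless it is parsed explicitly. A minimal sketch of how such a flag can be read safely (the parsing helper here is illustrative, not necessarily how this repo's backend does it):

```python
import os

def str_to_bool(value: str) -> bool:
    """Interpret common truthy strings from App Service application settings."""
    return value.strip().lower() in ("true", "1", "yes")

# App Service exposes application settings as environment variables,
# which are always strings: bool("False") would be True, so parse it.
SHOULD_STREAM = str_to_bool(os.environ.get("AZURE_OPENAI_STREAM", "True"))
```

If the backend ever checks the raw string for truthiness instead of parsing it, setting the variable to "False" would have no effect, which can make this kind of toggle appear not to work.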

Having the same issue with two apps deployed this week. This was not an issue with a previous deployment from a couple of weeks ago. Even though the streaming environment variables are set to True, the response does not stream; it only shows "Generating response..." until the full response appears all at once. This is happening for both GPT-3.5 and GPT-4.

this happened to me also


We experienced the same issue: the response is not actually streaming and basically just shows the final result at the end.
Has anyone successfully gotten the result to display as an actual stream?

My observations:
The streaming variable in the .env file applies when you are using an OpenAI endpoint, but if you are using a promptflow endpoint you do not need to enable the stream variable; promptflow can return the stream itself.
The existing code does not have logic to handle streaming from promptflow, so we had to write our own.
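For anyone attempting the same, a streamed promptflow response is typically delivered as server-sent-event style lines. Below is a minimal sketch of parsing such a stream; the `answer` field and the `data:` framing are assumptions about the payload shape, so adjust them to what your endpoint actually emits:

```python
import json
from typing import Iterable, Iterator

def iter_promptflow_chunks(lines: Iterable[str]) -> Iterator[str]:
    """Yield partial-answer text from SSE-style 'data: {...}' lines.

    Assumed (hypothetical) event payload per chunk: {"answer": "<delta>"}.
    The real promptflow endpoint's schema may differ.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel, as used by OpenAI-style APIs
        event = json.loads(payload)
        if "answer" in event:
            yield event["answer"]
```

In the web app this generator could be fed from the HTTP response's line iterator (e.g. `response.iter_lines()` in `requests`), with each yielded delta forwarded to the frontend as it arrives instead of buffering the full answer.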