connectrpc / connect-es

The TypeScript implementation of Connect: Protobuf RPC that works.

Home Page: https://connectrpc.com/

Timeout when reading from stream in Firefox 119

Dartoxian opened this issue

Describe the bug

When reading from a stream for over a minute in Firefox 119 (snap) on Ubuntu 22, I receive the following error:

[unknown] Error in input stream
ConnectError@http://localhost:3000/static/js/bundle.js:151541:5
from@http://localhost:3000/static/js/bundle.js:151574:14
abort@http://localhost:3000/static/js/bundle.js:154743:75

In Chrome on the same machine, I do not receive this error and the stream is read correctly.

To Reproduce

Read from a stream (for over a minute) in Firefox 119 on Ubuntu 22.

Environment:

  • @connectrpc/connect-web version: 1.1.3 (and 0.13.0)
  • Frontend framework and version: React 18, TypeScript 4.4.2
  • Browser and version: Firefox 119

Hey @Dartoxian, I just modified the example slightly:

diff --git a/packages/example/src/server.ts b/packages/example/src/server.ts
index ada61ca..9aacea1 100644
--- a/packages/example/src/server.ts
+++ b/packages/example/src/server.ts
@@ -35,14 +35,12 @@ function routes(router: ConnectRouter) {
     },
     async *introduce(req: IntroduceRequest) {
       yield { sentence: `Hi ${req.name}, I'm eliza` };
-      await delay(250);
-      yield {
-        sentence: `Before we begin, ${req.name}, let me tell you something about myself.`,
-      };
-      await delay(150);
-      yield { sentence: `I'm a Rogerian psychotherapist.` };
-      await delay(150);
-      yield { sentence: `How are you feeling today?` };
+      for (let i = 0; ; i++) {
+          await delay(1000);
+          yield {
+              sentence: `message ${i}`,
+          };
+      }
     },
     async *converse(reqs: AsyncIterable<ConverseRequest>) {
       for await (const req of reqs) {

After starting the server with npm start, visiting https://localhost:8443 in the browser will run the server-streaming RPC, sending a message every second.

This worked in Chrome 118 and FF 119 for me (I'm seeing "message 470" here). I'm not on ubuntu.

Can you try to reproduce the issue with this setup?

The most likely reason for a timeout is that the browser, the server, or something in between has decided to close an apparently idle connection. This can happen if you don't send any messages for a longer period.
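If an idle connection really is the cause (this thread does not confirm it), one common workaround is to have the server emit a small heartbeat message whenever the stream has been quiet for a while. The sketch below is not part of connect-es: `withHeartbeat`, `intervalMs`, and `makeHeartbeat` are hypothetical names, and the client would have to recognize and ignore the heartbeat messages.

```typescript
// Wraps any async iterable and emits a heartbeat value whenever the source
// has produced nothing for `intervalMs`. All names here are illustrative.
async function* withHeartbeat<T>(
  source: AsyncIterable<T>,
  intervalMs: number,
  makeHeartbeat: () => T,
): AsyncGenerator<T> {
  const it = source[Symbol.asyncIterator]();
  // Keep one outstanding read; it is reused across heartbeats until it settles.
  let pending = it.next();
  try {
    for (;;) {
      let cancel = () => {};
      const tick = new Promise<"tick">((resolve) => {
        const id = setTimeout(() => resolve("tick"), intervalMs);
        cancel = () => clearTimeout(id);
      });
      const winner = await Promise.race([pending, tick]);
      cancel(); // stop the timer so it never outlives this iteration
      if (winner === "tick") {
        // The source was idle for a full interval: send a keepalive.
        yield makeHeartbeat();
        continue;
      }
      if (winner.done) return; // source finished normally
      yield winner.value;
      pending = it.next();
    }
  } finally {
    // Best-effort cleanup of the source if the consumer stops early.
    it.return?.().catch(() => {});
  }
}
```

A server-streaming handler could wrap its own generator with this helper before yielding to the client, for example `withHeartbeat(messages, 15000, () => ({ sentence: "" }))`, where the 15 second interval and the empty message are assumptions to adjust for the actual setup.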

@timostamm Thanks for sharing that. I've checked out that example on my side and confirmed that it runs successfully in FF for several minutes. I've tried upping the delay to 60s and still receive messages successfully.

Given that I do not experience this problem with Chrome in our application, I think the issue must be somewhere in my Firefox. The problem persists when I run Firefox with no extensions. Do you have any suggestions for how I could debug this further?

Hey @Dartoxian, when it comes to browsers, connection management is up to them and we can't influence it. Each browser can have its own defaults for things like the idle timeout. One way to debug this is to tweak those settings in your browser. That can help us confirm the issue, but we may not be able to change the behavior. It also depends on how the data is being streamed: for example, how many messages you are sending within the 60 min frame, and at what interval.

The best way to solve such an issue is to build a retry mechanism. The mechanism can differ between APIs: if you are streaming a large dataset, the API could accept a resume/range token; if the stream delivers ad hoc notifications, a simple retry may be enough.

Let us know if you need any help in implementing the retry.
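As a rough illustration of the retry idea, here is a minimal sketch that wraps any async iterable, reopens it after a failure, and passes the last item seen back to the caller so it can resume. It is independent of connect-es: `withRetry`, `StreamFactory`, and `RetryOptions` are hypothetical names, and using the last item as the resume token is an assumption that only fits APIs able to resume from it.

```typescript
// Hypothetical shape: opens a stream, optionally resuming after the last
// item the consumer has already received.
type StreamFactory<T> = (resumeAfter: T | undefined) => AsyncIterable<T>;

interface RetryOptions {
  maxRetries: number; // consecutive failures tolerated before giving up
  baseDelayMs: number; // first backoff delay; doubles per consecutive failure
}

async function* withRetry<T>(
  open: StreamFactory<T>,
  opts: RetryOptions,
): AsyncGenerator<T> {
  let last: T | undefined = undefined;
  let failures = 0;
  for (;;) {
    try {
      for await (const item of open(last)) {
        last = item;
        failures = 0; // any progress resets the failure counter
        yield item;
      }
      return; // the stream ended normally
    } catch (err) {
      failures++;
      if (failures > opts.maxRetries) throw err;
      // Exponential backoff before reopening the stream.
      const delayMs = opts.baseDelayMs * 2 ** (failures - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

With a generated client, `open` would typically call the server-streaming method and put the resume token into the request. For plain notification streams the token can be ignored, which reduces this to the simple retry mentioned above. Retrying on every error is a simplification; a real application would likely inspect the error first and give up on failures that cannot succeed on retry.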

Hey @srikrsna-buf, thanks for the detail. I totally understand that browser connection management is out of your hands. How can I confirm that the issue is definitely down to a timeout? I thought that if it was due to a timeout, the ConnectError that I received would indicate that, rather than declaring unknown.

I agree a retry/resume behaviour would be good for the application I am working on, and something like that is in the works I believe.

I thought that if it was due to a timeout the ConnectError that I received would indicate that, rather than declaring unknown.

Because there is no standard error indicating that the browser closed the connection, we can't recognize the error as a timeout error, so we return an unknown error instead.

How can I confirm that the issue is definitely down to a timeout?

The stack trace you shared is from the bundled code. Is there any way you can use source maps to give us the exact stack trace, showing the file and line in the connect package?