MatrixAI / js-ws

Websocket Transport Library for TypeScript/JavaScript Applications

Home Page: https://polykey.com

WebSocket Behaviour Alignment with Transport Layer

tegefaulkes opened this issue

Specification

This issue is an epic for tracking the pending work on the WebSockets code.

Specifically, an extraction of the behaviour into js-ws so that it behaves similarly to js-quic.

Additional context

Tasks

  • 1. Test the backpressure from both the client and the server
  • 2. Ensure WebSocketStream exists and that we can mux and demux streams on top of a single WebSocketConnection
  • 3. Make WebSocketConnection equivalent to QUICConnection, but mapped 1-to-1 to a WebSocket
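The mux/demux in task 2 amounts to tagging each message with a stream ID so both sides can route payloads to the right WebSocketStream. A minimal sketch, assuming a hypothetical 4-byte big-endian stream-ID header (this layout is illustrative only, not the actual js-ws wire format):

```typescript
// Hypothetical framing for muxing streams over one WebSocketConnection.
// Each frame carries a 4-byte big-endian stream ID followed by the payload.
// The header layout is an assumption for illustration, not js-ws's format.

function encodeFrame(streamId: number, payload: Uint8Array): Uint8Array {
  const frame = new Uint8Array(4 + payload.length);
  new DataView(frame.buffer).setUint32(0, streamId);
  frame.set(payload, 4);
  return frame;
}

function decodeFrame(frame: Uint8Array): { streamId: number; payload: Uint8Array } {
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  return {
    streamId: view.getUint32(0),
    payload: frame.subarray(4),
  };
}
```

The demuxer would then keep a map from stream ID to WebSocketStream and enqueue each decoded payload into the matching readable.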

@tegefaulkes I think you're also missing the backpressure problem. That should also be done here.

Relevant commentary MatrixAI/Polykey#525 (comment) on websocket backpressure.

Backpressure needs to be properly applied to the readable streams. I may need to make a new issue for this. This needs to be further specced out.

https://chat.openai.com/share/8cb1acc0-88e9-4846-89f3-2bd3840dba6f

Yes, you're correct. The process to establish a WebSocket connection starts with an HTTP request, specifically an HTTP GET request that includes an "Upgrade: websocket" header. This is often referred to as the WebSocket handshake request.

The client sends this initial HTTP request to the server, requesting to upgrade the connection to a WebSocket. The request might look something like this:

GET /mychat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Origin: http://example.com

If the server supports the WebSocket protocol, it agrees to the upgrade and replies with an HTTP 101 status code (Switching Protocols), something like this:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat

After this handshake process, the connection is upgraded from HTTP to a WebSocket connection, and data can be sent as WebSocket frames, in both directions, between client and server.

This handshake process is necessary because it allows servers to handle both WebSocket and HTTP traffic on the same URL or port, which is especially useful if you want to add real-time features (which WebSockets are great for) to an existing HTTP-based application. It's also an integral part of the WebSocket protocol as defined in its specification.
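The Sec-WebSocket-Accept value in the 101 response above is derived from the client's Sec-WebSocket-Key as specified in RFC 6455: append a fixed GUID, SHA-1 hash the result, and base64-encode it. A quick sketch in Node:

```typescript
import { createHash } from 'node:crypto';

// RFC 6455: Sec-WebSocket-Accept = base64(SHA-1(key + fixed GUID))
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

function secWebSocketAccept(key: string): string {
  return createHash('sha1').update(key + WS_GUID).digest('base64');
}

// Using the Sec-WebSocket-Key from the handshake example above:
secWebSocketAccept('x3JJHMbDL1EzLkh9GBhXDw==');
// → 'HSmrc0sMlYUkAGmm5OPpG2HaGWk='
```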

I'm working on the PKE reactivity system

And I found that the websocket transport actually means there are 3 layers of authentication possible.

  1. mTLS at the initial connection
  2. The HTTP request/response handshake, which can carry HTTP-related authentication
  3. Websocket payload authentication - this appears to be what we are doing right now inside JSONRPC

In relation to our RPC system, which is transport agnostic, it does make sense that our authentication at the RPC layer would have to be in the layer-3 payload protocol, because it's not aware of what the transport is. Same with QUIC, where it's just a QUIC stream, but the QUIC connection may have done mTLS earlier.

It does mean that when you're doing the RPC authentication, setting it up as a "basic token" format as per the discussion on the PR is sort of unnecessary.

Since in that case we are simulating an HTTP request header, which is not required anymore.

I think in the case of using websockets as a transport layer, we would skip layer 2, because QUIC doesn't have it as a common denominator, but there's always layer 1.

Did you ensure that our current PK clients communicate over WSS to the PK agent server?

Putting our client service auth into layer 3 means that the transport layer already establishes a websocket connection, going from TCP to HTTP to WebSockets. Note that websocket frames are always binary.

If they fail authentication we need to close the stream gracefully.

Regarding graceful websocket closing on the server side. I see a pattern like this (this is taken out of PKE's websocket handler):

    // A SocketStream is a Duplex stream with the `socket` object
    wsHandler: (conn: SocketStream, request) => {
      // This enables the websocket handler on
      // `enterprise.polykey.com/`
      // AND also on
      // `org1.org2.enterprise.polykey.com/`.
      // At some point we need to do authentication
      // And this is just a demonstration for now

      logger.info('Websocket Connection Received');
      // If this is false, then we actually don't need to end auto
      // But if it is true, we do need to!
      logger.info(`Websocket Half Open ${conn.allowHalfOpen}`);

      conn.on('drain', () => {
        logger.info(`Resume websocket conn`);
        conn.resume();
      });

      conn.on('data', (msg: Buffer) => {

        console.log('INPUT DATA', JSON.stringify(msg.toString('utf-8')));

        // If this is true, it is fully drained, otherwise backpressure!
        const drained = conn.write(
          JSON.stringify({
            message: 'hello world'
          }) + '\n'
        );
        logger.info(`Wrote to websocket conn, do we need to drain ${drained}, buffer size: ${conn.socket.bufferedAmount}`);
        if (!drained) {
          logger.info(`Pausing websocket conn`);
          conn.pause();
        }
      });
      conn.on('end', () => {
        // This is only necessary if the socket has "allowHalfOpen"
        logger.info('Websocket Connection Ended');
        // We must end the connection!
        conn.end();
      });
    }

Notice that upon receiving 'end', we call end(); this is because allowHalfOpen defaults to true.

We dealt with these situations earlier when we were working on network/Proxy.ts. At the same time, when we are ending, like conn.end(), we are supposed to wait for the other side to also end for a proper termination handshake. However, we can "timeout" this wait and close/destroy if they don't send us back that end. The point is that FIN packets are supposed to be exchanged in all cases.

Notice that on the server side, handling backpressure is quite easy. It's just a matter of writing, checking the drained boolean, and if it's false, pausing the connection's readable side; then once the 'drain' event fires, we can resume the readable side.

This is because this is just an echo handler, where the writable data is being generated in response to the readable data.

If the writable data isn't being generated in response to readable data, then the backpressure has to push back all the way to whoever is calling the write function. That's usually done just by providing a promise that the caller should be awaiting.

So the above example is a better demonstration of how to actually apply backpressure when reading from the websocket stream.
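Pushing backpressure back to the caller, as described above, can be sketched as a write helper that returns a promise which resolves only once the socket has drained. A minimal sketch; DuplexLike and the helper's name are illustrative, not the actual PKE or js-ws API:

```typescript
import { EventEmitter } from 'node:events';

// `conn` is any Duplex-like object whose `write` returns false when its
// internal buffer is full, and which emits 'drain' once it empties again.
// This shape is an assumption for illustration.
type DuplexLike = EventEmitter & { write(chunk: string | Buffer): boolean };

function writeWithBackpressure(conn: DuplexLike, chunk: string | Buffer): Promise<void> {
  const drained = conn.write(chunk);
  if (drained) return Promise.resolve();
  // Buffer is full: make the caller await until 'drain' fires
  return new Promise((resolve) => conn.once('drain', resolve));
}
```

The caller then simply does `await writeWithBackpressure(conn, data)` in its write loop, and the backpressure propagates naturally through the awaits.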

However I believe you should have this already done when converting websocket connections to our web stream abstraction right?

I haven't had the time to fully review your entire websocket transport implementation and the corresponding QUIC integration yet, so we're just moving ahead to the 6th deployment of the testnet and seeing if any issues pop up in production, but these things should have been detailed in the spec during implementation.

This may lead to factoring out both the RPC to js-rpc and the WS to js-ws.

As per discussion in PKE, js-ws might just be a wrapper around ws combined with a web stream IO abstraction around the websocket object. This enables us to re-use both js-ws and js-rpc inside PKE.

In the future the underlying ws can be swapped out in js-ws for a runtimeless implementation that can be compatible with mobile operating systems.

Right now neither ws nor uws natively support websocket over HTTP2.

Without this, right now all websocket connections require an independent TCP socket, an HTTP request, and an HTTP 101 response before getting a websocket connection. A WSS connection also requires TLS.

This adds quite significant overhead to creating new websocket connections especially since all of our RPC transactions are one to one for websocket connections.

Now it turns out that Node.js actually supports the necessary protocol behaviour for HTTP2 to support this. The Chrome browser also supports native websocket over HTTP2. However, Firefox appears to have disabled it.

But the main problem is ws, which would be used for now for PK CLI to PK Agent communication, or any usage of PolykeyClient.

This is unlikely to ever be done by ws as you can see in this thread: websockets/ws#1458 (comment)

However I found https://github.com/szmarczak/http2-wrapper/blob/master/examples/ws/index.js that appears to do it. It uses node's http2 module, and then wraps it in such a way that websockets will use HTTP2 streams. If this hack is sufficient, it would be possible to encapsulate this hack inside js-ws when we extract the websocket transport out.

It's also possible that by the time ws gets around to doing it over http2, ws over http3 would be the main thing: https://www.rfc-editor.org/rfc/rfc9220.html.

For js-ws, which we intend to use in Polykey, it will primarily focus on the Node runtime, meaning the server side. On the browser side it doesn't make sense to provide anything except an abstraction of the WebSocket to a web stream; that is, we would want to create a class WebSocketStream to replicate QUICStream in js-quic. That way js-ws could still be used in the browser, but the underlying websocket implementation would just be whatever the browser provides.
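On the browser side, such a WebSocketStream could be little more than a pair of web streams over the socket. A minimal sketch, where WebSocketLike is an assumed stand-in for the browser WebSocket (with binaryType = 'arraybuffer') and error/abort handling is elided; this is not the actual js-ws API:

```typescript
// Sketch of wrapping a WebSocket-like object into web streams, in the
// spirit of QUICStream in js-quic. Illustrative only.
interface WebSocketLike {
  send(data: Uint8Array): void;
  close(): void;
  onmessage: ((ev: { data: Uint8Array }) => void) | null;
  onclose: (() => void) | null;
}

class WebSocketStream {
  public readable: ReadableStream<Uint8Array>;
  public writable: WritableStream<Uint8Array>;

  constructor(ws: WebSocketLike) {
    this.readable = new ReadableStream<Uint8Array>({
      start(controller) {
        // Incoming frames are enqueued into the readable side
        ws.onmessage = (ev) => controller.enqueue(ev.data);
        ws.onclose = () => {
          try {
            controller.close();
          } catch {
            // already closed
          }
        };
      },
    });
    this.writable = new WritableStream<Uint8Array>({
      // Outgoing chunks are sent as websocket frames
      write(chunk) {
        ws.send(chunk);
      },
      close() {
        ws.close();
      },
    });
  }
}
```

With this shape, RPC code only ever sees a readable/writable pair and never the WebSocket itself, which is what makes the transport swappable.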

Full alignment is not possible until #2 is achieved.

WebSocketStream can be done later. Maybe we can do it this month, not sure.

Instead this epic will focus only on what's left here. Further WS work will occur in js-ws.

Since you removed clientMaxReadableStreamBytes in MatrixAI/Polykey#535, does that mean backpressure has been tested?

It's a bit tricky to test since it's an internal implementation detail. But I did see back-pressure happening while testing. I can double check and attempt to make a test for it.

A test should be made to check whether it applies on both ends. @amydevs can do this during the WS extraction.

@amydevs please have this and #4 fixed by #5, and please comment here on how this was solved based on today's discussion.

The backpressure has been revised, specced, and implemented in #5. Instead of calling .pause on ws.WebSocket, we are doing backpressure per stream and forgoing backpressure on the entire connection. The reason for this is that 'pausing' the entire WebSocketConnection would unnecessarily throttle all WebSocketStreams, when backpressure should only need to be applied per-stream.

Done with #5.