Is the signaling server needed after a full WebRTC/SCTP mesh is established?

Question

bonsairobo opened this issue a year ago · comments

It seems like the answer is "yes" based on my reading of the code (at least for matchbox_server), but theoretically it seems unnecessary. Perhaps it is necessary for peers to re-connect after getting disconnected, although I bet the public IP discovered through ICE does not change too quickly.

The reason I ask is that I'm trying to plan out the cloud infrastructure for the signaling server. It would be nice if it could live as short as possible to save money. Maybe one of the devs here could recommend a strategy for this.

Garry O'Donnell · Answer 1 · Wed Jul 05 2023 06:03:17 GMT+0800 (China Standard Time)

Your reading of the code is correct. Broadly speaking, the signalling server has two purposes:

To add new peers to a topology (full-mesh, client-server etc.)
To inform peers of disconnects
To perform this role it must be available for the lifetime of the network. Even if we are to assume the ICE information does not change rapidly, there needs to be some way to retrieve this from the remote peer - this is implemented using the signalling server as a relay.
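The relay role described above can be sketched as a small in-memory model. This is only an illustration under assumptions: `Room`, `Event`, and `PeerId` are hypothetical names, not matchbox_server's actual types, and a real server would deliver these events over WebSocket connections rather than return them from function calls.

```rust
use std::collections::BTreeSet;

/// Hypothetical peer identifier; a real server would likely use something like a UUID.
type PeerId = u32;

/// Messages the relay would deliver to a connected peer.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    /// A new peer joined; existing peers can start a handshake with it.
    NewPeer(PeerId),
    /// A peer's signalling connection closed.
    PeerLeft(PeerId),
    /// An opaque offer/answer/ICE payload forwarded from another peer.
    Signal { from: PeerId, data: String },
}

/// In-memory model of one full-mesh room.
#[derive(Default)]
struct Room {
    peers: BTreeSet<PeerId>,
}

impl Room {
    /// Purpose 1: add a peer to the topology.
    /// Returns the (recipient, event) pairs the server would send out.
    fn join(&mut self, id: PeerId) -> Vec<(PeerId, Event)> {
        let out: Vec<(PeerId, Event)> = self
            .peers
            .iter()
            .map(|&p| (p, Event::NewPeer(id)))
            .collect();
        self.peers.insert(id);
        out
    }

    /// Purpose 2: inform the remaining peers of a disconnect.
    fn leave(&mut self, id: PeerId) -> Vec<(PeerId, Event)> {
        if !self.peers.remove(&id) {
            return Vec::new();
        }
        self.peers.iter().map(|&p| (p, Event::PeerLeft(id))).collect()
    }

    /// Forward handshake data between two peers that are both still in the room.
    fn relay(&self, from: PeerId, to: PeerId, data: &str) -> Option<(PeerId, Event)> {
        if self.peers.contains(&from) && self.peers.contains(&to) {
            Some((to, Event::Signal { from, data: data.to_string() }))
        } else {
            None
        }
    }
}
```

Calling `join(2)` on a room that already contains peer 1 yields a single `NewPeer(2)` event addressed to peer 1. Both `join` and `relay` need the server to be reachable at the moment a peer appears, which is why it has to outlive the initial handshake.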

If I were you, I would look into scaling to zero when no session is active (the vast majority of the time for a small game). Most cloud vendors have some support for doing this. In K8s it can be done with KEDA.
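One way a server process could cooperate with scale-to-zero is to track how long it has had no connected peers and report when a grace period has elapsed. The sketch below assumes a hypothetical `IdleTracker` type and a caller-supplied clock; how the result is fed to KEDA or a cloud vendor's autoscaler is outside its scope.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: tracks whether the server has been idle long enough
/// to be shut down (or scaled to zero by an external autoscaler).
struct IdleTracker {
    active_peers: usize,
    /// Set while no peers are connected; `None` while a session is active.
    idle_since: Option<Instant>,
    grace: Duration,
}

impl IdleTracker {
    /// A fresh server has no peers, so it starts out idle at `now`.
    fn new(grace: Duration, now: Instant) -> Self {
        Self { active_peers: 0, idle_since: Some(now), grace }
    }

    fn peer_connected(&mut self) {
        self.active_peers += 1;
        self.idle_since = None;
    }

    fn peer_disconnected(&mut self, now: Instant) {
        self.active_peers = self.active_peers.saturating_sub(1);
        // Start the idle clock only on the transition from active to idle.
        if self.active_peers == 0 && self.idle_since.is_none() {
            self.idle_since = Some(now);
        }
    }

    /// True once the server has had no peers for at least the grace period.
    fn should_shut_down(&self, now: Instant) -> bool {
        match self.idle_since {
            Some(since) => now.saturating_duration_since(since) >= self.grace,
            None => false,
        }
    }
}
```

The grace period keeps the server alive briefly after the last peer leaves, so a quick rejoin does not pay a cold-start delay.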

There are two slightly out-there ideas I have had on my mind with respect to minimising user effort around signalling, though this is my first time mentioning them and they may be immediately shot down (@johanhelsing and @simbleau):

Implementation of a signalling-server-less connection protocol where we ask users to exchange some "invite token" which encodes the necessary information, though we would likely need to respond with another token making this a bit too complicated a flow
We (matchbox devs) host a self service community signalling server where developers can come along and register a game (using GitHub OAuth credentials) and pick from a fixed set of topologies (e.g. full-mesh or client-server) and room strategies (e.g. named or rolling)
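To make the first idea more concrete, here is a minimal sketch of what an "invite token" could look like, assuming it carries a session description plus a list of ICE candidates as plain strings. The `Invite` type, the length-prefixed layout, and the hex encoding are all assumptions made for illustration; nothing like this exists in matchbox.

```rust
/// Hypothetical payload one peer would hand to another out of band.
#[derive(Debug, Clone, PartialEq)]
struct Invite {
    sdp: String,
    candidates: Vec<String>,
}

/// Append one field as a 4-byte big-endian length followed by its bytes.
fn push_field(buf: &mut Vec<u8>, field: &str) {
    buf.extend_from_slice(&(field.len() as u32).to_be_bytes());
    buf.extend_from_slice(field.as_bytes());
}

/// Serialise the invite into a copy-pasteable lowercase hex string.
fn encode(invite: &Invite) -> String {
    let mut buf = Vec::new();
    push_field(&mut buf, &invite.sdp);
    for candidate in &invite.candidates {
        push_field(&mut buf, candidate);
    }
    buf.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Parse a token produced by `encode`; returns `None` on malformed input.
fn decode(token: &str) -> Option<Invite> {
    if token.len() % 2 != 0 || !token.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let bytes: Vec<u8> = (0..token.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&token[i..i + 2], 16).ok())
        .collect::<Option<Vec<u8>>>()?;

    let mut fields = Vec::new();
    let mut rest = bytes.as_slice();
    while !rest.is_empty() {
        if rest.len() < 4 {
            return None;
        }
        let len = u32::from_be_bytes([rest[0], rest[1], rest[2], rest[3]]) as usize;
        let tail = &rest[4..];
        if tail.len() < len {
            return None;
        }
        fields.push(String::from_utf8(tail[..len].to_vec()).ok()?);
        rest = &tail[len..];
    }

    // The first field is the session description; the rest are candidates.
    let mut fields = fields.into_iter();
    let sdp = fields.next()?;
    Some(Invite { sdp, candidates: fields.collect() })
}
```

The inviting peer would call `encode` and share the string. The receiving peer would `decode` it and then have to send back a token of its own containing its answer, which is the extra round trip that makes the flow awkward.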

Duncan · Answer 2 · Wed Jul 05 2023 06:36:12 GMT+0800 (China Standard Time)

@garryod

To perform this role it must be available for the lifetime of the network. Even if we are to assume the ICE information does not change rapidly, there needs to be some way to retrieve this from the remote peer - this is implemented using the signalling server as a relay.

Makes sense. If a disconnected peer forgets ICE candidates or candidates expire, then losing the signaling server is catastrophic.

Implementation of a signalling-server-less connection protocol where we ask users to exchange some "invite token" which encodes the necessary information, though we would likely need to respond with another token making this a bit too complicated a flow

Do you mean the invite tokens would contain a recent set of ICE candidates somehow? This does sound a bit cumbersome without integration into the players' group messaging app.

We (matchbox devs) host a self service community signalling server where developers can come along and register a game (using GitHub OAuth credentials) and pick from a fixed set of topologies (e.g. full-mesh or client-server) and room strategies (e.g. named or rolling)

Sounds like a jackbox.tv SaaS. That could work assuming it is cheaper than the self-hosting alternative.

If I were you, I would look into scaling to zero when no session is active (the vast majority of the time for a small game). Most cloud vendors have some support for doing this. In K8s it can be done with KEDA.

Sounds like a plan. If I or anyone else implements this, it would be nice to put that IaC template in this repo as an example.

Spencer C. Imbleau · Answer 3 · Sun Jul 23 2023 02:46:07 GMT+0800 (China Standard Time)

I never coded anything in matchbox that relates to the underlying WebRTC protocol and data handling. So I blame @johanhelsing and @garryod for that (wonderful) code ;)

Still though, it makes sense to me we wouldn't need it if we don't plan to reconnect as the client.

Seems useful for client/server topology. Does the client really need to stay in context with the signaling server, if they were only meant to ever discover 1 peer (the server)?

If we used a .attempt_reconnections(false) somewhere on the client builder, I don't think there's much use to keep the signaling connection alive, right?
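As a rough sketch of that suggestion: the builder below is entirely hypothetical (`ClientBuilder`, `ClientConfig`, and `keep_signalling_open` are made-up names, and the URL is a placeholder), and it only models the decision of when a client could drop its signalling connection, not any real networking.

```rust
/// Hypothetical client configuration; not part of matchbox's API.
#[derive(Debug, Clone, PartialEq)]
struct ClientConfig {
    room_url: String,
    attempt_reconnections: bool,
}

/// Hypothetical builder exposing the proposed flag.
struct ClientBuilder {
    config: ClientConfig,
}

impl ClientBuilder {
    /// Reconnection is assumed to be on by default in this sketch.
    fn new(room_url: &str) -> Self {
        Self {
            config: ClientConfig {
                room_url: room_url.to_string(),
                attempt_reconnections: true,
            },
        }
    }

    fn attempt_reconnections(mut self, enabled: bool) -> Self {
        self.config.attempt_reconnections = enabled;
        self
    }

    fn build(self) -> ClientConfig {
        self.config
    }
}

/// The signalling connection is still needed while peers remain to be
/// discovered, or afterwards if the client wants to be able to reconnect.
fn keep_signalling_open(
    config: &ClientConfig,
    connected_peers: usize,
    expected_peers: usize,
) -> bool {
    config.attempt_reconnections || connected_peers < expected_peers
}
```

For a client in a client/server topology that expects exactly one peer, `ClientBuilder::new("ws://example.invalid/room").attempt_reconnections(false).build()` would let `keep_signalling_open` return `false` as soon as that one peer is connected.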

Johan Klokkhammer Helsing · Answer 4 · Sun Jul 23 2023 05:12:18 GMT+0800 (China Standard Time)

You can re-enter the ICE gathering stage, I think. We don't handle it, but I think we probably should.

Spencer C. Imbleau · Answer 5 · Sun Jul 23 2023 05:34:42 GMT+0800 (China Standard Time)

ELI5 why we need that? If we didn't implement reconnecting, the user could simply tear down all their assets and restart. Which I think is fine.

Johan Klokkhammer Helsing · Answer 6 · Sun Jul 23 2023 10:59:54 GMT+0800 (China Standard Time)

I don't fully remember/understand, and the documentation is not very good, but iirc: network conditions may change, e.g. a phone switching between Wi-Fi and its carrier network. Take this with a big, big grain of salt, though.