ossrs / srs

SRS is a simple, high-efficiency, real-time video server supporting RTMP, WebRTC, HLS, HTTP-FLV, SRT, MPEG-DASH, and GB28181.

Home Page:https://ossrs.io


Cluster: Origin Cluster for Fault Tolerance and Load Balance.

winlinvip opened this issue · comments

Currently, there can only be one origin server. When multiple edges connect to multiple origin servers, only one origin server can be selected at a given moment. Therefore, if a stream is sent to two out of N (N>=3) origin servers, such as for hot backup, there will always be one origin server without a stream. If an edge connects to this origin server, it will result in no stream. The edge will have to wait because it cannot know that the origin server does not have this stream.

From the perspective of hot backup and load balancing, it is necessary to support multiple origin servers. These origin servers need to communicate and synchronize their states. This way, when an edge connects to an origin server without a stream, the origin server can inform the edge of the correct origin server.

TRANS_BY_GPT3

Just now, when I was taking a dump, I thought of a simple solution for the origin server cluster. It needs no centralized data system and relies on the client (the edge) to establish the stream information.

For example, with three origin servers: when edge server EdgeA requests a stream from origin server A and A does not have it, EdgeA immediately tries origin server B. If origin server B has the stream, EdgeA pulls from origin server B and also tells origin server A where the stream is. This completes the exchange of information, so the origin server system itself stays stateless.

Step 1: Client request Origin A, 404 Not Found.
+-------+           +---------+
| EdgeA +------->---+ OriginA |
+-------+           +---------+


Step 2: Client request Origin B, 200 OK.
+-------+           +---------+
| EdgeA +------->---+ OriginB |
+-------+           +---------+

Step 3: Client notify Origin A where the stream is.
+-------+           +---------+
| EdgeA +------->---+ OriginA |
+-------+           +---------+

In this way, when another edge server, EdgeB, connects to origin server A, origin server A already knows that the stream is on origin server B, so it gives EdgeB a 302 redirect to origin server B. Once the information is established, other edge servers need only a single 302 redirect to learn which origin server has the stream.

Step 1: Client request Origin A, 302 to Origin B.
+-------+           +---------+
| EdgeB +------->---+ OriginA |
+-------+           +---------+


Step 2: Client request Origin B, 200 OK.
+-------+           +---------+
| EdgeB +------->---+ OriginB |
+-------+           +---------+

If the edge server finds that the stream does not exist on the second origin server, it informs the first origin server and starts the polling process again. In the worst case the edge has to poll all the origin servers, but this is fast because the network between the edge server and the origin servers is generally very good.

Whichever origin server crashes, or if the stream is published to a different origin server, the information is rebuilt in the same way, and this process does not require synchronizing all the origin servers.
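The discovery scheme described above can be sketched as a small in-memory simulation. This is only an illustration of the idea, not SRS code: the `Origin` class, the `locate` function, the per-origin hint table, and the use of 200/302/404 as return values are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Origin:
    """Hypothetical origin server: its own streams plus hints left by edges."""
    name: str
    streams: set = field(default_factory=set)
    hints: dict = field(default_factory=dict)  # stream -> origin believed to have it

    def request(self, stream):
        """Return (status, redirect_target) for a play request."""
        if stream in self.streams:
            return 200, None
        if stream in self.hints:
            return 302, self.hints[stream]
        return 404, None


def locate(origins, first, stream):
    """Edge-side lookup: ask `first`, follow a 302, otherwise poll the others.

    Returns the name of the origin that has the stream, or None.
    """
    by_name = {o.name: o for o in origins}
    entry = by_name[first]
    status, target = entry.request(stream)
    if status == 200:
        return first
    if status == 302:
        if by_name[target].request(stream)[0] == 200:
            return target
        # Stale hint: tell the first origin, then fall back to polling.
        entry.hints.pop(stream, None)
    for origin in origins:
        if origin.name != first and origin.request(stream)[0] == 200:
            entry.hints[stream] = origin.name  # Step 3: notify the first origin
            return origin.name
    return None


# Example: the stream is published to B only.
a, b, c = Origin("A"), Origin("B", streams={"live/livestream"}), Origin("C")
print(locate([a, b, c], "A", "live/livestream"))  # EdgeA polls and finds B
print(a.request("live/livestream"))               # EdgeB would now get a 302 to B
```

Each origin only remembers hints that edges have given it, so nothing has to be synchronized between origins; a stale hint costs one extra request before the edge falls back to polling.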


Looking forward to everyone submitting PR.


What PR are you expecting? Seriously~
If you expect PR to work, pigs will fly. Get out of the way, let me handle it myself.


Figure it out first, then talk about it.


When the origin server fails over to the hot standby, will the user experience any lag?

If the streamer is publishing to origin server node A, node A goes down, and the publishing then switches to node B, will the user perceive any lag during this origin server switch?


Check the network quality and buffering settings on the playback side.

If the player buffers a few seconds of data, the user side may not experience lag, but there may be a jump in the picture. If the player side does not buffer any data, the picture will freeze first and then continue when data is received.


Config for 19350:

listen              19350;
max_connections     1000;
daemon              off;
srs_log_tank        console;
pid                 ./objs/origin.cluster.serverA.pid;
http_api {
    enabled         on;
    listen          9090;
}
vhost __defaultVhost__ {
    cluster {
        mode            local;
        origin_cluster  on;
        coworkers       127.0.0.1:9091;
    }
}

Config for 19351:

listen              19351;
max_connections     1000;
daemon              off;
srs_log_tank        console;
pid                 ./objs/origin.cluster.serverB.pid;
http_api {
    enabled         on;
    listen          9091;
}
vhost __defaultVhost__ {
    cluster {
        mode            local;
        origin_cluster  on;
        coworkers       127.0.0.1:9090;
    }
}
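Note that in the two configs above, `coworkers` holds the other origin's `http_api` address (9090 and 9091), not its RTMP `listen` port. A minimal sanity check for that, assuming the configs are described by hand as Python dictionaries (the `ORIGINS` layout and the `check_coworkers` helper are hypothetical and do not parse real SRS config files):

```python
# Hand-written description of the two configs above (not parsed from files).
ORIGINS = [
    {"name": "serverA", "rtmp": "127.0.0.1:19350", "api": "127.0.0.1:9090",
     "coworkers": ["127.0.0.1:9091"]},
    {"name": "serverB", "rtmp": "127.0.0.1:19351", "api": "127.0.0.1:9091",
     "coworkers": ["127.0.0.1:9090"]},
]


def check_coworkers(origins):
    """Return a list of problems; an empty list means every coworker
    address matches the HTTP API address of some origin in the cluster."""
    api_addrs = {o["api"] for o in origins}
    rtmp_addrs = {o["rtmp"] for o in origins}
    problems = []
    for o in origins:
        for peer in o["coworkers"]:
            if peer in api_addrs:
                continue
            if peer in rtmp_addrs:
                problems.append(f"{o['name']}: {peer} is an RTMP port, not an API port")
            else:
                problems.append(f"{o['name']}: {peer} is not a known API address")
    return problems


print(check_coworkers(ORIGINS))
```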

Publish stream to 19350:

./objs/ffmpeg/bin/ffmpeg -re -i doc/source.200kbps.768x320.flv -c copy \
    -f flv -y rtmp://127.0.0.1:19350/live/livestream

Then play the stream from 19351; the player will be redirected to 19350.


Logs on 19351, redirect client to 19350:

[2018-02-16 16:08:29.641][Trace][68305][106] RTMP client ip=::ffff:127.0.0.1, fd=9
[2018-02-16 16:08:29.643][Trace][68305][106] complex handshake success
[2018-02-16 16:08:29.643][Trace][68305][106] connect app, tcUrl=rtmp://127.0.0.1:19351/live, pageUrl=http://www.ossrs.net:8085/players/srs_player.html?vhost=www.ossrs.net&stream=livestream&autostart=false, swfUrl=http://www.ossrs.net:8085/players/srs_player/release/srs_player.swf?_version=1.31, schema=rtmp, vhost=127.0.0.1, port=19351, app=live, args=null
[2018-02-16 16:08:29.694][Trace][68305][106] client identified, type=Play, vhost=127.0.0.1, app=live, stream_name=livestream, duration=-1.00
[2018-02-16 16:08:29.694][Trace][68305][106] connected stream, tcUrl=rtmp://127.0.0.1:19351/live, pageUrl=http://www.ossrs.net:8085/players/srs_player.html?vhost=www.ossrs.net&stream=livestream&autostart=false, swfUrl=http://www.ossrs.net:8085/players/srs_player/release/srs_player.swf?_version=1.31, schema=rtmp, vhost=__defaultVhost__, port=19351, app=live, stream=livestream, args=null
[2018-02-16 16:08:29.694][Trace][68305][106] source url=/live/livestream, ip=::ffff:127.0.0.1, cache=1, is_edge=0, source_id=-1[-1]
[2018-02-16 16:08:29.695][Trace][68305][106] http: on_hls ok, url=http://127.0.0.1:9090/api/v1/clusters?vhost=__defaultVhost__&ip=127.0.0.1&app=live&stream=livestream, response={"code":0,"data":{"query":{"ip":"127.0.0.1","vhost":"__defaultVhost__","app":"live","stream":"livestream"},"origin":{"ip":"127.0.0.1","port":19350,"vhost":"__defaultVhost__","api":"127.0.0.1:9090","routers":["127.0.0.1:9090"]}}}
[2018-02-16 16:08:29.695][Trace][68305][106] rtmp: redirect in cluster, url=http://127.0.0.1:9090/api/v1/clusters?vhost=__defaultVhost__&ip=127.0.0.1&app=live&stream=livestream, target=127.0.0.1:19350
[2018-02-16 16:08:29.721][Trace][68305][106] client finished.
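The log above shows origin 19351 asking its coworker's HTTP API where the stream is and redirecting the client to the `origin` address in the reply. A minimal sketch of that lookup, with no network I/O: the URL path, query parameters, and response body are copied from the log, while the function names are hypothetical. The thread does not show what the reply looks like when the coworker has no such stream, so the `None` branches are assumptions.

```python
import json
from urllib.parse import urlencode


def cluster_query_url(coworker, vhost, ip, app, stream):
    """Build the coworker query URL in the shape seen in the log."""
    query = urlencode({"vhost": vhost, "ip": ip, "app": app, "stream": stream})
    return f"http://{coworker}/api/v1/clusters?{query}"


def redirect_target(response_text):
    """Return 'ip:port' of the origin that has the stream, or None."""
    body = json.loads(response_text)
    if body.get("code") != 0:
        return None  # assumption: a non-zero code means the query failed
    origin = body.get("data", {}).get("origin") or {}
    if "ip" not in origin or "port" not in origin:
        return None  # assumption: no usable origin means nothing to redirect to
    return f"{origin['ip']}:{origin['port']}"


# Response body copied from the log line above.
SAMPLE = (
    '{"code":0,"data":{"query":{"ip":"127.0.0.1","vhost":"__defaultVhost__",'
    '"app":"live","stream":"livestream"},"origin":{"ip":"127.0.0.1","port":19350,'
    '"vhost":"__defaultVhost__","api":"127.0.0.1:9090","routers":["127.0.0.1:9090"]}}}'
)

print(cluster_query_url("127.0.0.1:9090", "__defaultVhost__", "127.0.0.1", "live", "livestream"))
print(redirect_target(SAMPLE))
```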

We can also start an edge server, which will follow the RTMP 302 redirect. The edge config:

listen              1935;
max_connections     1000;
pid                 objs/edge.pid;
daemon              off;
srs_log_tank        console;
vhost __defaultVhost__ {
    cluster {
        mode            remote;
        origin          127.0.0.1:19351;
    }
}

Remark: The edge will try to fetch stream from 19351, then it'll be redirected to 19350.

The config for origin 19350:

listen              19350;
max_connections     1000;
daemon              off;
srs_log_tank        console;
pid                 ./objs/origin.cluster.serverA.pid;
http_api {
    enabled         on;
    listen          9090;
}
vhost __defaultVhost__ {
    cluster {
        mode            local;
        origin_cluster  on;
        coworkers       127.0.0.1:9091;
    }
}

The config for origin 19351:

listen              19351;
max_connections     1000;
daemon              off;
srs_log_tank        console;
pid                 ./objs/origin.cluster.serverB.pid;
http_api {
    enabled         on;
    listen          9091;
}
vhost __defaultVhost__ {
    cluster {
        mode            local;
        origin_cluster  on;
        coworkers       127.0.0.1:9090;
    }
}

Then publish to origin 19350:

./objs/ffmpeg/bin/ffmpeg -re -i doc/source.200kbps.768x320.flv -c copy \
        -f flv -y rtmp://127.0.0.1:19350/live/livestream

Then start a player to play the stream from the edge.


The log on the edge server, which connects to 19351 but is redirected to 19350:

[2018-02-16 16:24:36.844][Trace][68543][107] RTMP client ip=::ffff:127.0.0.1, fd=8
[2018-02-16 16:24:36.847][Trace][68543][107] complex handshake success
[2018-02-16 16:24:36.847][Trace][68543][107] connect app, tcUrl=rtmp://127.0.0.1:1935/live, pageUrl=http://www.ossrs.net:8085/players/srs_player.html?app=live&stream=livestream&server=127.0.0.1&port=1935&autostart=true&vhost=127.0.0.1, swfUrl=http://www.ossrs.net:8085/players/srs_player/release/srs_player.swf?_version=1.31, schema=rtmp, vhost=127.0.0.1, port=1935, app=live, args=null
[2018-02-16 16:24:36.902][Trace][68543][107] client identified, type=Play, vhost=127.0.0.1, app=live, stream_name=livestream, duration=-1.00
[2018-02-16 16:24:36.902][Trace][68543][107] connected stream, tcUrl=rtmp://127.0.0.1:1935/live, pageUrl=http://www.ossrs.net:8085/players/srs_player.html?app=live&stream=livestream&server=127.0.0.1&port=1935&autostart=true&vhost=127.0.0.1, swfUrl=http://www.ossrs.net:8085/players/srs_player/release/srs_player.swf?_version=1.31, schema=rtmp, vhost=__defaultVhost__, port=1935, app=live, stream=livestream, args=null
[2018-02-16 16:24:36.903][Trace][68543][107] source url=/live/livestream, ip=::ffff:127.0.0.1, cache=1, is_edge=1, source_id=-1[-1]
[2018-02-16 16:24:36.903][Trace][68543][107] dispatch cached gop success. count=0, duration=-1
[2018-02-16 16:24:36.903][Trace][68543][107] create consumer, queue_size=30.00, jitter=1
[2018-02-16 16:24:36.903][Trace][68543][107] ignore disabled exec for vhost=__defaultVhost__
[2018-02-16 16:24:36.903][Trace][68543][107] mw changed sleep 350=>350, max_msgs=128, esbuf=218750, sbuf 146988=>109375, realtime=0
[2018-02-16 16:24:36.903][Trace][68543][107] start play smi=0.00, mw_sleep=350, mw_enabled=1, realtime=0, tcp_nodelay=0
[2018-02-16 16:24:36.904][Trace][68543][107] update source_id=108[108]
[2018-02-16 16:24:36.904][Trace][68543][107] -> PLA time=0, msgs=0, okbps=0,0,0, ikbps=0,0,0, mw=350
[2018-02-16 16:24:36.907][Trace][68543][108] complex handshake success.
[2018-02-16 16:24:36.907][Trace][68543][108] connected, dsu=1
[2018-02-16 16:24:36.908][Trace][68543][108] edge change from 100 to state 101 (pull).
[2018-02-16 16:24:36.910][Warn][68543][108][35] RTMP redirect 127.0.0.1:19351 to 127.0.0.1:19350 stream=

Fixed.

Please help to test this feature.

WIKI:

https://github.com/ossrs/srs/wiki/v3_CN_OriginCluster

https://github.com/ossrs/srs/wiki/v3_EN_OriginCluster


Thank you. Happy New Year.


The design goal of Origin Cluster is a cluster with fewer than 5k streams, or disaster recovery for a small number of streams. If you need a cluster with 100k streams, please refer to #1607 (comment).

In this solution, the origin servers access one another, which means that each origin server is an independent service. Since each origin server must be reachable by the edges and by the other origin servers, each one needs its own service address. There are two ways to achieve this:

  • Stateless origin server cluster: 1~3 origin servers, each requiring the creation of a separate Deployment and Service. The advantage is that it is stateless and does not require interconnection, resulting in higher stability.
  • Stateful origin server cluster: 3~30 origin servers, only requiring the creation of a single StatefulSet and Service. The advantage is that it has a simple configuration, but the downside is that managing state can be complex. After creation, only a few fields such as Replicas, Template, and UpdateStrategy can be updated.

In both of the above scenarios, it is necessary to configure the "coworkers" for the origin server and the "origin" for the edge server, including the addresses of all the origin servers.

Stateless Origin Server Cluster (Deployment)

Suitable for very few streams, such as <100 streams, 1-3 origin servers.

The SRS origin server cluster is itself stateful: a given stream must be requested from the specific server that holds it and cannot be pulled from an arbitrary server. So we cannot attach multiple origin servers behind one SLB (Server Load Balancer): when playing a stream, the SLB would select an origin server at random, possibly one that does not hold the stream, because the stream's state and data are located on one specific origin server.

So, when we talk about a stateless origin server cluster here, it refers to the deployment of the origin server cluster in the form of a stateless application. Since each origin server requires an independent deployment, each deployment has only one replica, and each deployment corresponds to a service (ClusterIP) with a unique name. In reality, it is equivalent to having only one origin server behind the SLB, for example:

| Origin Server | Deployment | Service | Domain |
| --- | --- | --- | --- |
| Origin Server 0 | origin-0-deploy | origin-0-service | origin-0-service |
| Origin Server 1 | origin-1-deploy | origin-1-service | origin-1-service |
| Origin Server 2 | origin-2-deploy | origin-2-service | origin-2-service |
| Origin Server N | origin-N-deploy | origin-N-service | origin-N-service |

Note: For the deployment instances of stateless clusters, refer to the Wiki.

Create a separate Deployment for each origin server with Replicas set to 1. Create a corresponding Service with ClusterIP type. This approach may be a bit cumbersome, but it will be easier to migrate to OCM (#1607) in the future. The origin server will be addressed using the service-name instead of the pod-name.service-name method.

Stateful Origin Server Cluster (StatefulSet)

Suitable for a certain number of streams, such as <5k streams, and within 5-30 origin servers.

In K8s, each origin server requires a corresponding Service address, which can be achieved by using a StatefulSet with a headless Service so that each origin server Pod is individually addressable. For example:

| Origin Server | StatefulSet | Service | Domain |
| --- | --- | --- | --- |
| Origin Server 0 | origin | service | origin-0.service |
| Origin Server 1 | origin | service | origin-1.service |
| Origin Server 2 | origin | service | origin-2.service |
| Origin Server N | origin | service | origin-N.service |

Note: For deployment instances of stateful clusters, refer to the Wiki.

Just create one StatefulSet and one Service, and set the Replicas to the number of origin servers.

The origin servers are configured as coworkers: origin-0.service origin-1.service origin-2.service;
The edge servers are configured with all or some of the origin servers: origin origin-0.service origin-1.service origin-2.service;

It can be seen that it will indeed be cumbersome, and when adding new origin servers, it is necessary to update the configurations of other origin servers as well as the edge servers. This solution is suitable for up to 30 origin server nodes.
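Because the Pod addresses of a StatefulSet follow a fixed pattern, the `coworkers` and `origin` lines can be generated rather than edited by hand whenever the replica count changes. A minimal sketch, assuming the naming from the table above (StatefulSet `origin`, Service `service`) and addresses without explicit ports, as written in this comment; the helper names are hypothetical:

```python
def pod_addresses(statefulset, service, replicas):
    """Addresses like origin-0.service, origin-1.service, ..."""
    return [f"{statefulset}-{i}.{service}" for i in range(replicas)]


def coworkers_line(addresses):
    """Config line for every origin server: all origins as coworkers."""
    return "coworkers " + " ".join(addresses) + ";"


def edge_origin_line(addresses):
    """Config line for the edge servers: all (or a subset of) origins."""
    return "origin " + " ".join(addresses) + ";"


addrs = pod_addresses("origin", "service", 3)
print(coworkers_line(addrs))
print(edge_origin_line(addrs))
```

Regenerating both lines from the replica count keeps the origin and edge configs consistent, but the servers still have to pick up the updated configuration when the cluster grows.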


For an origin server cluster solution supporting fewer than 5k streams, please refer to: #464 (comment)

For an origin server cluster solution supporting fewer than 100k streams, please refer to: #1607 (comment)

Regarding the definition of the service address for the origin server in the origin server cluster, please refer to: #1501 (comment)

Regarding the round-robin issue with multiple node origin servers in the origin server cluster, please refer to: #1501 (comment)

Regarding the storage issue with the origin server cluster, please refer to: #1595 (comment)

Regarding the API issue with the origin server cluster, please refer to: #1607 (comment)


In the example of StatefulSet in K8s, there is an example of deploying a Cassandra cluster, which is a type of KV storage. Since the names and addresses of the Pods are fixed, the first one is chosen as the SeedNode, which means that all the nodes will gossip with this node.

          - name: CASSANDRA_SEEDS
            value: "cassandra-0.cassandra.default.svc.cluster.local"

Note: Here is an article introducing Cassandra: https://www.cnblogs.com/loveis715/p/5299495.html

Simply put, Cassandra is also a cluster composed of a group of nodes, but it has a more complex communication mechanism that distinguishes roles such as SeedNode.

The Origin Cluster does not want to implement such a complex logic. The future direction is to solve this part of the mechanism by relying on peripheral services through the HTTP API. For example, Go can be used to implement OCM (#1607), and Go can rely on KV to solve the central data storage problem.


To make deployment in K8s easy, OriginCluster needs to allow every origin server to use the same configuration. An origin server may then end up accessing itself; this causes no issues, but it should be optimized. Refer to #1608.


For service upgrades, rollbacks, and canary (gray) releases, the origin server or origin server cluster can simply be restarted, or restarted in batches to reduce the impact. This works mainly because the origin servers generally have edges in front of them as proxies, and an edge retries after a disconnection, so the impact on users is minimal. Reference: #1579 (comment)
