openkruise / kruise-game

Game Servers Management on Kubernetes

Home Page: https://openkruise.io/kruisegame/introduction

Feat | Using KubeVela to manage multiple GameServerSet

Somefive opened this issue

Background

In gaming, servers are usually grouped into partitions. Each partition serves a specific range of players. Within a partition there can be multiple types of servers, such as Battle servers or Scene servers, that provide different services and communicate with the other servers in the same partition.

image

Each type of server can have multiple replicas and is modeled as a GameServerSet. The GameServerSet creates an OpenKruise StatefulSet to bring up pods, and each pod gets an additional GameServer object attached to it, which is used to set the pod's operational state.

image
image

Operating and managing a large number of GameServerSets across different partitions can be laborious. To reduce the burden of repetitive operations, we could introduce KubeVela to model a higher-level application on top of the GameServerSets.

Architecture

In KubeVela, applications are used to model resources and manage their specs and lifecycles. KubeVela also provides delivery pipelines that describe operational actions as code and can be reused.

Specifically, we could model the GameServerSets in each partition as a single KubeVela application and let it manage the desired state and delivery process of the GameServerSets, such as updates. On top of that, for the whole game, we could add another application to manage the partition applications.

image

With this architecture, it is easy to modify the desired state of the GameServerSets by partition or by type (role).

image

Implementation

First, to model the partition application, we need a KubeVela ComponentDefinition that abstracts the GameServerSet. The CUE template below defines how the GameServerSet is formed, which parameters are exposed, and how the health state is evaluated.

"game-server-set": {
	alias: ""
	annotations: {}
	description: "The GameServerSet."
	type:        "component"
    attributes: {
        workload: type: "autodetects.core.oam.dev"
        status: {
            customStatus: #"""
                status: {
                    replicas: *0 | int
                } & {
                    if context.output.status != _|_ {
                        if context.output.status.readyReplicas != _|_ {
                            replicas: context.output.status.readyReplicas
                        }
                    }
                }
                message: "\(context.name): \(status.replicas)/\(context.output.spec.replicas)"
                """#
            healthPolicy: #"""
                status: {
                    replicas: *0 | int
                    generation: *-1 | int
                } & {
                    if context.output.status != _|_ {
                        if context.output.status.readyReplicas != _|_ {
                            replicas: context.output.status.readyReplicas
                        }
                        if context.output.status.observedGeneration != _|_ {
                            generation: context.output.status.observedGeneration
                        }
                    }
                }
                isHealth: (context.output.spec.replicas == status.replicas) && (context.output.metadata.generation == status.generation)
                """#
        }
    }
}

template: {
    parameter: {
        // +usage=The image of the Game Server
        image: string
        // +usage=The number of replicas
        replicas: *1 | int
    }
    output: {
        apiVersion: "game.kruise.io/v1alpha1"
        kind:       "GameServerSet"
        metadata: name: "\(context.name)"
        spec: {
            replicas: parameter.replicas
            // Prefer in-place updates over recreating pods.
            updateStrategy: rollingUpdate: podUpdatePolicy: "InPlaceIfPossible"
            gameServerTemplate: spec: containers: [{
                image: parameter.image
                name:  "\(context.name)"
            }]
        }
    }
}
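
As a rough illustration, a component named partition-1-battle with image nginx:1.17 and 2 replicas (values assumed here for the example) would be expected to render to approximately the following GameServerSet. This is a hand-expanded sketch of the template, not actual controller output; anything KubeVela adds on its own, such as labels or the namespace, is omitted.

```yaml
# Hand-expanded sketch of the game-server-set template output.
# Assumed inputs: context.name = partition-1-battle,
# parameter.image = nginx:1.17, parameter.replicas = 2.
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
metadata:
  name: partition-1-battle
spec:
  replicas: 2
  updateStrategy:
    rollingUpdate:
      podUpdatePolicy: InPlaceIfPossible
  gameServerTemplate:
    spec:
      containers:
        - image: nginx:1.17
          name: partition-1-battle
```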

On top of that, we can define the abstraction for partition applications, named game-server-sets below. game-server-sets is a template that generates a partition application and exposes the configuration of each type of underlying GameServerSet through its parameters.

"game-server-sets": {
    alias: ""
    annotations: {}
    description: "The Game Server Sets of one region."
    type:        "component"
    attributes: {
        workload: type: "autodetects.core.oam.dev"
        status: {
            customStatus: #"""
                status: {
                    phase: *"initializing" | string
                } & {
                    if context.output.status != _|_ {
                        if context.output.status.status != _|_ {
                            phase: context.output.status.status
                        }
                    }
                }
                message: "\(context.name): \(status.phase)"
                """#
            healthPolicy: #"""
                status: {
                    phase: *"initializing" | string
                    generation: *-1 | int
                } & {
                    if context.output.status != _|_ {
                        if context.output.status.status != _|_ {
                            phase: context.output.status.status
                        }
                        if context.output.status.observedGeneration != _|_ {
                            generation: context.output.status.observedGeneration
                        }
                    }
                }
                isHealth: (status.phase == "running") && (status.generation >= context.output.metadata.generation)
                """#
        }
    }
}

template: {

    #GameServerSet: {
        // +usage=The image of the Game Server
        image: string
        // +usage=The number of replicas
        replicas: *1 | int
        // +usage=The dependencies of the Game Server
        dependsOn: *[] | [...string]
    }

    parameter: [string]: #GameServerSet

    output: {
        apiVersion: "core.oam.dev/v1beta1"
        kind: "Application"
        metadata: name: context.name
        spec: {
            components: [for role, gss in parameter {
                name: "\(context.name)-\(role)"
                type: "game-server-set"
                properties: {
                    image: gss.image
                    replicas: gss.replicas
                }

                _dependsOn: [for d in gss.dependsOn {"\(context.name)-\(d)"}]
                if len(_dependsOn) > 0 {
                    dependsOn: _dependsOn
                }
            }]
            workflow: steps: [{
                type: "deploy"
                name: "deploy"
                properties: policies: []
            }]
        }
    }
}

Finally, the application that manages all the partition applications serves as the user interface, as shown below.

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: mmo
  namespace: game
spec:
  components:
    - type: game-server-sets
      name: partition-1
      properties:
        battle:
          image: nginx:1.17
          replicas: 2
        scenes:
          image: nginx:1.20
          replicas: 2
        ai:
          image: nginx:1.21
          replicas: 1
    - type: game-server-sets
      name: partition-2
      dependsOn: ["partition-1"]
      properties:
        battle:
          image: nginx:1.17
          replicas: 3
        scenes:
          image: nginx:1.20
          replicas: 3
        ai:
          image: nginx:1.21
          replicas: 2
    - type: game-server-sets
      name: partition-3
      properties:
        battle:
          image: nginx:1.17
          replicas: 1
        scenes:
          image: nginx:1.20
          replicas: 1
        ai:
          image: nginx:1.21
          replicas: 1
  policies:
    - type: override
      name: global-config
      properties:
        components:
          - properties:
              scenes:
                dependsOn: ["battle"]
    - type: apply-once
      name: apply-once
      properties:
        enable: true
  workflow:
    steps:
      - type: deploy
        name: deploy
        properties:
          policies: ["global-config"]

The dependsOn fields specify the delivery order between partitions and between types of GameServerSets. In the example above, partition-2 is updated after partition-1, and within each partition the Scenes GameServerSet is updated after the Battle GameServerSet.
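
To make the two layers concrete, the partition-1 component would be expected to generate a partition Application roughly like the one below. This is a hand-expanded sketch of the game-server-sets template, assuming the global-config override has already been merged into the component properties; the order of the components and any metadata KubeVela adds are not guaranteed to match.

```yaml
# Sketch of the Application generated for partition-1 (not actual output).
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: partition-1
spec:
  components:
    - name: partition-1-battle
      type: game-server-set
      properties:
        image: nginx:1.17
        replicas: 2
    - name: partition-1-scenes
      type: game-server-set
      properties:
        image: nginx:1.20
        replicas: 2
      # From dependsOn: ["battle"], prefixed with the partition name.
      dependsOn: ["partition-1-battle"]
    - name: partition-1-ai
      type: game-server-set
      properties:
        image: nginx:1.21
        replicas: 1
  workflow:
    steps:
      - type: deploy
        name: deploy
        properties:
          policies: []
```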

  • To start or stop a partition, add or remove a game-server-sets component in the top-level mmo application.
  • To update the image of a specific type of GameServerSet, there are two options. One is to update the image field directly in each game-server-sets component's configuration, which gives fine-grained control over the image of each GameServerSet in each partition. The other is to set the image field in the global-config policy, so that it is configured once and takes effect across all partitions.
  • To scale the replicas of GameServerSets, the approach is similar: set the replicas field in the component properties.
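
As a sketch of the second option, the global-config policy from the mmo application could be extended as follows so that every partition picks up the same Battle image. The tag nginx:1.19 is only a placeholder chosen for this example, and the fragment assumes that an override entry without a component name applies to all components, as it does for the dependsOn setting in the application above.

```yaml
# Fragment of the mmo Application's policies section (illustrative only).
policies:
  - type: override
    name: global-config
    properties:
      components:
        - properties:
            battle:
              image: nginx:1.19   # placeholder tag, applied to every partition
            scenes:
              dependsOn: ["battle"]
```

The deploy workflow step already references global-config, so no change to the workflow section should be needed.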

Orchestrate

Another action is setting the state of the GameServer attached to a pod, to mark the pod's deletion state, priority, and other configuration. This can be done with a KubeVela WorkflowRun. A WorkflowRun is used instead of an Application because GameServer objects are not directly managed by Applications or GameServerSets; they are attached to pods after the pods are created. Managing them out of band is therefore simpler.

To define the operation behaviour, we use the following CUE template. The operate-gs step defines the update operation for a GameServer object: it first reads the GameServer from Kubernetes and then reassembles it with the updated fields.

import (
	"vela/op"
)

"operate-gs": {
	type: "workflow-step"
	description: "Operate GameServer."
}
template: {
    #Operation: {
        deletionPriority?: int
        opsState?: "None" | "WaitToBeDeleted"
        updatePriority?: int
    }

    handle: op.#Steps & {
        for gsName, o in parameter {
            "\(gsName)": op.#Steps & {
                read: op.#Read & {
                    value: {
                        apiVersion: "game.kruise.io/v1alpha1"
                        kind:       "GameServer"
                        metadata: {
                            name: gsName
                            namespace: context.namespace
                        }
                    }
                } @step(1)
                apply: op.#Apply & {
                    value: {
                        for k, v in read.value if k != "spec" {
                            "\(k)": v
                        }
                        if read.value.spec != _|_ {
                            spec: {
                                for k, v in read.value.spec {
                                    if k != "deletionPriority" && k != "opsState" && k != "updatePriority" {
                                        "\(k)": v
                                    }
                                }
                                
                                if o.deletionPriority != _|_ {
                                    deletionPriority: o.deletionPriority
                                }
                                if o.deletionPriority == _|_ && read.value.spec.deletionPriority != _|_ {
                                    deletionPriority: read.value.spec.deletionPriority 
                                }

                                if o.opsState != _|_ {
                                    opsState: o.opsState
                                }
                                if o.opsState == _|_ && read.value.spec.opsState != _|_ {
                                    opsState: read.value.spec.opsState 
                                }

                                if o.updatePriority != _|_ {
                                    updatePriority: o.updatePriority
                                }
                                if o.updatePriority == _|_ && read.value.spec.updatePriority != _|_ {
                                    updatePriority: read.value.spec.updatePriority 
                                }
                            }
                        }
                    }
                } @step(2)
            }
        }
    }

	parameter: [string]: #Operation
}

This atomic action can be used as follows.

apiVersion: core.oam.dev/v1alpha1
kind: WorkflowRun
metadata:
  name: edit-gs
  namespace: game
spec:
  workflowSpec:
    steps:
      - type: operate-gs
        name: operate-gs
        properties:
          partition-1-scenes-0:
            opsState: WaitToBeDeleted 
          partition-2-battle-1:
            opsState: WaitToBeDeleted
          partition-2-scenes-0:
            deletionPriority: 20

This WorkflowRun is a one-time execution. It sets opsState to WaitToBeDeleted for the first replica of the Scenes GameServerSet in partition 1. In partition 2, it sets opsState to WaitToBeDeleted for the second Battle replica and sets deletionPriority to 20 for the first Scenes replica.
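
Because each run is independent, reverting a change would be another one-off WorkflowRun. The sketch below assumes the operate-gs definition above is installed; the name reset-gs and the updatePriority value are made up for illustration.

```yaml
# Illustrative follow-up run; reset-gs is a hypothetical name.
apiVersion: core.oam.dev/v1alpha1
kind: WorkflowRun
metadata:
  name: reset-gs
  namespace: game
spec:
  workflowSpec:
    steps:
      - type: operate-gs
        name: operate-gs
        properties:
          partition-1-scenes-0:
            opsState: None          # clear the WaitToBeDeleted mark
          partition-2-battle-1:
            opsState: None
            updatePriority: 10      # example value, not from the original
```

Fields that are not listed for a GameServer keep the values read from the cluster, since the template only overrides the fields supplied in the step properties.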

Extra Resource Relationship

KubeVela provides a resource topology feature that displays the internal architecture of an application. To visualize the relationships among GameServerSets, StatefulSets, Pods, and GameServers, we can apply the following configuration to the KubeVela system. The full architecture of the KubeVela application can then be visualized.

apiVersion: v1
kind: ConfigMap
metadata:
  name: game-server-set-relation
  namespace: vela-system
  labels:
    "rules.oam.dev/resource-format": "yaml"
    "rules.oam.dev/resources": "true"
data:
  rules: |-
    - parentResourceType:
        group: game.kruise.io
        kind: GameServerSet
      childrenResourceType:
        - apiVersion: apps.kruise.io/v1beta1
          kind: StatefulSet
        - apiVersion: game.kruise.io/v1alpha1
          kind: GameServer
    - parentResourceType:
        group: apps.kruise.io
        kind: StatefulSet
      childrenResourceType:
        - apiVersion: v1
          kind: Pod