The Cluster Control Plane Machine Set Operator (CPMSO) is responsible for the lifecycle of Control Plane Machines within an OpenShift High Availability cluster, sometimes know as an IPI or UPI cluster.
The operator is configured using a ControlPlaneMachineSet custom resource which is a singleton within the cluster. It defines the desired number of Control Plane Machines and their specification. From there, the CPMSO can maintain the state of the Control Plane and, when required, perform updates by replacing the Machines within the Control Plane in a safe and controlled manner, respecting the principles of immutable infrastructure.
The ControlPlaneMachineSet (enhancement) is intended to enable vertical scaling of Control Plane Machines through a safe and controlled rolling update strategy. Currently, horizontal scaling of the Control Plane is not supported and any updates to the replica count of the Control Plane will be rejected.
The ControlPlaneMachineSet compares the existing specification of each Control Plane Machine with that of the desired
specification defined with the ControlPlaneMachineSet spec.template
. Should any Machine not meet the desired
specification, this Machine is considered to need an update and the CPMSO will then attempt to update it by replacing it
based on the update strategy.
There are two update strategies supported by the ControlPlaneMachineSet, these are RollingUpdate
and OnDelete
.
The RollingUpdate
strategy is an automatic rolling update similar to that of a Deployment within Kubernetes.
This option is suitable for most deployments of the ControlPlaneMachineSet.
The alternative, OnDelete
is a semi-automated replacement which requires Admin intervention to trigger the replacement
of each Machine. When a Control Plane Machine is deleted by the Admin, the ControlPlaneMachineSet will then replace it
and handle the migration of the workloads from the deleted Machine onto the new Machine.
The OnDelete
strategy can be used to have more control over the rollout of updates should there be concerns about the
changes from the Admin making the changes.
A key part of updating the Control Plane Machines is ensuring that the etcd cluster quorum is protected. The ControlPlaneMachineSet is not responsible for the quorum of etcd and the etcd operator manages this using its own protection mechanism. The protection mechanism ensures that Machine API cannot remove a Machine from the cluster until the etcd member on it has been migrated onto a new Machine.
- Go language 1.18+
- GNU make
Standard development tasks can be performed using the Makefile in the root of the repository.
Tooling is vendored and executed using go run
so no additional tooling should be needed for these Make targets.
make build
: Build the operator binary intobin/manager
make test
: Run the project tests. Tests are written using ginkgo and the ginkgo tooling is used to execute the tests. In CI, this target outputs JUnit and code coverage reports. Depends ongenerate
,fmt
andvet
tasks.make unit
: Run only the project tests, without thegenerate
,fmt
, orvet
tasks. TheGINKGO_EXTRA_ARGS
environment variable can be used to pass more options to the ginkgo runner.make lint
: Runsgolangci-lint
based on the project linter configuration. It is recommend to run this target before committing any code changes.make vendor
: Update the vendor directory when there are changes to thego.mod
file. Runs tidy, vendor and verify.
The project follows a conventional Go project layout with cmd and pkg folders.
The business logic for the project is divided into 4 key areas:
- pkg/controllers: The main controllers of the CPMSO, handling the core functionality of gathering status information and handling update decisions.
- pkg/machineproviders: Machine abstractions responsible for handling all specifics for the Machine backend. Responsible for gathering data about and creating and deleting Machines.
- pkg/test: A series of test utilities used within the test suites in the project.
- pkg/webhooks: The validation webhook implemented here validates the ControlPlaneMachineSet resource on create and update operations with a cluster.
More detail about these areas can be found within the code comments.
The CPMSO is deployed as a part of the core OpenShift payload from OpenShift 4.12 onwards.
It is deployed based on the manifests
defined in this repository.
Installation instructions for control plane machine set can be found in the installation docs.
Please review the dedicated contributing guide for code conventions and other pointers on contributing to this project.