rancher / rancher-docs

Rancher Documentation

Home Page: https://ranchermanager.docs.rancher.com/


AWS cloud provider pages need clarification

dkeightley opened this issue

Summary

  • The mention of providerID and the correct hostname can be confusing. Context and readability could be improved by explaining what the providerID is and why it matters, which scenarios require setting it, and how to check whether you need it (see the check sketched below).
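For example, a quick way to see whether nodes are missing a providerID (a minimal sketch, assuming kubectl access to the downstream cluster):

```
# Show each node's registered name next to its providerID.
# An empty PROVIDER-ID column usually means the cloud provider never
# initialized that node (commonly a hostname mismatch on AWS).
kubectl get nodes -o custom-columns='NAME:.metadata.name,PROVIDER-ID:.spec.providerID'
```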
  • The Helm values for the controller-manager should tolerate any taint, so that no node condition prevents the cloud-controller pod from starting during initial provisioning (and blocking provisioning progress). At a minimum the etcd role taint should be covered, as many clusters combine the etcd and control-plane roles, e.g.:
```yaml
tolerations:
  - effect: NoExecute
    operator: "Exists"
  - effect: NoSchedule
    operator: "Exists"
```
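To see which taints the values actually need to cover, one option (again a rough sketch, assuming kubectl access) is to list the taints currently applied to the nodes:

```
# List every node together with its taints; the chart values must
# tolerate each taint present on the nodes the pod should run on.
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'
```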
  • Is the machineLabelSelector actually needed to differentiate flags between node roles? Setting them all once under machineGlobalConfig, or without a selector, might reduce the number of steps and the chance of mistakes (see the sketch below).
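For illustration, a minimal sketch of that single-block alternative in the provisioning cluster spec (names are illustrative, and whether all of these keys are honoured globally is exactly the open question, so treat this as untested):

```yaml
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: v2-cluster
  namespace: fleet-default
spec:
  rkeConfig:
    # Hypothetical: the same flags applied once for all machine pools,
    # instead of per-role machineSelectorConfig entries.
    machineGlobalConfig:
      cloud-provider-name: aws
      disable-cloud-controller: true
      kubelet-arg:
        - cloud-provider=external
      kube-apiserver-arg:
        - cloud-provider=external
      kube-controller-manager-arg:
        - cloud-provider=external
```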
  • Perhaps an update to the Terraform docs would be helpful as well, e.g.:
resource "rancher2_cluster_v2" "v2-cluster" {

     [...]

  rke_config {
    additional_manifest = <<-EOF
      apiVersion: helm.cattle.io/v1
      kind: HelmChart
      metadata:
        name: aws-cloud-controller-manager
        namespace: kube-system
      spec:
        chart: aws-cloud-controller-manager
        repo: https://kubernetes.github.io/cloud-provider-aws
        # version: 0.0.7
        targetNamespace: kube-system
        bootstrap: true
        valuesContent: |-
          tolerations:
            - effect: NoExecute
              operator: "Exists"
            - effect: NoSchedule
              operator: "Exists"
          nodeSelector:
            node-role.kubernetes.io/control-plane: "true"
          hostNetworking: true
          args:
            - --configure-cloud-routes=false
            - --v=2
            - --cloud-provider=aws
    EOF
    machine_selector_config {
      config = {
        cloud-provider-name = "aws"
        disable-cloud-controller = true
        kubelet-arg = "cloud-provider=external"
        kube-apiserver-arg = "cloud-provider=external"
        kube-controller-manager-arg = "cloud-provider=external"
      }
    }
  • Note that when using this mixture of aws and external (to avoid hostname/providerID issues), the flag ends up being set twice on components. This seems to work because external is the last occurrence of the flag, but should it be addressed to avoid potential issues? E.g.:
```
kube-controller-manager [...] --cloud-provider=aws --cloud-config= [...] --cloud-provider=external --cluster-cidr=10.42.0.0/16
```
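One way to confirm which occurrence wins (a rough check, run directly on a server node) is to inspect the live process arguments; with Go flag parsing a repeated scalar flag takes its last value:

```
# Show the effective kube-controller-manager command line; the last
# --cloud-provider occurrence is the one that takes effect.
ps -ef | grep -v grep | grep kube-controller-manager
```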

I agree with @dkeightley. I banged my head a lot trying to understand why aws-cloud-controller-manager was not running on my control-plane node (which also runs etcd), and the issue was that I was missing a toleration for the etcd role. The documentation should clarify that.
I also hope that the out-of-tree controller installation will be simplified and made the default, since it is the default controller as of Kubernetes 1.27.