terraform-aws-modules / terraform-aws-eks

Terraform module to create Amazon Elastic Kubernetes (EKS) resources 🇺🇦

Home Page: https://registry.terraform.io/modules/terraform-aws-modules/eks/aws


Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

vrathore18 opened this issue

I just updated Terraform from 0.11 to 0.12. Since then I have started getting errors.

Below are my eks module, Kubernetes provider, and Helm provider configurations.

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 12.2"

  cluster_name                                 = var.name
  subnets                                      = module.vpc.private_subnets
  vpc_id                                       = module.vpc.vpc_id
  cluster_version                              = var.cluster_version
  kubeconfig_aws_authenticator_additional_args = ["-r", "arn:aws:iam::${var.target_account_id}:role/terraform"]

  worker_groups = [
    {
      instance_type        = var.eks_instance_type
      asg_desired_capacity = var.eks_asg_desired_capacity
      asg_max_size         = var.eks_asg_max_size
      key_name             = var.key_name
      autoscaling_enabled  = true
      subnets              = element(module.vpc.private_subnets, 0)
      tags = [
        {
          "key"                 = "k8s.io/cluster-autoscaler/enabled"
          "propagate_at_launch" = "false"
          "value"               = "true"
        },
        {
          "key"                 = "k8s.io/cluster-autoscaler/${var.name}"
          "propagate_at_launch" = "false"
          "value"               = "true"
        },
      ]
    },
  ]

  map_accounts     = [var.target_account_id]
  manage_aws_auth  = true
  enable_irsa      = true
  write_kubeconfig = false

  map_roles = [
    {
      rolearn  = format("arn:aws:iam::%s:role/admin", var.target_account_id)
      username = format("%s-admin", var.name)
      groups   = ["system:masters"]
    }
  ]

}

Kubernetes provider:

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

provider "kubernetes" {
  host                   = element(concat(data.aws_eks_cluster.cluster[*].endpoint, list("")), 0)
  cluster_ca_certificate = base64decode(element(concat(data.aws_eks_cluster.cluster[*].certificate_authority.0.data, list("")), 0))
  token                  = element(concat(data.aws_eks_cluster_auth.cluster[*].token, list("")), 0)
  load_config_file       = false
  version                = "1.13.1"
}

Helm provider:

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
    exec {
      api_version = "client.authentication.k8s.io/v1alpha1"
      args        = ["eks", "get-token", "--cluster-name", var.name]
      command     = "aws"
    }
  }
}

While running terraform plan, I get the errors below. Initially I used to create .kube_config.yaml and pass it into the providers, but now I am not even able to create the .kube_config.yaml:

Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

Error: Get "http://localhost/apis/rbac.authorization.k8s.io/v1/clusterrolebindings/tiller": dial tcp 127.0.0.1:80: connect: connection refused

Error: Get "http://localhost/api/v1/namespaces/kube-system/serviceaccounts/tiller": dial tcp 127.0.0.1:80: connect: connection refused

Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

Up! I'm having the same issue!

same issue

Have you tried setting KUBE_CONFIG_PATH? I was getting the same error, but export KUBE_CONFIG_PATH=/path/to/kubeconfig seems to fix the issue.

I found a "workaround".

I was getting this error when I was running terraform destroy locally (i.e. not using CI). Via CI it works.

In my case, I had to replace:

provider "kubectl" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
    token                  = data.aws_eks_cluster_auth.cluster.token
    load_config_file       = false
  }
}

with:

 provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

 provider "kubectl" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

Then I can successfully run terraform destroy. Once I'm done, I roll back my changes.

This seems to me like a bug, as the host should be taken into account.

Sincerely,
Cpt. Obvious

Also affects me

Same here; it was erroring on terraform destroy, so I executed this to destroy:

1. export KUBE_CONFIG_PATH=~/.kube/config
2. terraform destroy

Datapoint: tf13, chart import, same problem, same fix (KUBE_CONFIG_PATH)

Frustrating, because I also run into connectivity issues that are only solved by configuring the provider without the file, so how do you do both?

Guys, we have the same issue with TF 0.14.7. Has anyone even bothered to take a look at this issue since February?

Same issue

Same problem while running terraform destroy.

So I ran into this issue today, and this article helped explain why we're all facing it. I hope this helps someone.

Thanks @edeediong for that link, as the answer is in the comments:

Fortunately we can avoid deleting the cluster by removing the resources created by the helm or kubernetes provider from the tfstate:
$ terraform state rm [resource_id]

In my case, I had the metrics-server Helm chart installed via Helm provider.

The article referenced by @edeediong was indeed helpful. When I tried using any of the environment variable solutions while applying changes (across both the infra and the workloads), the plans did not look right; they claimed resources had been deleted.

I had separated the infra and the workloads into 2 different modules already, so the solution of using two different apply commands was straightforward.

Does this put an additional burden on the user to plan out applies that cut across infra and workload? Will the user end up in a situation where making a major change/migration go smoothly requires more than just an apply to each of infra and workload?


Same issue here, super annoying on Terraform v1.0.0

Here the workaround export KUBE_CONFIG_PATH=~/.kube/config does help.

Hi

For my Windows environment, it helped when I added the following in PowerShell:

[Environment]::SetEnvironmentVariable("KUBE_CONFIG_PATH", "~/.kube/config")

Then it knew where to find my configuration

Regards

Setting export KUBECONFIG=~/.kube/config for a standard K3s cluster fixed the issue.

The simplest solution is to remove the Kubernetes resources from the state file and then run terraform destroy.

In my case, I performed the terraform destroy and it deleted a few resources, and I lost RBAC access to the EKS cluster, so Terraform could not delete the other Kubernetes resources like the Jenkins Helm chart.

So I removed those resources from the Terraform state file (because anyway, if the cluster is gone I don't care about the installed Helm charts), and then I ran terraform destroy.

Example:
terraform state list
terraform state rm helm_release.jenkins
terraform destroy


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.


This issue has been automatically closed because it has not had recent activity since being marked as stale.

Why was this closed without any resolution?

Why was this closed without any resolution?

There has been no update since 9th August, when a hint was provided. If there is no discussion on an issue, then the stale bot closes it.

If you feel the question is still valid, please open a new issue and describe the full case of how to replicate the problem.

I have the same issue while using Helm with EKS on TF 1.0.10. Setting the env variable KUBE_CONFIG_PATH helped.

I have the same issue with AKS. Can anyone reopen it?

I have the same issue with AKS. Can anyone reopen it?

This module is for AWS.

I understand, but from the comments, it looks like the issue was not solved.

This seems to be a problem with the Kubernetes provider, which this module (and the AKS ones out there) makes use of.

I'd recommend folks looking for fixes/workarounds to check out this issue in the provider repo.


This happened to me for the first time this week after switching to 0.14.7 (working my way up).
It only happened locally, and the fix from @lpossamai worked.

The annoying thing is that my configuration is already split in two, with my EKS cluster provisioned separately from the resources that are provisioned into it. The provider lookup comes from a data block, so I am not sure if I can apply the suggestions from the linked article above. :/

In my case I was adding a dependency from the auth module to the module that creates the resource. For example, I have the gke_auth module and the gke module in Terraform; my gke_auth looks like:

module "gke_auth" {
  source       = "terraform-google-modules/kubernetes-engine/google//modules/auth"
  depends_on   = [module.gke]
  project_id   = var.project_id
  location     = module.gke.location
  cluster_name = module.gke.name
}

So I removed the depends_on and it works.
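For reference, a minimal sketch of the same gke_auth block with the explicit depends_on removed; the implicit dependency through module.gke.location and module.gke.name is enough to order it after the cluster:

# Same module call as above, relying on implicit dependencies instead of depends_on.
module "gke_auth" {
  source       = "terraform-google-modules/kubernetes-engine/google//modules/auth"
  project_id   = var.project_id
  location     = module.gke.location
  cluster_name = module.gke.name
}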

The kubeconfig is only an option if you actually use kubectl to manage anything on the cluster; many of us don't, or are creating one-click setups for this.

Is there a way to solve this issue without setting KUBE_CONFIG_PATH? I want to deploy EKS + LoadBalancerController, but to deploy the LBC I need to set KUBE_CONFIG_PATH. After deploying EKS, I need to create the config file, and only after that can I deploy the LBC. I want to deploy it without manual actions.

I'm upgrading from a really old TF spec, and I'm getting this problem when trying to create the EKS config fresh.
Since the old config had write_kubeconfig = false and the newest module version does not have write_kubeconfig, how do I get around that?


This is still a problem and the issue should be re-opened IMO

The solution is to have 2 separate states, one for the infrastructure and one for the services applied by Helm.
It is a shitty solution, but the best available right now.

This means 2 plans and 2 applies.

You then have to use remote state to fetch anything from the other state, which also sucks, but it means you can at least have things talking to each other.
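A rough sketch of what that remote state lookup can look like, assuming the infrastructure state lives in S3 and exposes the cluster name as an output (the bucket, key, and output name below are hypothetical):

# Hypothetical backend details; point these at wherever the infrastructure state actually lives.
data "terraform_remote_state" "infra" {
  backend = "s3"
  config = {
    bucket = "my-terraform-states"
    key    = "eks/infra.tfstate"
    region = "us-east-1"
  }
}

# Look up the cluster created by the other state via its exported name.
data "aws_eks_cluster" "cluster" {
  name = data.terraform_remote_state.infra.outputs.cluster_name
}

data "aws_eks_cluster_auth" "cluster" {
  name = data.terraform_remote_state.infra.outputs.cluster_name
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}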

@voycey I don't understand what you mean by that. Can you elaborate?

I think the issue here is that KUBE_CONFIG_PATH cannot be used in every situation. If you create a fresh EKS cluster, Terraform runs fine without any KUBE_CONFIG_PATH. But then if you run a change, plan suddenly starts to fail because it cannot read back some IRSA resource(s); to do that it needs a correct config. In principle the workaround is simple: just set an env variable and run plan & apply again. But this cannot be applied in all cases. What about CD integration?

I cannot create a kubeconfig file before the plan if the cluster does not exist. For example, in a GitHub workflow, in the Terraform plan step I would need to implement branch logic:

  • if the EKS cluster exists (i.e. the change is only an update), download the kubeconfig, set the env var, and run plan.
  • but if the code runs for the first time, I obviously cannot download a kubeconfig and do not need to set the env var. So it makes the flow unnecessarily complex.

I do not understand why the provider cannot resolve this on the fly. If it is already making changes on EKS, why does the subsequent update of the state need an external kubeconfig?

In my case it fails like this in GitHub Actions:

Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp 127.0.0.1:80: connect: connection refused
  with module.eks.kubernetes_config_map_v1_data.aws_auth[0],
  on .terraform/modules/eks/main.tf line 437, in resource "kubernetes_config_map_v1_data" "aws_auth":
 437: resource "kubernetes_config_map_v1_data" "aws_auth" {

Is this error caused by some sort of "expired token"?
I have several providers that relate to the EKS cluster:

const helmProvider = new HelmProvider(this, "helm_provider", <HelmProviderConfig>{
      provider,
      kubernetes: <HelmProviderKubernetes>{
        host: eksConstruct.eksModule.clusterEndpointOutput,
        clusterCaCertificate: Fn.base64decode(eksConstruct.eksModule.clusterCertificateAuthorityDataOutput),
        token: eksClusterAuth.token,
      }
    });

I know that the module's outputs are stored in the state file; do we have a way to refresh them before building the plan?

Update: adding KUBE_CONFIG_PATH changed the errors to
Kubernetes cluster unreachable: the server has asked for the client to provide credentials

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.