terraform-aws-modules / terraform-aws-eks

Terraform module to create Amazon Elastic Kubernetes (EKS) resources 🇺🇦

Home Page: https://registry.terraform.io/modules/terraform-aws-modules/eks/aws


Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

vrathore18 opened this issue

I just updated Terraform from 0.11 to 0.12. Since then I have started getting errors.

Below are my eks module, Kubernetes provider, and Helm provider configurations.

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 12.2"

  cluster_name                                 = var.name
  subnets                                      = module.vpc.private_subnets
  vpc_id                                       = module.vpc.vpc_id
  cluster_version                              = var.cluster_version
  kubeconfig_aws_authenticator_additional_args = ["-r", "arn:aws:iam::${var.target_account_id}:role/terraform"]

  worker_groups = [
    {
      instance_type        = var.eks_instance_type
      asg_desired_capacity = var.eks_asg_desired_capacity
      asg_max_size         = var.eks_asg_max_size
      key_name             = var.key_name
      autoscaling_enabled  = true
      subnets              = element(module.vpc.private_subnets, 0)
      tags = [
        {
          "key"                 = "k8s.io/cluster-autoscaler/enabled"
          "propagate_at_launch" = "false"
          "value"               = "true"
        },
        {
          "key"                 = "k8s.io/cluster-autoscaler/${var.name}"
          "propagate_at_launch" = "false"
          "value"               = "true"
        },
      ]
    },
  ]

  map_accounts     = [var.target_account_id]
  manage_aws_auth  = true
  enable_irsa      = true
  write_kubeconfig = false

  map_roles = [
    {
      rolearn  = format("arn:aws:iam::%s:role/admin", var.target_account_id)
      username = format("%s-admin", var.name)
      groups   = ["system:masters"]
    }
  ]

}

Kubernetes provider:

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

provider "kubernetes" {
  host                   = element(concat(data.aws_eks_cluster.cluster[*].endpoint, list("")), 0)
  cluster_ca_certificate = base64decode(element(concat(data.aws_eks_cluster.cluster[*].certificate_authority.0.data, list("")), 0))
  token                  = element(concat(data.aws_eks_cluster_auth.cluster[*].token, list("")), 0)
  load_config_file       = false
  version                = "1.13.1"
}

Helm provider:

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
    exec {
      api_version = "client.authentication.k8s.io/v1alpha1"
      args        = ["eks", "get-token", "--cluster-name", var.name]
      command     = "aws"
    }
  }
}

While running terraform plan, I get the errors below. Initially I used to create .kube_config.yaml and pass it into the providers, but now I am not even able to create the .kube_config.yaml:

Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

Error: Get "http://localhost/apis/rbac.authorization.k8s.io/v1/clusterrolebindings/tiller": dial tcp 127.0.0.1:80: connect: connection refused

Error: Get "http://localhost/api/v1/namespaces/kube-system/serviceaccounts/tiller": dial tcp 127.0.0.1:80: connect: connection refused

Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

Up! I'm having the same issue!

same issue

Have you tried setting KUBE_CONFIG_PATH? I was getting the same error, but export KUBE_CONFIG_PATH=/path/to/kubeconfig seems to fix the issue.

I found a "workaround".

I was getting this error when I was running terraform destroy locally (i.e. not using CI). Via CI it works.

In my case, I had to replace:

provider "kubectl" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
    token                  = data.aws_eks_cluster_auth.cluster.token
    load_config_file       = false
  }
}

with:

 provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

 provider "kubectl" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

Then I can successfully run terraform destroy. Once I'm done, I roll back my changes.

This seems to me like a bug, as the host should be taken into account.

Sincerely,
Cpt. Obvious

Also affects me

Same here; it was erroring on terraform destroy, so I executed this to destroy:

1. export KUBE_CONFIG_PATH=~/.kube/config
2. terraform destroy

Datapoint: tf13, chart import, same problem, same fix (KUBE_CONFIG_PATH)

Frustrating, because I also run into connectivity issues that are only solved by configuring the provider without the file, so how do you do both?

Guys, we have the same issue with TF 0.14.7. Has anyone even bothered to take a look at this issue since February?

Same issue

Same problem while running terraform destroy.

So I ran into this issue today, and this article helped explain why we're all facing it. I hope this helps someone.

Thanks @edeediong for that link, as the answer is in the comments:

Fortunately we can avoid deleting the cluster by removing the resources created by the helm or kubernetes provider from the tfstate:
$ terraform state rm [resource_id]

In my case, I had the metrics-server Helm chart installed via Helm provider.

The article referenced by @edeediong was indeed helpful. When I tried using any of the environment variable solutions while applying changes (across both the infra and the workloads), the plans did not look right; they claimed resources had been deleted.

I had separated the infra and the workloads into 2 different modules already, so the solution of using two different apply commands was straightforward.

Does this put an additional burden on the user to plan out applies that cut across infra and workload? Will the user end up in a situation where making a major change/migration go smoothly requires more than just an apply to each of infra and workload?


Same issue here, super annoying on Terraform v1.0.0

Here the workaround export KUBE_CONFIG_PATH=~/.kube/config does help.

Hi

For my Windows environment, it helped when I added the following in PowerShell:

[Environment]::SetEnvironmentVariable("KUBE_CONFIG_PATH", "~/.kube/config")

Then it knew where to find my configuration

Regards

Setting export KUBECONFIG=~/.kube/config for a standard K3s cluster fixed the issue.

The simplest solution is to remove the Kubernetes resources from the state file and then run terraform destroy.

In my case, I performed the terraform destroy and it deleted a few resources, and I lost RBAC access to the EKS cluster, so Terraform could not delete the other Kubernetes resources like the Jenkins Helm chart.

So I removed those resources from the Terraform state file (because anyway, if the cluster is gone I don't care about the installed Helm charts), and then I ran terraform destroy.

Example:
terraform state list
terraform state rm helm_release.jenkins
terraform destroy


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.


This issue has been automatically closed because it has not had recent activity since being marked as stale.

Why was this closed without any resolution?

Why was this closed without any resolution?

There has been no update since 9th August, when a hint was provided. If there is no discussion on an issue, then the stale bot closes it.

If you feel the question is still valid, please open a new issue and describe the full case of how to replicate the problem.

I have the same issue while using Helm with EKS on TF 1.0.10. Setting the env variable KUBE_CONFIG_PATH helped.

I have the same issue with AKS. Can anyone reopen it?

I have the same issue with AKS. Can anyone reopen it?

This module is for AWS.

I understand, but from the comments, it looks like the issue was not solved.

This seems to be a problem with the Kubernetes provider, which this module (and the AKS ones out there) makes use of.

I'd recommend folks looking for fixes/workarounds to check out this issue in the provider repo.


This happened to me for the first time this week after switching to 0.14.7 (working my way up).
It only happened locally, and the fix from @lpossamai worked.

The annoying thing is that my configuration is already split in two, with my EKS cluster provisioned separately from the resources that are provisioned into it. The provider lookup comes from a data block, so I am not sure if I can apply the suggestions from the linked article above. :/

In my case I was adding a dependency from the auth module to the module that creates the resource. For example, I have the gke_auth module and the gke module in Terraform; my gke_auth looks like:

module "gke_auth" {
  source       = "terraform-google-modules/kubernetes-engine/google//modules/auth"
  depends_on   = [module.gke]
  project_id   = var.project_id
  location     = module.gke.location
  cluster_name = module.gke.name
}

So I removed the depends_on and it works.
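For reference, a minimal sketch of the same gke_auth block with the explicit depends_on removed; the implicit dependency through module.gke.location and module.gke.name is enough to order it after the cluster:

# Same module call as above, relying on implicit dependencies instead of depends_on.
module "gke_auth" {
  source       = "terraform-google-modules/kubernetes-engine/google//modules/auth"
  project_id   = var.project_id
  location     = module.gke.location
  cluster_name = module.gke.name
}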

The kubeconfig is only an option if you actually use kubectl to manage anything on the cluster; many of us don't, or are creating one-click setups for this.

Is there a way to solve this issue without setting KUBE_CONFIG_PATH? I want to deploy EKS + LoadBalancerController, but to deploy the LBC I need to set KUBE_CONFIG_PATH. After deploying EKS, I need to create the config file, and only after that can I deploy the LBC. I want to deploy it without manual actions.

I'm upgrading from a really old TF spec, and I'm getting this problem when trying to create the EKS config fresh.
Since the old config had write_kubeconfig = false and the newest module version does not have write_kubeconfig, how do I get around that?


This is still a problem and the issue should be re-opened IMO

The solution is to have 2 separate states, one for the infrastructure and one for the services applied by Helm.
It is a shitty solution, but the best available right now.

This means 2 plans and 2 applies.

You then have to use remote state to fetch anything from the other state, which also sucks, but it means you can at least have things talking to each other.
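A rough sketch of what that remote state lookup can look like, assuming the infrastructure state lives in S3 and exposes the cluster name as an output (the bucket, key, and output name below are hypothetical):

# Hypothetical backend details; point these at wherever the infrastructure state actually lives.
data "terraform_remote_state" "infra" {
  backend = "s3"
  config = {
    bucket = "my-terraform-states"
    key    = "eks/infra.tfstate"
    region = "us-east-1"
  }
}

# Look up the cluster created by the other state via its exported name.
data "aws_eks_cluster" "cluster" {
  name = data.terraform_remote_state.infra.outputs.cluster_name
}

data "aws_eks_cluster_auth" "cluster" {
  name = data.terraform_remote_state.infra.outputs.cluster_name
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}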

@voycey I don't understand what you mean by that. Can you elaborate?

I think the issue here is that KUBE_CONFIG_PATH cannot be used in every situation. If you create a fresh EKS cluster, Terraform runs fine without any KUBE_CONFIG_PATH. But then if you run a change, plan suddenly starts to fail because it cannot read back some IRSA resource(s); to do that it needs a correct config. In principle the workaround is simple: just set an env variable and run plan & apply again. But this cannot be applied in all cases. What about CD integration?

I cannot create a kubeconfig file before the plan if the cluster does not exist. For example, in a GitHub workflow, in the Terraform plan step I would need to implement branch logic:

  • if the EKS cluster exists (i.e. the change is only an update), download the kubeconfig, set the env var, and run plan.
  • but if the code runs for the first time, I obviously cannot download a kubeconfig and do not need to set the env var. So it makes the flow unnecessarily complex.

I do not understand why the provider cannot resolve this on the fly. If it is already making changes on EKS, why does the subsequent update of the state need an external kubeconfig?

In my case it fails like this in GitHub Actions:

Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp 127.0.0.1:80: connect: connection refused
  with module.eks.kubernetes_config_map_v1_data.aws_auth[0],
  on .terraform/modules/eks/main.tf line 437, in resource "kubernetes_config_map_v1_data" "aws_auth":
 437: resource "kubernetes_config_map_v1_data" "aws_auth" {

Is this error caused by some sort of "expired token"?
I have several providers that relate to the EKS cluster:

const helmProvider = new HelmProvider(this, "helm_provider", <HelmProviderConfig>{
      provider,
      kubernetes: <HelmProviderKubernetes>{
        host: eksConstruct.eksModule.clusterEndpointOutput,
        clusterCaCertificate: Fn.base64decode(eksConstruct.eksModule.clusterCertificateAuthorityDataOutput),
        token: eksClusterAuth.token,
      }
    });

I know that the module's outputs are stored in the state file; do we have a way to refresh them before building the plan?

Update: adding KUBE_CONFIG_PATH changed the errors to
Kubernetes cluster unreachable: the server has asked for the client to provide credentials

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.