k-mitevski / terraform-k8s

Example code for provisioning Kubernetes clusters on EKS using Terraform.


Terraform Creation failing following https://learnk8s.io/terraform-eks: Error: Kubernetes cluster unreachable: Get

marcellodesales opened this issue · comments

Hi there,

Thank you for the awesome tutorial at https://learnk8s.io/terraform-eks#you-can-provision-an-eks-cluster-with-terraform-too... Very useful, as I was looking for an example to provision a different cluster per environment (I only need two). Really appreciate your work!!!

I just got an error creating the cluster in step 6. I had updated a couple of properties (shown below), but here's the error:

Error

I'm getting the following error:

module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_AmazonEKS_CNI_Policy[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-2020101804515980130000000a]
module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_AmazonEC2ContainerRegistryReadOnly[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-20201018045159789200000008]
module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_AmazonEKSWorkerNodePolicy[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-2020101804515988710000000b]
module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_additional_policies[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-20201018045159794400000009]

Error: Kubernetes cluster unreachable: Get https://44C5045D2C00520DBF55914A260A17C8.gr7.sa-east-1.eks.amazonaws.com/version?timeout=32s:
   dial tcp: lookup 44C5045D2C00520DBF55914A260A17C8.gr7.sa-east-1.eks.amazonaws.com on 192.168.1.1:53:
   read udp 192.168.1.35:54700->192.168.1.1:53: i/o timeout

At this point I know I can ping amazonaws.com, so maybe we are missing a security group? The cluster itself did get created...
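
For context, that error comes from the kubernetes/helm provider trying to reach the API server during the apply, not from the cluster creation itself. A minimal sketch of how that provider is typically wired up in this kind of setup (following the terraform-aws-eks data sources; the exact names in this repo may differ):

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

# The provider resolves and calls the EKS endpoint at plan/apply time, so the machine
# running Terraform needs working DNS and HTTPS access to that hostname.
provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}

If the local resolver (192.168.1.1 in the log above) can't resolve that hostname, the provider fails exactly like this even when the cluster itself is healthy, so it may be a local DNS issue rather than a missing security group.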

Environment

$ terraform version
Terraform v0.13.4
+ provider registry.terraform.io/hashicorp/aws v3.11.0
+ provider registry.terraform.io/hashicorp/helm v1.3.1
+ provider registry.terraform.io/hashicorp/kubernetes v1.13.2
+ provider registry.terraform.io/hashicorp/local v2.0.0
+ provider registry.terraform.io/hashicorp/null v3.0.0
+ provider registry.terraform.io/hashicorp/random v3.0.0
+ provider registry.terraform.io/hashicorp/template v2.2.0

Setup

  • The UI lists the clusters

(Screenshot: the AWS console listing both clusters, 2020-10-18 02:13)

  • I can also list them from the CMD
$ aws eks list-clusters
{
    "clusters": [
        "eks-prd-super-cash-example-com",
        "eks-ppd-super-cash-example-com"
    ]
}

Missing step to install the authenticator

ATTENTION: The article doesn't mention installing the aws-iam-authenticator (an alternative that uses the AWS CLI instead is sketched after the pod listing below).

  • All the kubeconfig files were created with the authenticator dependency
$ kubectl get pods --all-namespaces
Unable to connect to the server: getting credentials: exec: exec: "aws-iam-authenticator": executable file not found in $PATH

$ brew install aws-iam-authenticator
  • Just got the list of files
$ ls -la kubeconfig_eks-p*
-rw-r--r--  1 marcellodesales  staff  2056 Oct 18 01:52 kubeconfig_eks-ppd-super-cash-example-com
-rw-r--r--  1 marcellodesales  staff  2056 Oct 18 01:51 kubeconfig_eks-prd-super-cash-example-com

$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                                  READY   STATUS    RESTARTS   AGE
default       ingress-aws-alb-ingress-controller-6ccd59df99-8lsvh   0/1     Pending   0          29m
kube-system   coredns-59dcf49c5-5wkkf                               0/1     Pending   0          32m
kube-system   coredns-59dcf49c5-hbqtl                               0/1     Pending   0          32m
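
Since the generated kubeconfig files are what pull in the aws-iam-authenticator dependency, an alternative to installing the binary is to have the module generate kubeconfigs that call the AWS CLI instead. A hedged sketch, using the terraform-aws-eks authenticator variables as documented for the module versions current at the time (variable names may differ per version):

module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = var.cluster_name
  # ...the rest of the arguments from the article stay the same...

  # Generate kubeconfigs that use `aws eks get-token` instead of aws-iam-authenticator.
  kubeconfig_aws_authenticator_command      = "aws"
  kubeconfig_aws_authenticator_command_args = ["eks", "get-token", "--cluster-name", var.cluster_name]
}

This only changes how kubectl authenticates; installing aws-iam-authenticator as above works just as well.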

Other changes made to the original

  • Changed the Kubernetes version from 1.17 to 1.18 (see the sketch after the subnet ranges below)
  • Changed the subnets so private and public subnets use odd and even third octets respectively... Not sure if that would affect the access...
  private_subnets      = ["172.16.1.0/24", "172.16.3.0/24", "172.16.5.0/24"]
  public_subnets       = ["172.16.2.0/24", "172.16.4.0/24", "172.16.6.0/24"]
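
For the version bump, the change would sit on the EKS module itself; a minimal, hedged sketch with argument names as used by terraform-aws-modules/eks at the time (the repo's actual variables may differ):

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  cluster_name    = var.cluster_name
  cluster_version = "1.18" # bumped from the article's 1.17
  subnets         = module.vpc.private_subnets
  vpc_id          = module.vpc.vpc_id
}

The odd/even third-octet split by itself shouldn't affect reachability, as long as all the subnets fall inside the VPC CIDR and are routed correctly.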

API server SSL certs might be wrong

  • I'm not sure if the problem is related to the certs... Even though the error says unreachable, curl does report a certificate problem when hitting the API endpoint...
$ curl -v  https://DCF5F17BFF0ACDC562845DA97F3B171F.sk1.sa-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system/configmaps
*   Trying 54.207.147.62...
* TCP_NODELAY set
* Connected to DCF5F17BFF0ACDC562845DA97F3B171F.sk1.sa-east-1.eks.amazonaws.com (54.207.147.62) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS alert, unknown CA (560):
* SSL certificate problem: unable to get local issuer certificate
* Closing connection 0
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
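
For what it's worth, that curl result is expected even on a healthy cluster: the API server certificate is signed by the cluster's own CA, which isn't in the system trust store, so curl without the cluster CA always reports "unable to get local issuer certificate". A hedged sketch for checking it out of band, assuming the usual terraform-aws-eks output name:

# Write the cluster CA to a Terraform output so it can be saved to a file and passed
# to `curl --cacert`; "cluster_certificate_authority_data" follows the module's output
# name and is an assumption about this repo's wiring.
output "cluster_ca_certificate" {
  value = base64decode(module.eks.cluster_certificate_authority_data)
}

With that CA written to a file, curl --cacert against the endpoint should pass TLS verification (the request may still return 401/403 without a token), which would suggest the cert itself is fine.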

Thank you
Marcello

t2.micro problems - 0/1 nodes are available: 1 Too many pods.

  • Creating a dev cluster I got 0/1 nodes are available: 1 Too many pods., even though there's an autoscaling group for the cluster... Not sure of the reason, but changing to t2.medium resolved it (see the sketch after the pod description below)...
$ kubectl get pods
NAME                                                READY   STATUS    RESTARTS   AGE
ingress-aws-alb-ingress-controller-66f95d8d-v9n6m   0/1     Pending   0          114s

(Shell prompt showing kubectl@1.18.6, kustomize@v3.8.1, terraform@v0.13.4, kubectl context eks_eks-ppd-super-cash-example-com in the default namespace, working directory ~/dev/github.com/k-mitevski/terraform-k8s/06_terraform_envs_customised/environments/ppd on master at 13:53:13)
$ kubectl describe pod ingress-aws-alb-ingress-controller-66f95d8d-v9n6m
Name:           ingress-aws-alb-ingress-controller-66f95d8d-v9n6m
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app.kubernetes.io/instance=ingress
                app.kubernetes.io/name=aws-alb-ingress-controller
                pod-template-hash=66f95d8d
Annotations:    kubernetes.io/psp: eks.privileged
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/ingress-aws-alb-ingress-controller-66f95d8d
Containers:
  aws-alb-ingress-controller:
    Image:      docker.io/amazon/aws-alb-ingress-controller:v1.1.8
    Port:       10254/TCP
    Host Port:  0/TCP
    Args:
      --cluster-name=eks-ppd-super-cash-example-com
      --ingress-class=alb
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from ingress-aws-alb-ingress-controller-token-bgv6p (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  ingress-aws-alb-ingress-controller-token-bgv6p:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ingress-aws-alb-ingress-controller-token-bgv6p
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  35s (x5 over 2m6s)  default-scheduler  0/1 nodes are available: 1 Too many pods.
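
For context on the scheduling failure: on EKS the per-node pod limit is derived from the instance type's ENI/IP limits, and a t2.micro only allows 4 pods per node, which the system pods already consume. So the fix is the instance type, not the autoscaling group size. A hedged sketch of where that change lands, assuming the node_groups layout from the article (argument names may differ):

module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = var.cluster_name
  # ...other arguments as before...

  node_groups = {
    first = {
      desired_capacity = 1
      min_capacity     = 1
      max_capacity     = 3
      instance_type    = "t2.medium" # t2.micro caps out at 4 pods per node (ENI/IP limit)
    }
  }
}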

Missing the install of the cluster autoscaler
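
A hedged sketch of what that missing install could look like with the helm provider already used in the article; the chart and repository are the upstream cluster-autoscaler chart, while the values (and the IAM permissions the autoscaler needs) are assumptions to be wired up separately:

resource "helm_release" "cluster_autoscaler" {
  name       = "cluster-autoscaler"
  namespace  = "kube-system"
  repository = "https://kubernetes.github.io/autoscaler"
  chart      = "cluster-autoscaler"

  # Let the autoscaler discover the node groups tagged for this cluster.
  set {
    name  = "autoDiscovery.clusterName"
    value = "eks-ppd-super-cash-example-com"
  }

  set {
    name  = "awsRegion"
    value = "sa-east-1"
  }
}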

Error when Re-running

  • Just got this error after re-running:
Error: error creating EKS Node Group (eks-ppd-super-cash-example-com:eks-ppd-super-cash-example-com-first-grand-primate): InvalidParameterException: Subnets are not tagged with the required tag. Please tag all subnets with Key: kubernetes.io/cluster/eks-ppd-super-cash-example-com Value: shared
{
  RespMetadata: {
    StatusCode: 400,
    RequestID: "249ff5ae-e506-40aa-a56f-ecc3441e856e"
  },
  ClusterName: "eks-ppd-super-cash-example-com",
  Message_: "Subnets are not tagged with the required tag. Please tag all subnets with Key: kubernetes.io/cluster/eks-ppd-super-cash-example-com Value: shared",
  NodegroupName: "eks-ppd-super-cash-example-com-first-grand-primate"
}
  • I noticed that the cluster name used in the subnet tags was not prefixed with eks-, unlike the actual cluster name, so I changed the tags as shown below...

FROM

  public_subnet_tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                    = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"           = "1"
  }

TO

  public_subnet_tags = {
    "kubernetes.io/cluster/eks-${local.env_domain}" = "shared"
    "kubernetes.io/role/elb"                        = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/eks-${local.env_domain}" = "shared"
    "kubernetes.io/role/internal-elb"               = "1"
  }
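
One way to avoid this kind of drift is to compute the full cluster name once and pass the same value to both the VPC subnet tags and the EKS module. A hedged sketch based on the snippets above (module sources and omitted arguments are assumptions matching the terraform-aws-modules layout):

locals {
  # Single source of truth so the subnet tags and the cluster name can never disagree;
  # env_domain mirrors the local already used in the tags above.
  cluster_name = "eks-${local.env_domain}"
}

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"
  # ...other VPC arguments as before...

  public_subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                      = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"             = "1"
  }
}

module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = local.cluster_name
  # ...other EKS arguments as before...
}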