Terraform Creation failing following https://learnk8s.io/terraform-eks: Error: Kubernetes cluster unreachable: Get
Hi there,
Thank you for the awesome tutorial at https://learnk8s.io/terraform-eks#you-can-provision-an-eks-cluster-with-terraform-too... Very useful, as I was looking for an example to get different clusters per environment... I just need 2... I really appreciate your work!!!
I just got an error creating the cluster using step 6. I had updated a couple of properties, shown below, but here's the error...
Error
I'm getting the following error:
module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_AmazonEKS_CNI_Policy[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-2020101804515980130000000a]
module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_AmazonEC2ContainerRegistryReadOnly[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-20201018045159789200000008]
module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_AmazonEKSWorkerNodePolicy[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-2020101804515988710000000b]
module.prd_cluster.module.eks.aws_iam_role_policy_attachment.workers_additional_policies[0]: Refreshing state... [id=eks-prd-super-cash-example-com20201018045153077400000007-20201018045159794400000009]
Error: Kubernetes cluster unreachable: Get https://44C5045D2C00520DBF55914A260A17C8.gr7.sa-east-1.eks.amazonaws.com/version?timeout=32s: dial tcp: lookup 44C5045D2C00520DBF55914A260A17C8.gr7.sa-east-1.eks.amazonaws.com on 192.168.1.1:53: read udp 192.168.1.35:54700->192.168.1.1:53: i/o timeout
At this point, I know I can ping amazonaws.com... But maybe we are missing a security group? The cluster got created...
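If I understand the tutorial's setup correctly, the kubernetes and helm providers talk to the EKS API endpoint directly while Terraform refreshes state, so the machine running terraform has to be able to resolve and reach that endpoint. The wiring is roughly this (the data source and module names are my assumption, not copied from the article):

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

# The provider resolves and calls the cluster endpoint during plan/refresh,
# which is where the DNS lookup above times out
provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}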
Environment
$ terraform version
Terraform v0.13.4
+ provider registry.terraform.io/hashicorp/aws v3.11.0
+ provider registry.terraform.io/hashicorp/helm v1.3.1
+ provider registry.terraform.io/hashicorp/kubernetes v1.13.2
+ provider registry.terraform.io/hashicorp/local v2.0.0
+ provider registry.terraform.io/hashicorp/null v3.0.0
+ provider registry.terraform.io/hashicorp/random v3.0.0
+ provider registry.terraform.io/hashicorp/template v2.2.0
Setup
- The UI lists the clusters
- I can also list them from the command line
$ aws eks list-clusters
{
"clusters": [
"eks-prd-super-cash-example-com",
"eks-ppd-super-cash-example-com"
]
}
Missing step to install the authenticator
ATTENTION: The article doesn't mention installing the aws-iam-authenticator (a possible workaround is sketched at the end of this section)
- All the kubeconfig files were created with the authenticator dependency
$ kubectl get pods --all-namespaces
Unable to connect to the server: getting credentials: exec: exec: "aws-iam-authenticator": executable file not found in $PATH
$ brew install aws-iam-authenticator
- Just got the list of files
$ ls -la kubeconfig_eks-p*
-rw-r--r-- 1 marcellodesales staff 2056 Oct 18 01:52 kubeconfig_eks-ppd-super-cash-example-com
-rw-r--r-- 1 marcellodesales staff 2056 Oct 18 01:51 kubeconfig_eks-prd-super-cash-example-com
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default ingress-aws-alb-ingress-controller-6ccd59df99-8lsvh 0/1 Pending 0 29m
kube-system coredns-59dcf49c5-5wkkf 0/1 Pending 0 32m
kube-system coredns-59dcf49c5-hbqtl 0/1 Pending 0 32m
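In case it's useful for the article: if I read the terraform-aws-eks module docs correctly (this is an assumption on my part, the article doesn't cover it), the generated kubeconfig can be told to call the AWS CLI instead of the separate aws-iam-authenticator binary:

module "eks" {
  source = "terraform-aws-modules/eks/aws"
  # ... existing cluster config ...

  # Assumed module inputs: make the generated kubeconfig shell out to
  # "aws eks get-token" so aws-iam-authenticator doesn't need to be installed
  kubeconfig_aws_authenticator_command      = "aws"
  kubeconfig_aws_authenticator_command_args = ["eks", "get-token", "--cluster-name", var.cluster_name]
}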
Other changes made to the original
- Changed Kubernetes version from 1.17 to 1.18
- Changed the subnets to use odd third octets for the private subnets and even for the public ones... Not sure if that would affect the access...
private_subnets = ["172.16.1.0/24", "172.16.3.0/24", "172.16.5.0/24"]
public_subnets = ["172.16.2.0/24", "172.16.4.0/24", "172.16.6.0/24"]
API server SSL certs might be wrong
- I'm not sure if the problem is related to the certs... Even though the error says unreachable, curl reports that it can't verify the server certificate against my local CA bundle...
$ curl -v https://DCF5F17BFF0ACDC562845DA97F3B171F.sk1.sa-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system/configmaps
* Trying 54.207.147.62...
* TCP_NODELAY set
* Connected to DCF5F17BFF0ACDC562845DA97F3B171F.sk1.sa-east-1.eks.amazonaws.com (54.207.147.62) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/cert.pem
CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS alert, unknown CA (560):
* SSL certificate problem: unable to get local issuer certificate
* Closing connection 0
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
Thank you
Marcello
t2.micro problems - 0/1 nodes are available: 1 Too many pods.
- Creating a dev cluster I got 0/1 nodes are available: 1 Too many pods... even though there's an autoscaling group for the cluster... Not sure of the reason... I changed to t2.medium and that resolved it (a sketch of the change is after the pod description below)...
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
ingress-aws-alb-ingress-controller-66f95d8d-v9n6m 0/1 Pending 0 114s
(prompt info: kubectl@1.18.6, kustomize@v3.8.1, terraform@v0.13.4; kubectl context eks_eks-ppd-super-cash-example-com; working dir ~/dev/github.com/k-mitevski/terraform-k8s/06_terraform_envs_customised/environments/ppd)
$ kubectl describe pod ingress-aws-alb-ingress-controller-66f95d8d-v9n6m
Name: ingress-aws-alb-ingress-controller-66f95d8d-v9n6m
Namespace: default
Priority: 0
Node: <none>
Labels: app.kubernetes.io/instance=ingress
app.kubernetes.io/name=aws-alb-ingress-controller
pod-template-hash=66f95d8d
Annotations: kubernetes.io/psp: eks.privileged
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/ingress-aws-alb-ingress-controller-66f95d8d
Containers:
aws-alb-ingress-controller:
Image: docker.io/amazon/aws-alb-ingress-controller:v1.1.8
Port: 10254/TCP
Host Port: 0/TCP
Args:
--cluster-name=eks-ppd-super-cash-example-com
--ingress-class=alb
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from ingress-aws-alb-ingress-controller-token-bgv6p (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
ingress-aws-alb-ingress-controller-token-bgv6p:
Type: Secret (a volume populated by a Secret)
SecretName: ingress-aws-alb-ingress-controller-token-bgv6p
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 35s (x5 over 2m6s) default-scheduler 0/1 nodes are available: 1 Too many pods.
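For reference, the only change that made the pods schedulable again was the node instance type. A minimal sketch against the eks module (the node group attribute names are my assumption from the module docs, the rest of the config is omitted):

module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "eks-${local.env_domain}"
  # ...

  node_groups = {
    first = {
      desired_capacity = 1
      max_capacity     = 3
      min_capacity     = 1
      # t2.micro only fits 4 pods per node, which the EKS system pods already use up
      instance_type    = "t2.medium"
    }
  }
}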
Missing the install of the cluster autoscaler
- Missing the autoscaler, so the dev environment does not work since new hosts can't be created
- EKS itself already requires 4 pods
- t2.micro only allows 4 pods (https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt)
- Requires 2 pieces:
- IAM roles
- Helm chart for the autoscaler (rough sketch below)
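For the Helm piece, something like the sketch below might be what's needed (the chart repository and values are assumptions taken from the cluster-autoscaler chart docs, and the IAM role/policy part is not shown):

resource "helm_release" "cluster_autoscaler" {
  name       = "cluster-autoscaler"
  repository = "https://kubernetes.github.io/autoscaler"
  chart      = "cluster-autoscaler"
  namespace  = "kube-system"

  # let the autoscaler discover the node ASGs by the cluster tag
  set {
    name  = "autoDiscovery.clusterName"
    value = "eks-${local.env_domain}"
  }

  set {
    name  = "awsRegion"
    value = "sa-east-1"
  }
}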
Error when Re-running
- Just got this error after re-running:
Error: error creating EKS Node Group (eks-ppd-super-cash-example-com:eks-ppd-super-cash-example-com-first-grand-primate): InvalidParameterException: Subnets are not tagged with the required tag. Please tag all subnets with Key: kubernetes.io/cluster/eks-ppd-super-cash-example-com Value: shared
{
RespMetadata: {
StatusCode: 400,
RequestID: "249ff5ae-e506-40aa-a56f-ecc3441e856e"
},
ClusterName: "eks-ppd-super-cash-example-com",
Message_: "Subnets are not tagged with the required tag. Please tag all subnets with Key: kubernetes.io/cluster/eks-ppd-super-cash-example-com Value: shared",
NodegroupName: "eks-ppd-super-cash-example-com-first-grand-primate"
}
- I noticed that the cluster name used in the subnet tags was not prefixed with eks-, which is how the actual cluster name starts...
FROM
public_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/elb" = "1"
}
private_subnet_tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = "1"
}
TO
public_subnet_tags = {
"kubernetes.io/cluster/eks-${local.env_domain}" = "shared"
"kubernetes.io/role/elb" = "1"
}
private_subnet_tags = {
"kubernetes.io/cluster/eks-${local.env_domain}" = "shared"
"kubernetes.io/role/internal-elb" = "1"
}
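To keep the tag key and the real cluster name from drifting apart again, a shared local might help (just a sketch, the local name is my own):

locals {
  cluster_name = "eks-${local.env_domain}"
}

# reuse the same value for the cluster and for both tag maps
public_subnet_tags = {
  "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  "kubernetes.io/role/elb"                      = "1"
}

private_subnet_tags = {
  "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  "kubernetes.io/role/internal-elb"             = "1"
}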