HCL-TECH-SOFTWARE / connections-automation

Deployment and upgrade automation scripts for HCL Connections 7.0 based on Ansible

helm not installed

MagnaFrisia90 opened this issue · comments

Due to all the problems with python2 on CentOS7, I have moved my Ansible controller to AlmaLinux8, which only has python3.
I have tried to set up "component-pack-infra-only", which runs into the following error:

TASK [component-pack : Check the vars] ***********************************************************************************************************
included: /opt/connections-automation/roles/hcl/component-pack/tasks/check_vars.yml for solcon15.solvito.cloud

TASK [component-pack : Check if Helm is installed] ***********************************************************************************************
fatal: [solcon15.solvito.cloud]: FAILED! => {"changed": true, "cmd": "helm version", "delta": "0:00:00.004520", "end": "2023-02-15 13:10:29.647743", "msg": "non-zero return code", "rc": 127, "start": "2023-02-15 13:10:29.643223", "stderr": "/bin/sh: helm: command not found", "stderr_lines": ["/bin/sh: helm: command not found"], "stdout": "", "stdout_lines": []}
...ignoring

TASK [component-pack : Fail if Helm is not installed] ********************************************************************************************
fatal: [solcon15.solvito.cloud]: FAILED! => {"changed": false, "msg": "Can not find Helm installed for the user running this script, failing here."}

Does Helm have to be installed manually? This is not listed under the requirements.
Also, I did not experience this issue when running from my CentOS7 Ansible controller against the same target host "solcon15.solvito.cloud" - hopefully this is a good sign.

I have now installed Helm manually on "solcon15.solvito.cloud", but the script prints the same error.

To get this running on a system which is only binary compatible with RHEL, you have to search for:

ansible_distribution == "RedHat" and replace it with ansible_family == "RedHat"

I'm preparing a PR for this, but until it is merged, please do the replacement on your own.

ansible_family or ansible_os_family?
I have tried with ansible_family, but the situation is the same.
But does your reply mean Helm is not required to be installed manually?

Sorry, ansible_os_family.
Helm is installed in roles/third_party/helm-install/tasks/install_helm.yml
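
For anyone applying this by hand, a hedged sketch of the search-and-replace described above, run from the repository root (exact spacing and quoting may differ between roles, so check the grep matches before editing):

# Find the distribution checks and switch them to the os-family check repo-wide
grep -rl 'ansible_distribution == "RedHat"' roles/ playbooks/ \
  | xargs sed -i 's/ansible_distribution == "RedHat"/ansible_os_family == "RedHat"/g'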

Is manual installation required? E.g. by running ansible-playbook -i environments/examples/cnx7/quick_start/inventory.ini roles/third_party/helm-install/tasks/install_helm.yml
"ansible_os_family" also prints the Helm-related error.

Helm install is part of setup-kubernetes.yml, which is part of setup-component-pack-infra-only.yml:

- name: Setup Helm
  hosts: k8s_admin
  become: true
  any_errors_fatal: true
  roles:
    - roles/third_party/helm-install
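
The same role can also be run on its own via the standalone playbook referenced later in this thread; a hedged example invocation, reusing the quick_start inventory from above:

ansible-playbook -i environments/examples/cnx7/quick_start/inventory.ini playbooks/third_party/setup-helm.yml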

Attached is a sample output of running setup-helm.yml on AlmaLinux for comparison:
setup-helm.log

[ansible@alma ~]$ helm version
version.BuildInfo{Version:"v3.10.2", GitCommit:"50f003e5ee8704ec937a756c646870227d7c8b58", GitTreeState:"clean", GoVersion:"go1.18.8"}

You could also just update /etc/redhat-release ... it needs to start with RedHat ...

My setup does not get to the task/step "setup helm". Any idea how to fix this?

TASK [setup-master-node : Initialize master for single node installation] ************************************************************************
fatal: [solcon15.solvito.cloud]: FAILED! => {"changed": true, "cmd": ["kubeadm", "init", "--config=/tmp/k8s_ansible/kubeadm-config.yaml"], "delta": "0:00:00.403709", "end": "2023-02-16 12:56:43.538173", "msg": "non-zero return code", "rc": 1, "start": "2023-02-16 12:56:43.134464", "stderr": "W0216 12:56:43.173410 9318 common.go:84] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta2". Please use 'kubeadm config migrate --old-config old.yaml --new-config new.yaml', which will write the new, similar spec using a newer API version.\nerror execution phase preflight: [preflight] Some fatal errors occurred:\n\t[ERROR CRI]: container runtime is not running: output: time="2023-02-16T12:56:43+01:00" level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"\n, error: exit status 1\n[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...\nTo see the stack trace of this error execute with --v=5 or higher", "stderr_lines": ["W0216 12:56:43.173410 9318 common.go:84] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta2". Please use 'kubeadm config migrate --old-config old.yaml --new-config new.yaml', which will write the new, similar spec using a newer API version.", "error execution phase preflight: [preflight] Some fatal errors occurred:", "\t[ERROR CRI]: container runtime is not running: output: time="2023-02-16T12:56:43+01:00" level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"", ", error: exit status 1", "[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...", "To see the stack trace of this error execute with --v=5 or higher"], "stdout": "[init] Using Kubernetes version: v1.25.1\n[preflight] Running pre-flight checks", "stdout_lines": ["[init] Using Kubernetes version: v1.25.1", "[preflight] Running pre-flight checks"]}

It is pretty similar to #254, but I have not changed any default values as @sabrina-yee said;
__kubernetes_version: "{{ kubernetes_version | default('1.25.1') }}" is still at its default value in main.yml

And this is the result when I switch to Kubernetes version 1.21.7:

TASK [setup-master-node : Initialize master for single node installation] ******
fatal: [solcon15.solvito.cloud]: FAILED! => {"changed": true, "cmd": ["kubeadm","init", "--config=/tmp/k8s_ansible/kubeadm-config.yaml"], "delta": "0:00:00.171868", "end": "2023-02-16 13:16:08.159293", "msg": "non-zero return code", "rc": 1, "start": "2023-02-16 13:16:07.987425", "stderr": "\t[WARNING KubernetesVersion]: Kubernetes version is greater than kubeadm version. Please consider to upgrade kubeadm. Kubernetes version: 1.25.1. Kubeadm version: 1.21.x\nerror execution phase preflight: [preflight] Some fatal errors occurred:\n\t[ERROR CRI]: container runtime is not running: output: time="2023-02-16T13:16:08+01:00" al msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"\n, error: exit status 1\n[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...\nTo see the stack trace of this error execute with --v=5 or higher", "stderr_lines": ["\t[WARNING KubernetesVersion]: Kubernetes version is greater than kubeadm version. Please consider to upgrade m. Kubernetes version: 1.25.1. Kubeadm version: 1.21.x", "error execution phase preflight: [preflight] Some fatal errors occurred:", "\t[ERROR CRI]: container runtime is not running: output: time="2023-02-16T13:16:08+01:00" level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"", ", error: exit status 1", "[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...", "To see the stack trace of this error execute with --v=5 or higher"], "stdout": "[init] Using Kubernetes version: v1.25.1\n[preflight] Running pre-flight checks", "stdout_lines": ["[init] Using Kubernetes version: v1.25.1", "[preflight] Running pre-flight checks"]}

The [ERROR CRI]: container runtime is not running message suggests containerd is not running. Could you check by running sudo systemctl status containerd?

Status should be active similar to the following:

containerd.service - containerd container runtime
   Loaded: loaded (/usr/lib/systemd/system/containerd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2023-02-16 16:21:00 UTC; 42min ago

If it's not active, my suggestion would be to debug the service by inspecting sudo journalctl -u containerd -f.
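
Not from the playbooks, but a common cause of the "CRI v1 runtime API is not implemented" message is a packaged /etc/containerd/config.toml that lists "cri" under disabled_plugins; a hedged troubleshooting sketch, assuming that is the case here:

# Check whether the CRI plugin is disabled in the shipped config
sudo grep disabled_plugins /etc/containerd/config.toml
# If it shows ["cri"], regenerate the default config and restart containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd
sudo systemctl status containerd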

Python3 seems to have solved most issues, regardless of Alma8 or CentOS7. Helm is now partly installed.
I am now stuck on CentOS7 (I did not try Alma8 yet) at the following step:

The issue now appears during "ansible-playbook -i environments/examples/cnx7/quick_start/inventory.ini playbooks/setup-component-pack-infra-only.yml":

TASK [helm-install : Check if tiller is already initialized] *********************************************************************************************************
fatal: [solcon15.solvito.cloud]: FAILED! => {"changed": true, "cmd": "kubectl get pods -n kube-system | grep tiller", "delta": "0:00:00.140528", "end": "2023-02-20 19:40:25.507126", "msg": "non-zero return code", "rc": 1, "start": "2023-02-20 19:40:25.366598", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring

TASK [helm-install : Init Helm and create tiller service account] ****************************************************************************************************
skipping: [solcon15.solvito.cloud]

So basically "ansible-playbook -i environments/examples/cnx7/quick_start/inventory.ini playbooks/setup-component-pack-infra-only.yml" now leads to a working Kubernetes envoirnment without a Docker registry.

When I start from scratch with "ansible-playbook -i environments/examples/cnx7/quick_start/inventory.ini playbooks/hcl/harbor/setup-component-pack-only.yml" it immediately prints the missing-Helm error.

Was the problem from this part (#266 (comment)) of the thread ever resolved? Could you check #266 (comment)?

After changing to python3, the issue related to #266 does not appear anymore. There are still some Helm issues left, as described in my prior comment.

So basically "ansible-playbook -i environments/examples/cnx7/quick_start/inventory.ini playbooks/setup-component-pack-infra-only.yml" now leads to a working Kubernetes envoirnment without a Docker registry.

Yes, this is by design; Component Pack doesn't need a Docker registry anymore.

When I start from scratch with "ansible-playbook -i environments/examples/cnx7/quick_start/inventory.ini playbooks/hcl/harbor/setup-component-pack-only.yml" it immediately prints the missing-Helm error.

Could you please confirm that the Helm install part of setup-component-pack-infra-only.yml is comparable to the sample output I posted in #266 (comment)? You can run playbooks/third_party/setup-helm.yml again if the prior output is not available.

Please note that Tiller is for the old Helm v2 and is expected to be skipped; as you may have observed, the same thing happened in my sample output. So the problem appears to be in some other step of the process.

@sabrina-yee Could you please clarify the installation path for the v7 Component Pack vs the v8 Component Pack?
I want to install a Component Pack that works with v7. I think you are now directing me towards a v8 Component Pack deployment, right?
I think if you could just update the documentation regarding Python3 and clearly describe both ways (v7 with a Docker registry vs v8 with Helm pulling images automatically), that would help everybody the most and clarify the situation here.

I think you are now directing me towards a v8 Component Pack deployment, right?

That's correct. As @nitinjagjivan mentioned in #253 (comment), installing using the docker scripts/registry is no longer maintained in this repo, and the latest Component Pack images are stored in the HCL Harbor repository. This Component Pack can be installed on v7 Connections.

I still have trouble accessing the harbor-helm-repo with my HCL ID credentials, but I will report that in another topic.
The technical Helm and Kubernetes setup related to this "case" is solved now. I have used the following steps without any issues, using Python3 on CentOS 7.9:

#inventory.ini: [nfs_servers] should only contain the k8s hosts

ansible-playbook -i environments/examples/cnx8/db2/inventory.ini playbooks/third_party/setup-nginx.yml
ansible-playbook -i environments/examples/cnx8/db2/inventory.ini playbooks/third_party/setup-haproxy.yml
ansible-playbook -i environments/examples/cnx8/db2/inventory.ini playbooks/third_party/setup-nfs.yml
ansible-playbook -i environments/examples/cnx8/db2/inventory.ini playbooks/third_party/setup-containerd.yml
ansible-playbook -i environments/examples/cnx8/db2/inventory.ini playbooks/third_party/kubernetes/setup-kubernetes.yml

#and to enable root to do the kubectl cmds:
cp -R /home/ansible/.kube/ ~/
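
A hedged alternative to copying the directory is to point root's kubectl at the existing kubeconfig instead:

# Alternative: reference the existing kubeconfig without copying it
export KUBECONFIG=/home/ansible/.kube/config
kubectl get nodes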