sonata-nfv / tng-industrial-pilot

5GTANGO Smart Manufacturing Pilot

Fix crashing CC processor

stefanbschneider opened this issue · comments

When deploying the CC processor container through Kubernetes it repeatedly crashes. We should figure out why and fix it.

$ kubectl get pods
NAME                                           READY   STATUS    RESTARTS   AGE
ns1-cc-broker-deployment-867484cd6b-s8f7k      1/1     Running   0          8m5s
ns1-cc-processor-deployment-5d64659ff8-b7vnt   1/1     Running   3          8m5s
ns1-eae-deployment-787c944c87-8rcp5            1/1     Running   0          8m4s
ns2-dt-deployment-6f8486884-4x7cw              1/1     Running   0          8m1s
ns2-mdc-deployment-655dfddc7b-jxz6w            1/1     Running   0          8m1s

The logs are not very useful:

tango@fgcn-tango-k8s-2:~$ kubectl logs -f ns1-cc-processor-deployment-5d64659ff8-b7vnt
CC-CDU02 (processor): Starting Azure Cloud Connector ...

Any ideas? Maybe start by adjusting the print statements to use print(..., flush=True) to ensure output is flushed and shown in the logs for debugging?
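A minimal sketch of that suggestion: a tiny logging helper that forces a flush so messages show up in `kubectl logs` immediately, even when stdout is block-buffered inside the container (the helper name is illustrative, not from the actual code):

```python
import sys


def log(*args):
    # Without flush=True, Python block-buffers stdout when it is not a TTY
    # (as in a container), so messages can be lost when the process crashes.
    print(*args, flush=True)


log("CC-CDU02 (processor): Starting Azure Cloud Connector ...")
```

Alternatively, setting `PYTHONUNBUFFERED=1` in the Dockerfile (or running Python with `-u`) disables the buffering globally.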

Later this week, I'll try to refactor the CC processor to dynamically configure the MQTT host name using environment variables. Currently, the MQTT host is hard-coded to 127.0.0.1, which will remain the default value.

In k8s, each CDU is deployed as a separate pod, and pods can only reach each other through separate services. So for the CC processor to connect to the CC broker, it needs to know the broker's service name. In the manual deployment, we can fix this service name, but once we deploy the NS through the service platform, each pod, deployment, and service will get a dynamic name of the form vendor-name-version-uuid.

The SP will then need to tell all CDUs what the service UUID is (as an environment variable) such that, e.g., the CC processor knows how to connect to the CC broker.
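A minimal sketch of how such an environment-based lookup could work, falling back to localhost for manual/local deployments (the variable name `MQTT_BROKER_HOST` matches the one mentioned later in this thread; the function name is illustrative):

```python
import os


def get_broker_host(default="127.0.0.1"):
    """Resolve the MQTT broker host from the environment.

    In Kubernetes, the SP would set MQTT_BROKER_HOST to the CC broker's
    dynamically generated service name; the default keeps manual and
    local deployments working unchanged.
    """
    return os.environ.get("MQTT_BROKER_HOST", default)
```

The same pattern extends to the port or any other per-deployment setting.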

When refactoring the CC processor, I'll also have a look and try to see if I can find the reason why it repeatedly crashes.

Mistake on my side: The MQTT host is actually already configurable via an environment variable MQTT_BROKER_HOST, which is set in the Dockerfile. So we just have to rename this later to the vendor.name.version.cdu_id schema (see here), when integrating with the SP.

We still need to debug the crashing CC processor container.

Ok, the CC processor simply crashed because the host of the MQTT broker wasn't set correctly. Depending on what was set as host (localhost or some actual IP), it took a shorter or longer time until a ConnectionRefusedError: [Errno 111] Connection refused occurred.

When passing the correct MQTT host as an env variable, it seems to run stably. We should probably add a try/except at some point and print better error output, but this isn't an issue now.
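For later, a sketch of what that try/except could look like: a small retry wrapper that logs a clear message instead of letting ConnectionRefusedError crash the container. The `connect` callable stands in for the real client's connect method (e.g. paho-mqtt's `client.connect`); all names here are illustrative:

```python
import time


def connect_with_retry(connect, host, port, retries=5, delay=2):
    """Call connect(host, port) until it succeeds.

    Prints a descriptive error per failed attempt instead of crashing,
    and only gives up after `retries` attempts.
    """
    for attempt in range(1, retries + 1):
        try:
            return connect(host, port)
        except ConnectionRefusedError as e:
            print(f"MQTT broker {host}:{port} refused connection "
                  f"(attempt {attempt}/{retries}): {e}", flush=True)
            time.sleep(delay)
    raise RuntimeError(f"Could not reach MQTT broker at {host}:{port}")
```

Combined with Kubernetes' restart policy this would also make the CrashLoopBackOff behavior much easier to diagnose from the logs.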

There's a new issue with the CC processor:

$ kubectl get pods
NAME                                           READY   STATUS             RESTARTS   AGE
ns1-cc-broker-deployment-867484cd6b-8gfjp      1/1     Running            0          2m36s
ns1-cc-processor-deployment-7d9ddc9599-cm5nx   0/1     CrashLoopBackOff   4          2m36s
ns1-eae-deployment-787c944c87-5t5j8            1/1     Running            0          2m36s
ns2-mdc-deployment-5544c6589-278fb             1/1     Running            0          2m32s
$ kubectl logs -f ns1-cc-processor-deployment-7d9ddc9599-cm5nx
CC-CDU02 (processor): Starting Azure Cloud Connector ...
---------------------------------------------
No SAS token available.
IoT Hub name is missing. Check JSON file.

I'm looking at it now