Add config category failed connection refused
stephenrichardson opened this issue · comments
I've been using Fledge for the past few weeks and have written my own south Python plugin to get sensor data. This has been working and I can see readings. Today I set up a north plugin for Azure IoT Hub. This also worked and I could see messages arriving in Azure.
At some point (and I'm not sure what caused the issues) Fledge errored and stopped.
I deleted the north and south services and performed a fledge reset.
I am now at a point where I can start fledge and services show as running. I was able to recreate the North plugin to Azure (and I see messages arriving in Azure each time Fledge starts). I was able to recreate the South service for the custom plugin I created, but it would become unresponsive and eventually show as failed.
I've deleted my custom plugin services and the plugin files from the fledge directory. If I add the http_south plugin to the fledge directory, as soon as I go to browse the south plugins to create a service, fledge will error and stop immediately.
This is the error that is appearing most consistently:
May 7 20:44:33 firefly Fledge Storage[28515]: ERROR: Failed to register configuration category: {"error": {"message": "[AttributeError] 'NoneType' object has no attribute 'register'"}}.
These are some errors that are also occurring (I suspect due to the one above):
May 7 21:55:22 firefly Fledge Storage[21315]: FATAL: (1) 1 0x7f9c0566c0 __kernel_rt_sigreturn + 0---------
May 7 21:55:22 firefly Fledge Storage[21315]: FATAL: (0) 0 0x558b3b158c /usr/local/fledge/services/fledge.services.storage(+0x5958c) [0x558b3b158c]---------
May 7 21:55:22 firefly Fledge Storage[21315]: FATAL: Signal 6 (Aborted) trapped:
May 7 21:55:22 firefly Fledge Storage[21315]: ERROR: Add child categories failed Connection refused.
May 7 21:55:22 firefly Fledge Storage[21315]: ERROR: Add config category failed Connection refused.
May 7 21:55:04 firefly Fledge Storage[21315]: ERROR: Add config category failed Connection refused.
Since posting this, I reinstalled Fledge. I can see these errors in the log:
May 8 10:02:05 firefly Fledge Storage[15774]: ERROR: Failed to register configuration category: {"error": {"message": "[AttributeError] 'NoneType' object has no attribute 'register'"}}.
May 8 10:02:05 firefly Fledge Storage[15774]: ERROR: HTTP error while fetching configuration category for Storage: 404: No such Category found for Storage
May 8 10:02:05 firefly Fledge[15671] INFO: service_registry: fledge.services.core.service_registry.service_registry: Registered service instance id=bb4af805-7da5-475c-a6f7-8fee9e9626e8:
I reinstalled my custom plugin and created a south service. The service shows unresponsive and goes into a cycle of trying to start. These are the log messages each time:
May 8 10:19:23 firefly Fledge[15671] INFO: service_registry: fledge.services.core.service_registry.service_registry: Mark as failed service instance id=8e72e7b2-55a4-41d3-a089-1366031bbb17:
May 8 10:18:06 firefly Fledge BluVib203610[18154]: FATAL: (4) 4 0x7f8a339a34 /lib/ld-linux-aarch64.so.1(+0xda34) [0x7f8a339a34]---------
May 8 10:18:06 firefly Fledge BluVib203610[18154]: FATAL: (3) 3 0x7f69cdb72c gotoblas_init + 52---------
May 8 10:18:06 firefly Fledge BluVib203610[18154]: FATAL: (2) 2 0x7f69e58f54 gotoblas_dynamic_init + 500---------
May 8 10:18:06 firefly Fledge BluVib203610[18154]: FATAL: (1) 1 0x7f8a3586c0 __kernel_rt_sigreturn + 0---------
May 8 10:18:06 firefly Fledge BluVib203610[18154]: FATAL: (0) 0 0x5565a16e34 handler(int) + 76---------
May 8 10:18:06 firefly Fledge BluVib203610[18154]: FATAL: Signal 4 (Illegal instruction) trapped:
May 8 10:18:04 firefly Fledge[15671] INFO: service_registry: fledge.services.core.service_registry.service_registry: Registered service instance id=8e72e7b2-55a4-41d3-a089-1366031bbb17:
I deleted the north and south services and performed a fledge reset.
You don't need to delete anything manually if you use fledge reset. It takes care of that automatically and resets the instance to its default configuration. Note, however, that it does not reset the plugins directory!
This is the error that is appearing most consistently:
May 7 20:44:33 firefly Fledge Storage[28515]: ERROR: Failed to register configuration category: {"error": {"message": "[AttributeError] 'NoneType' object has no attribute 'register'"}}.
This is a known issue and the error can be ignored. It is caused by a race condition that needs to be handled more gracefully, but it does not affect any functionality.
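As a rough illustration of treating that message as noise, the Python sketch below filters log lines so that the known-benign registration error is dropped and the remaining ERROR/FATAL lines stand out. The helper name `significant_lines` and the constant `BENIGN_MARKER` are hypothetical and not part of Fledge; the matched text is copied from the log lines in this thread.

```python
# Hypothetical helper, not part of Fledge: hide the known-benign error
# so that the remaining ERROR/FATAL lines are easier to spot.
BENIGN_MARKER = "'NoneType' object has no attribute 'register'"


def significant_lines(log_lines):
    """Return ERROR/FATAL lines that do not contain the benign marker."""
    kept = []
    for line in log_lines:
        if BENIGN_MARKER in line:
            continue  # known race-condition message, reported as safe to ignore
        if "ERROR" in line or "FATAL" in line:
            kept.append(line)
    return kept


# Two lines taken from the logs posted in this thread.
sample = """\
May 7 20:44:33 firefly Fledge Storage[28515]: ERROR: Failed to register configuration category: {"error": {"message": "[AttributeError] 'NoneType' object has no attribute 'register'"}}.
May 7 21:55:22 firefly Fledge Storage[21315]: ERROR: Add config category failed Connection refused.
""".splitlines()

remaining = significant_lines(sample)
```

With the two sample lines, only the "Connection refused" line is kept, which is the kind of message worth investigating further.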
I reinstalled Fledge.
Did you install Fledge with a make-based or a package-based installation, and on which platform and architecture?
The service shows unresponsive and goes into a cycle of trying to start
As per the logs, it seems to be an issue with your plugin. Is it a Python-based or a C-based plugin?
Is it possible to share the support bundle of your instance? See how to get this here
I deleted the north and south services and performed a fledge reset.
I did this because the fledge service wouldn't start up again, and doing this meant it could start.
Did you install Fledge with a make-based or a package-based installation, and on which platform and architecture?
Make based: aarch64 Ubuntu 18.04
As per the logs, it seems to be an issue with your plugin. Is it a Python-based or a C-based plugin?
The plugin I wrote was a Python one. But if I remove it completely and install only the http_south one, Fledge crashes and stops as soon as I go to the "south" page in the Fledge GUI.
Some system logs:
May 9 15:26:27 firefly Fledge Storage[23911]: FATAL: (2) 2 0x7f95c254f8 raise + 176---------
May 9 15:26:27 firefly Fledge Storage[23911]: FATAL: (1) 1 0x7f9638b6c0 __kernel_rt_sigreturn + 0---------
May 9 15:26:27 firefly Fledge Storage[23911]: FATAL: (0) 0 0x55856e30a4 /usr/local/fledge/services/fledge.services.storage(+0x590a4) [0x55856e30a4]---------
May 9 15:26:27 firefly Fledge Storage[23911]: FATAL: Signal 6 (Aborted) trapped:
May 9 15:26:27 firefly Fledge Storage[23911]: ERROR: Add child categories failed Connection refused.
May 9 15:26:27 firefly Fledge Storage[23911]: ERROR: Add config category failed Connection refused.
May 9 15:26:09 firefly Fledge Storage[23911]: ERROR: Add config category failed Connection refused.
May 9 15:25:53 firefly Fledge Storage[23911]: ERROR: Add config category failed Connection refused.
May 9 15:25:39 firefly Fledge Storage[23911]: ERROR: Add config category failed Connection refused.
May 9 15:25:27 firefly Fledge Storage[23911]: ERROR: Add config category failed Connection refused.
May 9 15:25:26 firefly Fledge [24035] INFO: script.fledge: Fledge started.
May 9 15:25:25 firefly Fledge [24389] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 9 15:25:25 firefly Fledge [24381] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 9 15:25:25 firefly Fledge [24369] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 9 15:25:25 firefly Fledge[24130] INFO: service_registry: fledge.services.core.service_registry.service_registry: Registered service instance id=b747e41d-af66-4593-b7de-0cb304cf959a:
May 9 15:25:23 firefly Fledge[24130] INFO: server: fledge.services.core.server: REST API Server started on http://0.0.0.0:8081
May 9 15:25:23 firefly Fledge[24130] INFO: server: fledge.services.core.server: PID [24130] written in [/usr/local/fledge/data/var/run/fledge.core.pid]
May 9 15:25:23 firefly Fledge[24130] WARNING: server: fledge.services.core.server: A Fledge PID file has been found: [/usr/local/fledge/data/var/run/fledge.core.pid] found, ignoring it.
May 9 15:25:22 firefly Fledge[24130] INFO: server: fledge.services.core.server: Announce management API service
May 9 15:25:22 firefly Fledge[24130] INFO: server: fledge.services.core.server: Services monitoring started ...
May 9 15:25:22 firefly Fledge[24130] INFO: server: fledge.services.core.server: Starting scheduler ...
May 9 15:25:20 firefly Fledge Storage[24228]: ERROR: Failed to register configuration category: {"error": {"message": "[AttributeError] 'NoneType' object has no attribute 'register'"}}.
May 9 15:25:18 firefly Fledge Storage[24228]: ERROR: Failed to register configuration category: {"error": {"message": "[AttributeError] 'NoneType' object has no attribute 'register'"}}.
May 9 15:25:18 firefly Fledge[24130] INFO: service_registry: fledge.services.core.service_registry.service_registry: Registered service instance id=c2a44a61-9156-4ad2-81c5-225a0fa213fc:
May 9 15:25:17 firefly Fledge Storage[23911]: ERROR: Add config category failed Connection refused.
May 9 15:25:17 firefly Fledge [24206] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 9 15:25:17 firefly Fledge [24206] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 9 15:25:17 firefly Fledge[24130] INFO: server: fledge.services.core.server: Start storage, from directory /usr/local/fledge/scripts
May 9 15:25:17 firefly Fledge[24130] INFO: server: fledge.services.core.server: Management API started on http://0.0.0.0:40597
May 9 15:25:17 firefly Fledge[24130] INFO: server: fledge.services.core.server: Starting ...
May 9 15:25:14 firefly Fledge Storage[24035] INFO: script.plugin.storage.sqlite: Fledge DB schema is up to date to version [70]
May 9 15:25:14 firefly Fledge Storage[24035] INFO: script.plugin.storage.sqlite: SQLite3 readings database is ready.
May 9 15:25:14 firefly Fledge Storage[24109] INFO: script.plugin.storage.sqlite: SQLite 3 database '/usr/local/fledge/data/readings_1.db' ready.
May 9 15:25:14 firefly Fledge Storage[24035] INFO: script.plugin.storage.sqlite: SQLite3 database is ready.
May 9 15:25:14 firefly Fledge Storage[24106] INFO: script.plugin.storage.sqlite: SQLite 3 database '/usr/local/fledge/data/fledge.db' ready.
May 9 15:25:14 firefly Fledge [24081] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 9 15:25:14 firefly Fledge [24068] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 9 15:25:09 firefly Fledge message repeated 2 times: [ Storage[23911]: ERROR: Add config category failed Connection refused.]
May 9 15:24:59 firefly Fledge Storage[23911]: ERROR: Add config category failed Connection refused.
May 9 15:24:58 firefly Fledge [23718] ERROR: script.fledge: Fledge cannot start.
May 9 15:24:57 firefly Fledge Storage[23911]: ERROR: Add config category failed Connection refused.
May 9 15:24:56 firefly Fledge[23813] INFO: service_registry: fledge.services.core.service_registry.service_registry: Registered service instance id=f259c247-d0f4-43ad-9df8-eb94bf11242b:
May 9 15:24:54 firefly Fledge[23813] INFO: server: fledge.services.core.server: REST API Server started on http://0.0.0.0:8081
May 9 15:24:54 firefly Fledge[23813] INFO: server: fledge.services.core.server: PID [23813] written in [/usr/local/fledge/data/var/run/fledge.core.pid]
May 9 15:24:54 firefly Fledge[23813] WARNING: server: fledge.services.core.server: A Fledge PID file has been found: [/usr/local/fledge/data/var/run/fledge.core.pid] found, ignoring it.
May 9 15:24:54 firefly Fledge[23813] INFO: server: fledge.services.core.server: Announce management API service
May 9 15:24:53 firefly Fledge[23813] INFO: server: fledge.services.core.server: Services monitoring started ...
May 9 15:24:53 firefly Fledge[23813] INFO: server: fledge.services.core.server: Starting scheduler ...
May 9 15:24:51 firefly Fledge Storage[23911]: ERROR: Failed to register configuration category: {"error": {"message": "[AttributeError] 'NoneType' object has no attribute 'register'"}}.
May 9 15:24:49 firefly Fledge Storage[23911]: ERROR: Failed to register configuration category: {"error": {"message": "[AttributeError] 'NoneType' object has no attribute 'register'"}}.
It seems your instance is in a bad state, since "Fledge cannot start" appears in the log. Would you mind running the following commands in order?
a) Kill all the fledge services - $FLEDGE_ROOT/bin/fledge kill
b) Reset fledge - echo "YES" | $FLEDGE_ROOT/bin/fledge reset
c) Start fledge - $FLEDGE_ROOT/bin/fledge start
d) Now add a service with your plugin and see if it works. If not, we need the support bundle of your instance along with the Python plugin code.
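For anyone who prefers to script steps a) to c), here is a minimal Python sketch that runs the same three commands in order. The function names `recovery_steps` and `run_recovery` are hypothetical, and the sketch assumes the `fledge` script lives at `bin/fledge` under the Fledge root, as in the commands above. It does not assume anything about exit codes, so it only reports them.

```python
import os
import subprocess


def recovery_steps(fledge_root):
    """Build the kill, reset, start commands in the order listed above."""
    fledge = os.path.join(fledge_root, "bin", "fledge")
    return [
        ([fledge, "kill"], None),
        ([fledge, "reset"], "YES\n"),  # same as: echo "YES" | fledge reset
        ([fledge, "start"], None),
    ]


def run_recovery(fledge_root):
    """Run each step in order and return the exit code of every command."""
    codes = []
    for argv, stdin_text in recovery_steps(fledge_root):
        result = subprocess.run(argv, input=stdin_text, text=True)
        codes.append(result.returncode)
    return codes


# Example call (not executed here):
# run_recovery(os.environ.get("FLEDGE_ROOT", "/usr/local/fledge"))
```

Building the command list separately from running it makes it easy to print and review the steps before executing anything against a live instance.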
The problem happens when I install the http_south plugin. When I go to south and then click Add, Fledge crashes.
These are the logs:
May 13 20:47:02 firefly Fledge Storage[3497]: ERROR: Failed to register configuration category: {"error": {"message": "[AttributeError] 'NoneType' object has no attribute 'register'"}}.
May 13 20:47:00 firefly Fledge Storage[3497]: ERROR: Failed to register configuration category: {"error": {"message": "[AttributeError] 'NoneType' object has no attribute 'register'"}}.
May 13 20:47:00 firefly Fledge[3399] INFO: service_registry: fledge.services.core.service_registry.service_registry: Registered service instance id=601bd37e-f806-47ad-a779-3fa815d23177:
May 13 20:46:59 firefly Fledge [3475] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 13 20:46:59 firefly Fledge [3475] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 13 20:46:59 firefly Fledge[3399] INFO: server: fledge.services.core.server: Start storage, from directory /usr/local/fledge/scripts
May 13 20:46:59 firefly Fledge[3399] INFO: server: fledge.services.core.server: Management API started on http://0.0.0.0:38992
May 13 20:46:59 firefly Fledge[3399] INFO: server: fledge.services.core.server: Starting ...
May 13 20:46:57 firefly Fledge Storage[3304] INFO: script.plugin.storage.sqlite: Fledge DB schema is up to date to version [70]
May 13 20:46:57 firefly Fledge Storage[3304] INFO: script.plugin.storage.sqlite: SQLite3 readings database is ready.
May 13 20:46:57 firefly Fledge Storage[3378] INFO: script.plugin.storage.sqlite: SQLite 3 database '/usr/local/fledge/data/readings_1.db' ready.
May 13 20:46:57 firefly Fledge Storage[3304] INFO: script.plugin.storage.sqlite: SQLite3 database is ready.
May 13 20:46:56 firefly Fledge Storage[3375] INFO: script.plugin.storage.sqlite: SQLite 3 database '/usr/local/fledge/data/fledge.db' ready.
May 13 20:46:56 firefly Fledge [3350] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 13 20:46:56 firefly Fledge [3337] INFO: scripts.services.storage: Fledge storage microservice found in FLEDGE_ROOT location: /usr/local/fledge
May 13 20:44:43 firefly Fledge [2384] INFO: script.fledge: Fledge started.
I started fledge again, went to support bundle and clicked Request New, but fledge crashed again. So I deleted the http_south plugin and started fledge again. This time it successfully generated the support bundle (attached).
support-240513-20-50-14.tar.gz
The problem happens when I install the http_south plugin. When I go to south and then click Add, Fledge crashes.
It has nothing to do with the http_south plugin.
This time it successfully generated the support bundle (attached).
I see that your instance has multiple storage services running. That is why you see the crash, along with related log lines such as:
May 9 15:24:57 firefly Fledge Storage[23911]: ERROR: Add config category failed Connection refused.
I suspect your earlier fledge stop attempts did not complete. The output below shows that multiple storage services are running:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
firefly 2580 0.3 0.3 332752 13472 ? Ssl 20:44 0:01 /usr/local/fledge/services/fledge.services.storage --address=0.0.0.0 --port=34034
firefly 3497 0.3 0.3 332708 13596 ? Ssl 20:46 0:00 /usr/local/fledge/services/fledge.services.storage --address=0.0.0.0 --port=38992
firefly 4717 16.4 0.9 2700372 38648 pts/2 Sl 20:49 0:03 python3 -m fledge.services.core
firefly 4815 1.9 0.3 332708 13668 ? Ssl 20:49 0:00 /usr/local/fledge/services/fledge.services.storage --address=0.0.0.0 --port=35017
According to the support bundle, only two processes should be running: one is core and the other is storage. After your last run of fledge, only PIDs 4717 and 4815 should exist.
So, to clean up your instance, either kill PIDs 2580 and 3497 manually or use the fledge kill command, which automatically kills all the fledge processes that exist in the environment, as suggested in the previous comment #1353 (comment).
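The check described above can be sketched in Python as follows. This is a heuristic illustration, not a Fledge tool: the function names are hypothetical, the input is assumed to be `ps aux` style text with the eleven columns shown in the output above, and the last storage process listed is assumed to be the current one.

```python
def storage_processes(ps_output):
    """Return (pid, start, command) for each Fledge storage process."""
    found = []
    for line in ps_output.splitlines():
        fields = line.split(None, 10)  # 10 fixed columns, then the command
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header or malformed line
        if "fledge.services.storage" in fields[10]:
            found.append((int(fields[1]), fields[8], fields[10]))
    return found


def stale_storage_pids(ps_output):
    """Heuristic: every storage process except the last one listed is stale."""
    pids = [pid for pid, _start, _cmd in storage_processes(ps_output)]
    return pids[:-1]


# Process list copied from the support bundle output in this thread.
sample_ps = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
firefly 2580 0.3 0.3 332752 13472 ? Ssl 20:44 0:01 /usr/local/fledge/services/fledge.services.storage --address=0.0.0.0 --port=34034
firefly 3497 0.3 0.3 332708 13596 ? Ssl 20:46 0:00 /usr/local/fledge/services/fledge.services.storage --address=0.0.0.0 --port=38992
firefly 4717 16.4 0.9 2700372 38648 pts/2 Sl 20:49 0:03 python3 -m fledge.services.core
firefly 4815 1.9 0.3 332708 13668 ? Ssl 20:49 0:00 /usr/local/fledge/services/fledge.services.storage --address=0.0.0.0 --port=35017
"""

stale = stale_storage_pids(sample_ps)
```

For the sample above, `stale` holds the two leftover storage PIDs named in the comment (2580 and 3497). The sketch only reports them; actually stopping the processes is left to fledge kill or a manual kill.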