managermodule fails to start when target-runtime-symphony-agent fails to access SYMPHONY_URL
BaoxiJia opened this issue
I created a P4E deployment on a VM that is on a different network from the P4E API. The managermodule edge module fails to start because the P4E agent running on the VM cannot reach the P4E API in K8s.
Expected result: the managermodule edge module logs the error and keeps running instead of exiting.
My VM IP address is 192.168.66.102, and it cannot access 192.168.0.4.
root@apd-00155d5a0d06 [ ~ ]# iotedge logs instance-d96af6dc-219b-44c5-939f-17c21b8ece44-managermodule
["skill-39657d9c-a1ae-46e9-b8f0-64aad53c13e1 as skill-1d63"]
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/httpcore/_exceptions.py", line 8, in map_exceptions
yield
File "/usr/local/lib/python3.10/site-packages/httpcore/backends/sync.py", line 26, in read
return self._sock.recv(max_bytes)
TimeoutError: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
yield
File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 218, in handle_request
resp = self._pool.handle_request(req)
File "/usr/local/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 253, in handle_request
raise exc
File "/usr/local/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 237, in handle_request
response = connection.handle_request(request)
File "/usr/local/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 90, in handle_request
return self._connection.handle_request(request)
File "/usr/local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 105, in handle_request
raise exc
File "/usr/local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 84, in handle_request
) = self._receive_response_headers(**kwargs)
File "/usr/local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 148, in _receive_response_headers
event = self._receive_event(timeout=timeout)
File "/usr/local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 177, in _receive_event
data = self._network_stream.read(
File "/usr/local/lib/python3.10/site-packages/httpcore/backends/sync.py", line 24, in read
with map_exceptions(exc_map):
File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.10/site-packages/httpcore/_exceptions.py", line 12, in map_exceptions
raise to_exc(exc)
httpcore.ReadTimeout: timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/app/main.py", line 181, in <module>
instance = client.get_instance(instance_name)
File "/app/common/symphony_agent_client.py", line 60, in get_instance
return self._get('Instance', name)
File "/app/common/symphony_agent_client.py", line 45, in _get
r = httpx.get(self.url, params=params)
File "/usr/local/lib/python3.10/site-packages/httpx/_api.py", line 189, in get
return request(
File "/usr/local/lib/python3.10/site-packages/httpx/_api.py", line 100, in request
return client.request(
File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 815, in request
return self.send(request, auth=auth, follow_redirects=follow_redirects)
File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 902, in send
response = self._send_handling_auth(
File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 930, in _send_handling_auth
response = self._send_handling_redirects(
File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 967, in _send_handling_redirects
response = self._send_single_request(request)
File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1003, in _send_single_request
response = transport.handle_request(request)
File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 217, in handle_request
with map_httpcore_exceptions():
File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ReadTimeout: timed out
["skill-39657d9c-a1ae-46e9-b8f0-64aad53c13e1 as skill-1d63"]
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/httpcore/_exceptions.py", line 8, in map_exceptions
yield
File "/usr/local/lib/python3.10/site-packages/httpcore/backends/sync.py", line 26, in read
return self._sock.recv(max_bytes)
TimeoutError: timed out
Managermodule should not fail to start when our agent loses communication with the control plane. Instead, it should emit a proper log message saying that our agent can't connect to the control plane. @Haishi2016 Our agent should also emit logs when it can't connect to the control plane.
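For illustration, here is a minimal sketch of the behaviour described above: log the connection failure and keep retrying instead of letting the exception end the process. It is not the actual fix. The helper name `call_until_available`, its parameters, and the logger name are hypothetical; only the idea (catch the timeout, log it, try again) comes from this thread.

```python
import logging
import time
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")
log = logging.getLogger("managermodule")  # hypothetical logger name


def call_until_available(
    fetch: Callable[[], T],
    retriable: Tuple[Type[BaseException], ...] = (OSError,),
    delay_seconds: float = 5.0,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call fetch() until it succeeds, logging every retriable failure.

    Returns the result of fetch(), or None once max_attempts is used up.
    With max_attempts=None the loop retries indefinitely.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        try:
            return fetch()
        except retriable as exc:
            # Report the failure instead of crashing the module.
            log.error(
                "attempt %d: cannot reach control plane: %r", attempt, exc
            )
            sleep(delay_seconds)
    return None
```

In `/app/main.py` the failing call could then be wrapped as, for example, `call_until_available(lambda: client.get_instance(instance_name), retriable=(httpx.TransportError,))`. The `httpx.ReadTimeout` in the traceback derives from `httpx.TransportError` but not from `OSError`, so the exception type has to be passed explicitly. Whether the module should retry forever or give up after a bounded number of attempts is a design choice this sketch leaves open.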
This is resolved.