edgeAgent automatically stopping all modules
JelleCaptic opened this issue · comments
Jelle Van Renterghem commented
Expected Behavior
I would assume modules can stay up without a certain time limit or auto-shutdown mechanism. On top of that, I would expect certain reasoning for automatically stopping modules.
Current Behavior
At a random timestamp the edgeAgent stopped all modules and executed a stop command for all my running modules
Steps to Reproduce
Provide a detailed set of steps to reproduce the bug.
- Up until now I was unable to re-produce the issue. I have other edge devices that have modules running for over 1 month without any issues.
Context (Environment)
Output of iotedge check
All iotedge checks are ok.
Device Information
- Host OS [Ubuntu 20.04.5 LTS]:
- Architecture [aarch64]:
- Container OS [Linux containers]:
Runtime Versions
- aziot-edged [1.4.20]:
- Edge Agent [1.4.29]:
- Edge Hub [1.4.29]:
- Docker/Moby :
Client:
Version: 23.0.7+azure-1
API version: 1.42
Go version: go1.19.12
Git commit: 92955ed461e91154ed722ed831e9f22ea93fd7e9
Built: Wed Aug 23 12:04:52 UTC 2023
OS/Arch: linux/arm64
Context: default
Server:
Engine:
Version: 23.0.7+azure-1
API version: 1.42 (minimum version 1.12)
Go version: go1.19.12
Git commit: 606a7a76949ec08cb26940037b1c71c0024c6b44
Built: Wed Sep 6 06:38:37 2023
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: 1.6.24-2
GitCommit: 61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
runc:
Version: 1.1.9-1
GitCommit: ccaecfcbc907d70a7aa870a6650887b901b25b82
docker-init:
Version: 0.19.0
GitCommit:
Logs
edge-agent logs
{"log":"\u003c6\u003e 2024-03-04 07:09:38.280 +00:00 [INF] - Starting periodic operation refresh twin config...\n","stream":"stdout","time":"2024-03-04T07:09:38.280722368Z"}
{"log":"\u003c6\u003e 2024-03-04 07:09:38.303 +00:00 [INF] - Obtained Edge agent twin from IoTHub with desired properties version 108 and reported properties version 871.\n","stream":"stdout","time":"2024-03-04T07:09:38.303908896Z"}
{"log":"\u003c6\u003e 2024-03-04 07:09:38.314 +00:00 [INF] - Successfully completed periodic operation refresh twin config\n","stream":"stdout","time":"2024-03-04T07:09:38.31436288Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.122 +00:00 [INF] - Termination requested, initiating shutdown.\n","stream":"stdout","time":"2024-03-04T07:13:25.123151872Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.123 +00:00 [INF] - Main thread terminated\n","stream":"stdout","time":"2024-03-04T07:13:25.123461504Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.131 +00:00 [INF] - Initiating shutdown cleanup.\n","stream":"stdout","time":"2024-03-04T07:13:25.141976576Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.133 +00:00 [INF] - Stopping all modules...\n","stream":"stdout","time":"2024-03-04T07:13:25.142001824Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.168 +00:00 [INF] - Waiting for cleanup to finish\n","stream":"stdout","time":"2024-03-04T07:13:25.1692384Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.270 +00:00 [INF] - Plan execution started for deployment -1\n","stream":"stdout","time":"2024-03-04T07:13:25.273090144Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.290 +00:00 [INF] - Executing command: \"Stop module db\"\n","stream":"stdout","time":"2024-03-04T07:13:25.291056896Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.291 +00:00 [INF] - Executing command: \"Stop module edge-data-manager\"\n","stream":"stdout","time":"2024-03-04T07:13:25.291107968Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.291 +00:00 [INF] - Executing command: \"Stop module api\"\n","stream":"stdout","time":"2024-03-04T07:13:25.291170176Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.291 +00:00 [INF] - Executing command: \"Stop module edge-api-frontend\"\n","stream":"stdout","time":"2024-03-04T07:13:25.291252768Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.291 +00:00 [INF] - Executing command: \"Stop module edge-frontend\"\n","stream":"stdout","time":"2024-03-04T07:13:25.291321536Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.302 +00:00 [INF] - Updated reported properties\n","stream":"stdout","time":"2024-03-04T07:13:25.302620832Z"}
{"log":"2024-03-04 07:13:40 Starting Edge Agent\n","stream":"stdout","time":"2024-03-04T07:13:40.783627552Z"}
{"log":"2024-03-04 07:13:40 Changing ownership of storage folder: /tmp/edgeAgent to 13622\n","stream":"stdout","time":"2024-03-04T07:13:40.811136352Z"}
{"log":"2024-03-04 07:13:40 Changing ownership of backup folder: /tmp/edgeAgent_backup to 13622\n","stream":"stdout","time":"2024-03-04T07:13:40.817397472Z"}
{"log":"2024-03-04 07:13:40 Changing ownership of management socket: /var/run/iotedge/mgmt.sock\n","stream":"stdout","time":"2024-03-04T07:13:40.831135328Z"}
{"log":"2024-03-04 07:13:40 Completed necessary setup. Starting Edge Agent.\n","stream":"stdout","time":"2024-03-04T07:13:40.851940512Z"}
{"log":"2024-03-04 07:13:40.896 +00:00 Edge Agent Main()\n","stream":"stdout","time":"2024-03-04T07:13:40.907348672Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:41.206 +00:00 [INF] - Initializing Edge Agent.\n","stream":"stdout","time":"2024-03-04T07:13:41.232074208Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:41.417 +00:00 [INF] - Version - 1.4.29.85541525 (f049017a5072f85aec250e945ebb2ca52e4af3f4)\n","stream":"stdout","time":"2024-03-04T07:13:41.417553152Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:41.418 +00:00 [INF] - \n","stream":"stdout","time":"2024-03-04T07:13:41.418499744Z"}
edge-hub logs
{"log":"\u003c7\u003e 2024-03-04 07:13:23.775 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets] - Connection id \"\"ID\"\" received FIN.\n","stream":"stdout","time":"2024-03-04T07:13:23.775449856Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:23.775 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Connections] - Connection id \"\"ID\"\" disconnecting.\n","stream":"stdout","time":"2024-03-04T07:13:23.775573856Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:23.775 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Connections] - Connection id \"\"ID\"\" stopped.\n","stream":"stdout","time":"2024-03-04T07:13:23.775600288Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:23.775 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets] - Connection id \"\"ID\"\" sending FIN because: \"\"The Socket transport's send loop completed gracefully.\"\"\n","stream":"stdout","time":"2024-03-04T07:13:23.775730112Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.127 +00:00 [INF] [EdgeHub] - Termination requested, initiating shutdown.\n","stream":"stdout","time":"2024-03-04T07:13:25.129527008Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.128 +00:00 [INF] [EdgeHub] - Stopping the protocol heads...\n","stream":"stdout","time":"2024-03-04T07:13:25.12964Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.130 +00:00 [INF] [EdgeHub] - Closing protocol heads - (MQTT, HTTP)\n","stream":"stdout","time":"2024-03-04T07:13:25.132696384Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.136 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.MqttProtocolHead] - Stopping MQTT protocol head\n","stream":"stdout","time":"2024-03-04T07:13:25.137041024Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.141 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Http.HttpProtocolHead] - Closing HTTP head\n","stream":"stdout","time":"2024-03-04T07:13:25.14174832Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:25.143 +00:00 [DBG] [Microsoft.AspNetCore.Hosting.Diagnostics] - Hosting shutdown\n","stream":"stdout","time":"2024-03-04T07:13:25.1432352Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.177 +00:00 [INF] [EdgeHub] - Waiting for cleanup to finish\n","stream":"stdout","time":"2024-03-04T07:13:25.178299328Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:25.256 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets] - Connection id \"\"ID\"\" received FIN.\n","stream":"stdout","time":"2024-03-04T07:13:25.257041696Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:25.259 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets] - Connection id \"\"ID\"\" sending FIN because: \"\"The Socket transport's send loop completed gracefully.\"\"\n","stream":"stdout","time":"2024-03-04T07:13:25.259576512Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:25.259 +00:00 [DBG] [Microsoft.AspNetCore.Server.Kestrel.Connections] - Connection id \"\"ID\"\" disconnecting.\n","stream":"stdout","time":"2024-03-04T07:13:25.260128576Z"}
{"log":"\u003c4\u003e 2024-03-04 07:13:25.260 +00:00 [WRN] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.ServerWebSocketChannel] - Error reading transport. CorrelationId ID\n","stream":"stdout","time":"2024-03-04T07:13:25.264976352Z"}
{"log":"System.Net.WebSockets.WebSocketException (0x80004005): The remote party closed the WebSocket connection without completing the close handshake.\n","stream":"stdout","time":"2024-03-04T07:13:25.265037504Z"}
{"log":" at System.Net.WebSockets.ManagedWebSocket.ThrowIfEOFUnexpected(Boolean throwOnPrematureClosure)\n","stream":"stdout","time":"2024-03-04T07:13:25.265044064Z"}
{"log":" at System.Net.WebSockets.ManagedWebSocket.EnsureBufferContainsAsync(Int32 minimumRequiredBytes, CancellationToken cancellationToken, Boolean throwOnPrematureClosure)\n","stream":"stdout","time":"2024-03-04T07:13:25.265048352Z"}
{"log":" at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)\n","stream":"stdout","time":"2024-03-04T07:13:25.265078624Z"}
{"log":" at System.Net.WebSockets.ManagedWebSocket.ReceiveAsyncPrivate[TResult](Memory`1 payloadBuffer, CancellationToken cancellationToken)\n","stream":"stdout","time":"2024-03-04T07:13:25.265084608Z"}
{"log":" at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource\u003cTResult\u003e.GetResult(Int16 token)\n","stream":"stdout","time":"2024-03-04T07:13:25.265088352Z"}
{"log":" at System.Threading.Tasks.ValueTask`1.ValueTaskSourceAsTask.\u003c\u003ec.\u003c.cctor\u003eb__4_0(Object state)\n","stream":"stdout","time":"2024-03-04T07:13:25.265092256Z"}
{"log":"--- End of stack trace from previous location ---\n","stream":"stdout","time":"2024-03-04T07:13:25.265095712Z"}
{"log":" at Microsoft.Azure.Devices.Edge.Hub.Mqtt.ServerWebSocketChannel.DoReadBytes(IByteBuffer byteBuffer) in /mnt/vss/_work/1/s/edge-hub/core/src/Microsoft.Azure.Devices.Edge.Hub.Mqtt/ServerWebSocketChannel.cs:line 239\n","stream":"stdout","time":"2024-03-04T07:13:25.26509904Z"}
{"log":" at Microsoft.Azure.Devices.Edge.Hub.Mqtt.ServerWebSocketChannel.DoBeginRead() in /mnt/vss/_work/1/s/edge-hub/core/src/Microsoft.Azure.Devices.Edge.Hub.Mqtt/ServerWebSocketChannel.cs:line 133\n","stream":"stdout","time":"2024-03-04T07:13:25.265104608Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.265 +00:00 [INF] [Microsoft.Azure.Devices.Edge.Hub.Mqtt.ServerWebSocketChannel] - Transport aborted. CorrelationId ID\n","stream":"stdout","time":"2024-03-04T07:13:25.26565424Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.266 +00:00 [INF] [Microsoft.AspNetCore.Server.Kestrel.Connections] - Connection id \"\"ID\"\", Request id \"\"ID:00000002\"\": the application aborted the connection.\n","stream":"stdout","time":"2024-03-04T07:13:25.266200416Z"}
{"log":"\u003c7\u003e 2024-03-04 07:13:25.266 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Hub.Http.Middleware.WebSocketHandlingMiddleware] - Request ID:00000002 completed. CorrelationId ID\n","stream":"stdout","time":"2024-03-04T07:13:25.269377952Z"}
{"log":"\u003c6\u003e 2024-03-04 07:13:25.266 +00:00 [INF] [Microsoft.AspNetCore.Hosting.Diagnostics] - \"Request finished HTTP/1.1 GET https://gsd7449186/$iothub/websocket - - - 101 - - 956745.0866ms\"\n","stream":"stdout","time":"2024-03-04T07:13:25.269454272Z"}
Additional Information
All the modules came back up after a couple of minutes automatically.
What I would like to get more info about:
- What triggers the automatic shutdown of all modules because this needs to be tackled.
- Is there a way to prevent this?
davidE commented
@bilalsellak can you take a look here?