Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

edgeHub module won't start after certificate renewal #7397

Open
ghost opened this issue Nov 25, 2024 · 5 comments
Open

edgeHub module won't start after certificate renewal #7397

ghost opened this issue Nov 25, 2024 · 5 comments
Assignees

Comments

@ghost
Copy link

ghost commented Nov 25, 2024

Expected Behavior

edgeHub module restart successfuly after cert renewal

Current Behavior

edgeHub shuts down and stay in this state until manual restart

Steps to Reproduce

Just running iotEdge on a device. No perticular scenario

Context (Environment)

PROD

Output of iotedge check

Click here

<Paste here between the triple backticks>

Device Information

  • ubuntu 18.04.6
  • x86_64

Runtime Versions

  • aziot-edged 1.4.10
  • Edge Agent 1.4
  • Edge Hub 1.4
  • Docker/Moby 20.10.24

Logs

aziot-edged logs
ov 17 09:35:41 016-AIZ aziot-edged[2336]: 2024-11-17T07:35:41Z [INFO] - Watchdog checking Edge runtime status
Nov 17 09:35:41 016-AIZ aziot-edged[2336]: 2024-11-17T07:35:41Z [INFO] - Edge runtime is running
Nov 17 09:36:17 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:17Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/sign?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/json
Nov 17 09:36:17 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:17Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- GET /trust-bundle?api-version=2018-06-28 {"accept": "application/json", "host": "workload.sock:80", "connection": "close"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/encrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/encrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/encrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- GET /trust-bundle?api-version=2018-06-28 {"accept": "application/json", "host": "workload.sock:80", "connection": "close"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/encrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/encrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:25 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:25Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:30 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:30Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:30 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:30Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:30 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:30Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:30 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:30Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:31 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:31Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:31 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:31Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:31 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:31Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/encrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:31 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:31Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:31 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:31Z [INFO] - <-- POST /modules/%24edgeHub/genid/638192622530868516/decrypt?api-version=2020-07-07 {"accept": "application/json", "host": "workload.sock:80", "connection": "close", "content-type": "application/j
Nov 17 09:36:31 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:31Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:36:41 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:41Z [INFO] - Watchdog checking Edge runtime status
Nov 17 09:36:41 016-AIZ aziot-edged[2336]: 2024-11-17T07:36:41Z [INFO] - Edge runtime is running
Nov 17 09:37:37 016-AIZ aziot-edged[2336]: 2024-11-17T07:37:37Z [INFO] - <-- GET /systeminfo/resources?api-version=2022-08-03 {"accept": "application/json", "host": "mgmt.sock:80", "connection": "close"}
Nov 17 09:37:37 016-AIZ aziot-edged[2336]: 2024-11-17T07:37:37Z [INFO] - Querying system resources...
Nov 17 09:37:41 016-AIZ aziot-edged[2336]: 2024-11-17T07:37:41Z [INFO] - Watchdog checking Edge runtime status
Nov 17 09:37:41 016-AIZ aziot-edged[2336]: 2024-11-17T07:37:41Z [INFO] - Edge runtime is running
Nov 17 09:37:51 016-AIZ aziot-edged[2336]: 2024-11-17T07:37:51Z [INFO] - --> 200 {"content-type": "application/json"}
Nov 17 09:38:41 016-AIZ aziot-edged[2336]: 2024-11-17T07:38:41Z [INFO] - Watchdog checking Edge runtime status
Nov 17 09:38:41 016-AIZ aziot-edged[2336]: 2024-11-17T07:38:41Z [INFO] - Edge runtime is running

edge-agent logs

<4> 2024-11-17 07:34:02.814 +00:00 [WRN] - Agent reconcile concluded with errors.
System.TimeoutException: Operation timed out
   at Microsoft.Azure.Devices.Edge.Util.TaskEx.<>c.<.cctor>b__27_0() in /mnt/vss/_work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/TaskEx.cs:line 114
   at Microsoft.Azure.Devices.Edge.Util.TaskEx.TimeoutAfter(Task task, TimeSpan timeout, Action action) in /mnt/vss/_work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/TaskEx.cs:line 160
   at Microsoft.Azure.Devices.Edge.Agent.Service.Program.MainAsync(IConfiguration configuration) in /mnt/vss/_work/1/s/edge-agent/src/Microsoft.Azure.Devices.Edge.Agent.Service/Program.cs:line 271
<6> 2024-11-17 07:38:48.976 +00:00 [INF] - Starting periodic operation refresh twin config...
<6> 2024-11-17 07:38:49.126 +00:00 [INF] - Obtained Edge agent twin from IoTHub with desired properties version 60 and reported properties version 334.
<6> 2024-11-17 07:38:49.134 +00:00 [INF] - Successfully completed periodic operation refresh twin config

edge-hub logs

 2024-11-17 07:36:25.013 +00:00 [INF] - Restarting process to perform server certificate renewal.
<6> 2024-11-17 07:36:25.021 +00:00 [INF] - Stopping the protocol heads...
<6> 2024-11-17 07:36:25.029 +00:00 [INF] - Closing protocol heads - (MQTT, AMQP, HTTP)
<6> 2024-11-17 07:36:25.037 +00:00 [INF] - Stopping MQTT protocol head
<6> 2024-11-17 07:36:25.062 +00:00 [INF] - Closing link Events for ------
<6> 2024-11-17 07:36:25.062 +00:00 [INF] - Closing link TwinSending for ------
<6> 2024-11-17 07:36:25.063 +00:00 [INF] - Closing link TwinReceiving for ------
<6> 2024-11-17 07:36:25.064 +00:00 [INF] - Closing link Events for ------
<6> 2024-11-17 07:36:25.065 +00:00 [INF] - Closing link MethodSending for ------
<6> 2024-11-17 07:36:25.065 +00:00 [INF] - Closing link MethodReceiving for ------
<6> 2024-11-17 07:36:25.068 +00:00 [INF] - Closing HTTP head
<6> 2024-11-17 07:36:25.069 +00:00 [INF] - Setting proxy inactive for ------
<6> 2024-11-17 07:36:25.069 +00:00 [INF] - Removing device connection for device ------
<6> 2024-11-17 07:36:25.066 +00:00 [INF] - Closing link TwinSending for ------
<6> 2024-11-17 07:36:25.072 +00:00 [INF] - Closing link TwinReceiving for ------
<6> 2024-11-17 07:36:25.073 +00:00 [INF] - Setting proxy inactive for ------
<6> 2024-11-17 07:36:25.073 +00:00 [INF] - Removing device connection for device ------
<6> 2024-11-17 07:36:25.082 +00:00 [INF] - Connection closed CorrelationId dd41fca5-8980-4c99-9ad2-e2db7345dd2a
<6> 2024-11-17 07:36:25.089 +00:00 [INF] - Connection closed CorrelationId a4113baf-8da2-4724-96fd-ed22a358592c
<6> 2024-11-17 07:36:25.105 +00:00 [INF] - "Request finished HTTP/1.1 GET https://016-aiz/$iothub/websocket - - - 101 - - 2018442415.1831ms"
<6> 2024-11-17 07:36:25.106 +00:00 [INF] - "Request finished HTTP/1.1 GET https://016-aiz/$iothub/websocket - - - 101 - - 2591836583.4646ms"
<6> 2024-11-17 07:36:25.126 +00:00 [INF] - Closed HTTP head
<6> 2024-11-17 07:36:25.201 +00:00 [INF] - Closing receiver in cloud proxy 39a84d63-4339-4fb3-9716-f5ee5ea8b817 for ------
<6> 2024-11-17 07:36:25.201 +00:00 [INF] - Closed cloud proxy 39a84d63-4339-4fb3-9716-f5ee5ea8b817 for ------
<6> 2024-11-17 07:36:25.201 +00:00 [INF] - Device connection removed for device ------
<6> 2024-11-17 07:36:25.209 +00:00 [INF] - Remove device connection for device ------
<6> 2024-11-17 07:36:25.211 +00:00 [INF] - Closing receiver in cloud proxy 9b0cec60-74c6-4833-9acb-6030cebc63ac for ------
<6> 2024-11-17 07:36:25.212 +00:00 [INF] - Closed cloud proxy 9b0cec60-74c6-4833-9acb-6030cebc63ac for ------
<6> 2024-11-17 07:36:25.212 +00:00 [INF] - Device connection removed for device ------
<6> 2024-11-17 07:36:25.212 +00:00 [INF] - Remove device connection for device ------
<6> 2024-11-17 07:36:25.720 +00:00 [INF] - Updated reported properties for ------
<6> 2024-11-17 07:36:31.160 +00:00 [INF] - Updated reported properties for ------
<6> 2024-11-17 07:37:10.210 +00:00 [INF] - Stopped MQTT protocol head
<6> 2024-11-17 07:37:10.210 +00:00 [INF] - Protocol heads stopped.
<6> 2024-11-17 07:37:10.260 +00:00 [INF] - Shutdown complete.
<6> 2024-11-17 07:37:10.270 +00:00 [INF] - Termination requested, initiating shutdown.
<6> 2024-11-17 07:37:10.270 +00:00 [INF] - Waiting for cleanup to finish
<6> 2024-11-17 07:37:10.270 +00:00 [INF] - Done with cleanup. Shutting down.
2024-11-18 14:41:05  Starting Edge Hub
2024-11-18 14:41:05  Starting Edge Hub

Additional Information

Please provide any additional information that may be helpful in understanding the issue.

@jlian
Copy link
Member

jlian commented Nov 26, 2024

EdgeHub shouldn't need to be manually restarted for certificate renewal

https://learn.microsoft.com/en-us/azure/iot-edge/how-to-manage-device-certificates?view=iotedge-1.5&tabs=ubuntu#renewal

@vadim-kovalyov (on-call) can you help take a look? Might be related to #7321

@vadim-kovalyov
Copy link
Contributor

I would check two things, that "restartPolicy": "always" for edgeHub container in the deployment manifest, and also that if the device was offline at the time the cert renewed then it indeed might be related to #7321. Although from the log snippet of EdgeAgent this doesn't look like offline case.

Also, just a not that 1.4 LTS version support ended this month, we have a bunch of fixes in 1.5 version, so I would try to upgrade to see if that helps.

@ghost
Copy link
Author

ghost commented Nov 27, 2024

Thank you for your responses. I've checked the restart policy, and it is set to "always" (which I am quite sure is the default).
We should upgrade the version ASAP anyway, so we'll check.
As this is a critical issue, I am afraid that I will have to write a script to restart the module when it goes down until the issue is clarified and fixed.
Any other ideas are more than welcome.

@vadim-kovalyov
Copy link
Contributor

Maybe you can provide more details on the device setup and share your config.toml file (please redact all sensitive info) so we can check that configuration is ok. For example, if using EST, the cert renewal only works with DPS CA-based provisioning and is not supported for any other types of provisioning. It looks like your setup is simpler, but wanted to double check.

@codeputer
Copy link

codeputer commented Dec 2, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants