-
Notifications
You must be signed in to change notification settings - Fork 463
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
iotedge check failed to report connectivity errors (was "iotedge 1.1.4 reinstall on Raspberry Pi OS fails to start") #5396
Comments
Thanks, @kimhanse for your report. |
Connection string: All 9 devices are connected to the same IoT Hub. I am also running 3 other similar devices connected to a different IoT Hub, they are not affected by this problem. |
I have run the first startup of the security manager on both the device in the field where we have the problem and a test setup we have in the office. I am using the same connection string for both systems, and I have verified that all installed packages etc. are the same on the two devices. I have enabled debug logging in order to get some more information but I can not see why the systems are acting differently. I am not sending the entire log, but all the earlier parts that I have removed are practically the same for the two runs. tracey-1 failing correct start
tracey-2 starting correctly
|
Hi @pmzara, did you see my response above? |
@kimhanse |
@pmzara I think you are right in suspecting the DNS: DNS lookup from tracey-1 (the failing field device):
DNS lookup from tracey-2 (working device in the office):
And when I look at the other devices the working one all resolve the address to 20.50.65.131 while the broken ones resolve to 13.79.172.43. The internet connection in the field is using satellites, we are not using VPN or proxy. There is a firewall but it should be open enough for Iot Hub to work. |
An other point that suggests that the problem is the DNS is that if I insert the following list in
We are using 8.8.8.8 for DNS and are getting the following result
|
@kimhanse - Thanks for the update, do you need any more assistance with this issue? |
@veyalla No, I don't need more help with my local problem. But the error reporting here is non-existent and that caused me to spend days debugging a pretty simple problem. It should be possible for the security manager to realize that the connection is not working and report that in the log with a clear error message. |
This seems like a networking issue, DNS is resolving to stale IP addresses on some systems. I'm not sure how we can detect this in the application layer. I presume 'iotedge check' was showing connectivity check errors on the broken systems? |
@veyalla 'iotedge check' did not show any connectivity errors, see the details at the top of the story. It only returned 3 errors and they were all related to the fact that edgeAgent and edgeHub were not running. It should be easy to detect that you don't get in touch with IoT Hub in Azure. |
This issue is being marked as stale because it has been open for 30 days with no activity. |
Renaming this issue to "iotedge check failed to report connectivity errors" |
We have 9 devices running with iotedge on vessels around the world, 6 of them failed within minutes of each other on July 29. They just stopped reporting to Azure and started giving different errors in the logs. We have not been able for figure out what happened.
Upgrading some packages and restarting the devices a few times have brought 3 of them back online and working again. We still have 3 devices that are stuck and then are broken in the same way. We have tried to get back to a clean slate and thus uninstalled and reinstall iotedge, even with that we keep getting the same error. The security daemon starts, but does not try to pull the docker images for the agent or hub.
Expected Behavior
We except the security daemon to pull and start the iotedge agent and hub. Or to at least report why that doesn't happen.
Current Behavior
When the fresh install of iotedge is starting up it fails to start the service modules, instead it complaints about failed connections to them:
Steps to Reproduce
Provide a detailed set of steps to reproduce the bug.
Context (Environment)
Output of
iotedge check
Click here
Device Information
Runtime Versions
Logs
iotedged and system logs
Logs from `journalctl` ``` Aug 20 11:47:21 tracey-1 systemd[1]: Reloading. Aug 20 11:47:21 tracey-1 systemd[1]: Starting Azure IoT Edge daemon management socket. Aug 20 11:47:21 tracey-1 systemd[1]: Starting Azure IoT Edge daemon workload socket. Aug 20 11:47:21 tracey-1 systemd[1]: Listening on Azure IoT Edge daemon management socket. Aug 20 11:47:21 tracey-1 systemd[1]: Listening on Azure IoT Edge daemon workload socket. Aug 20 11:47:21 tracey-1 systemd[1]: Started Azure IoT Edge daemon. Aug 20 11:47:21 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] - Starting Azure IoT Edge Security Daemon Aug 20 11:47:21 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] - Version - 1.1.4 Aug 20 11:47:21 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] - Using config file: /etc/iotedge/config.yaml Aug 20 11:47:21 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] - Configuring /var/lib/iotedge as the home directory. Aug 20 11:47:21 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] - Configuring certificates... Aug 20 11:47:21 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] - Transparent gateway certificates not found, operating in quick start mode... Aug 20 11:47:21 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] - Finished configuring provisioning environment variables and certificates. Aug 20 11:47:21 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] - Initializing hsm... Aug 20 11:47:22 tracey-1 systemd[1]: Reloading. Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:44Z [INFO] - Finished initializing hsm. Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:44Z [INFO] - Provisioning edge device... Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:44Z [INFO] - Starting provisioning edge device via manual mode using a device connection string... Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:44Z [ERR!] - The daemon could not start up successfully: Could not load settings Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:44Z [ERR!] - caused by: Edge device information is required. Aug 20 11:47:44 tracey-1 iotedged[5499]: Please update config.yaml with the IoT Hub connection information. Aug 20 11:47:44 tracey-1 iotedged[5499]: See https://aka.ms/iot-edge-configure-linux for more details. Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] (/project/edgelet/hsm-sys/azure-iot-hsm-c/src/hsm_log.c:log_init:41) Initialized logging Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] (/project/edgelet/hsm-sys/azure-iot-hsm-c/src/edge_hsm_client_store.c:generate_edge_hsm_certificates_if_needed:1629) Load status 1. Regenerating owner and device CA certs and keys Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:21Z [INFO] (/project/edgelet/hsm-sys/azure-iot-hsm-c/src/edge_pki_openssl.c:generate_rsa_key:346) Generating RSA key of length 4096 Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:33Z [INFO] (/project/edgelet/hsm-sys/azure-iot-hsm-c/src/edge_pki_openssl.c:generate_rsa_key:346) Generating RSA key of length 4096 Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:44Z [ERR!] (/project/edgelet/hsm-sys/azure-iot-hsm-c/src/hsm_utils.c:read_file_into_buffer_impl:167) Could not open file for reading /var/lib/iotedge/hsm/enc_keys/edgelet-masterWt5mT2xpO72EPKlt2Tt0Sq4uJCrMvfl2rzzKRB3pnyo_.enc.key. Errno 2 'No such file or directory' Aug 20 11:47:44 tracey-1 iotedged[5499]: 2021-08-20T11:47:44Z [INFO] (/project/edgelet/hsm-sys/azure-iot-hsm-c/src/edge_hsm_client_store.c:load_encryption_key_from_file:818) Could not read encryption key from file. No key file exists or is invalid or permission error. Aug 20 11:47:44 tracey-1 systemd[1]: iotedge.service: Main process exited, code=exited, status=153/n/a Aug 20 11:47:44 tracey-1 systemd[1]: iotedge.service: Failed with result 'exit-code'. Aug 20 11:47:51 tracey-1 systemd[1]: iotedge.mgmt.socket: Succeeded. Aug 20 11:47:51 tracey-1 systemd[1]: Closed Azure IoT Edge daemon management socket. Aug 20 11:47:51 tracey-1 systemd[1]: Stopping Azure IoT Edge daemon management socket. Aug 20 11:47:51 tracey-1 systemd[1]: Starting Azure IoT Edge daemon management socket. Aug 20 11:47:51 tracey-1 systemd[1]: iotedge.socket: Succeeded. Aug 20 11:47:51 tracey-1 systemd[1]: Closed Azure IoT Edge daemon workload socket. Aug 20 11:47:51 tracey-1 systemd[1]: Stopping Azure IoT Edge daemon workload socket. Aug 20 11:47:51 tracey-1 systemd[1]: Starting Azure IoT Edge daemon workload socket. Aug 20 11:47:51 tracey-1 systemd[1]: Listening on Azure IoT Edge daemon management socket. Aug 20 11:47:51 tracey-1 systemd[1]: Listening on Azure IoT Edge daemon workload socket. Aug 20 11:47:51 tracey-1 systemd[1]: Started Azure IoT Edge daemon. Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Starting Azure IoT Edge Security Daemon Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Version - 1.1.4 Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Using config file: /etc/iotedge/config.yaml Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Configuring /var/lib/iotedge as the home directory. Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Configuring certificates... Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Configuring the Device CA certificate using "/etc/iotedge/azure-iot.root.ca.cert.pem". Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Configuring the Device private key using "/etc/iotedge/azure-iot.root.ca.key.pem". Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Configuring the trusted CA certificates using "/etc/iotedge/azure-iot.root.ca.cert.pem". Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Finished configuring provisioning environment variables and certificates. Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Initializing hsm... Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Finished initializing hsm. Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Provisioning edge device... Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Starting provisioning edge device via manual mode using a device connection string... Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Manually provisioning device "tracey-1" in hub "CSP-Autolog-IoT.azure-devices.net" Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Finished provisioning edge device. Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Initializing the module runtime... Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Initializing module runtime... Aug 20 11:47:51 tracey-1 iotedged[5587]: 2021-08-20T11:47:51Z [INFO] - Using runtime network id azure-iot-edge Aug 20 11:47:51 tracey-1 systemd-udevd[153]: /lib/udev/rules.d/80-moby-engine.rules:1: Invalid key/value pair, starting at character 1 ('c') Aug 20 11:47:51 tracey-1 systemd-udevd[5608]: Using default interface naming scheme 'v240'. Aug 20 11:47:52 tracey-1 iotedged[5587]: 2021-08-20T11:47:52Z [INFO] - Successfully initialized module runtime Aug 20 11:47:52 tracey-1 iotedged[5587]: 2021-08-20T11:47:52Z [INFO] - Finished initializing the module runtime. Aug 20 11:47:52 tracey-1 iotedged[5587]: 2021-08-20T11:47:52Z [INFO] - Stopping all modules... Aug 20 11:47:52 tracey-1 iotedged[5587]: 2021-08-20T11:47:52Z [INFO] - Finished stopping modules. Aug 20 11:47:52 tracey-1 iotedged[5587]: 2021-08-20T11:47:52Z [INFO] - Detecting if configuration file has changed... Aug 20 11:47:52 tracey-1 iotedged[5587]: 2021-08-20T11:47:52Z [INFO] - Change to configuration file detected. Aug 20 11:47:52 tracey-1 iotedged[5587]: 2021-08-20T11:47:52Z [INFO] - Removing all modules... Aug 20 11:47:52 tracey-1 iotedged[5587]: 2021-08-20T11:47:52Z [INFO] - Finished removing modules. Aug 20 11:47:52 tracey-1 iotedged[5587]: 2021-08-20T11:47:52Z [INFO] - Edge issuer CA expiration date: 2034-07-23T21:00:33Z Aug 20 11:48:03 tracey-1 iotedged[5587]: 2021-08-20T11:48:03Z [INFO] - Starting management API... Aug 20 11:48:03 tracey-1 iotedged[5587]: 2021-08-20T11:48:03Z [INFO] - Starting workload API... Aug 20 11:48:03 tracey-1 iotedged[5587]: 2021-08-20T11:48:03Z [INFO] - Starting watchdog with 60 second frequency... Aug 20 11:48:03 tracey-1 iotedged[5587]: 2021-08-20T11:48:03Z [INFO] - Listening on fd://iotedge.mgmt.socket/ with 1 thread for management API. Aug 20 11:48:03 tracey-1 iotedged[5587]: 2021-08-20T11:48:03Z [INFO] - Listening on fd://iotedge.socket/ with 1 thread for workload API. Aug 20 11:48:03 tracey-1 iotedged[5587]: 2021-08-20T11:48:03Z [INFO] - Checking edge runtime status Aug 20 11:48:03 tracey-1 iotedged[5587]: 2021-08-20T11:48:03Z [INFO] - Creating and starting edge runtime module edgeAgent Aug 20 11:48:06 tracey-1 iotedged[5587]: 2021-08-20T11:48:06Z [WARN] - Error in watchdog when checking for edge runtime status: Aug 20 11:48:06 tracey-1 iotedged[5587]: 2021-08-20T11:48:06Z [WARN] - A module runtime error occurred. Aug 20 11:48:06 tracey-1 iotedged[5587]: 2021-08-20T11:48:06Z [WARN] - caused by: Could not get identity $edgeAgent Aug 20 11:48:06 tracey-1 iotedged[5587]: 2021-08-20T11:48:06Z [WARN] - caused by: Could not get module $edgeAgent Aug 20 11:48:06 tracey-1 iotedged[5587]: 2021-08-20T11:48:06Z [WARN] - caused by: HTTP request failed: [401 Unauthorized] {"Message":"ErrorCode:IotHubNotFound;Unauthorized","ExceptionMessage":"Tracking ID:ab11464ec5db43d69b8a38bdd0e1777e-G:3-TimeStamp:08/20/2021 11:48:06"} Aug 20 11:49:03 tracey-1 iotedged[5587]: 2021-08-20T11:49:03Z [INFO] - Checking edge runtime status Aug 20 11:49:03 tracey-1 iotedged[5587]: 2021-08-20T11:49:03Z [INFO] - Creating and starting edge runtime module edgeAgent Aug 20 11:49:04 tracey-1 iotedged[5587]: 2021-08-20T11:49:04Z [WARN] - Error in watchdog when checking for edge runtime status: Aug 20 11:49:04 tracey-1 iotedged[5587]: 2021-08-20T11:49:04Z [WARN] - A module runtime error occurred. Aug 20 11:49:04 tracey-1 iotedged[5587]: 2021-08-20T11:49:04Z [WARN] - caused by: Could not get identity $edgeAgent Aug 20 11:49:04 tracey-1 iotedged[5587]: 2021-08-20T11:49:04Z [WARN] - caused by: Could not get module $edgeAgent Aug 20 11:49:04 tracey-1 iotedged[5587]: 2021-08-20T11:49:04Z [WARN] - caused by: HTTP request failed: [401 Unauthorized] {"Message":"ErrorCode:IotHubNotFound;Unauthorized","ExceptionMessage":"Tracking ID:ede6e6809e354ab09d9a901ec5b0a684-G:3-TimeStamp:08/20/2021 11:49:04"} Aug 20 11:50:03 tracey-1 iotedged[5587]: 2021-08-20T11:50:03Z [INFO] - Checking edge runtime status Aug 20 11:50:03 tracey-1 iotedged[5587]: 2021-08-20T11:50:03Z [INFO] - Creating and starting edge runtime module edgeAgent Aug 20 11:50:04 tracey-1 iotedged[5587]: 2021-08-20T11:50:04Z [WARN] - Error in watchdog when checking for edge runtime status: Aug 20 11:50:04 tracey-1 iotedged[5587]: 2021-08-20T11:50:04Z [WARN] - A module runtime error occurred. Aug 20 11:50:04 tracey-1 iotedged[5587]: 2021-08-20T11:50:04Z [WARN] - caused by: Could not get identity $edgeAgent Aug 20 11:50:04 tracey-1 iotedged[5587]: 2021-08-20T11:50:04Z [WARN] - caused by: Could not get module $edgeAgent Aug 20 11:50:04 tracey-1 iotedged[5587]: 2021-08-20T11:50:04Z [WARN] - caused by: HTTP request failed: [401 Unauthorized] {"Message":"ErrorCode:IotHubNotFound;Unauthorized","ExceptionMessage":"Tracking ID:fa2aa3c55b92462eaaff7d3fb44bca99-G:3-TimeStamp:08/20/2021 11:50:03"} Aug 20 11:51:03 tracey-1 iotedged[5587]: 2021-08-20T11:51:03Z [INFO] - Checking edge runtime status Aug 20 11:51:03 tracey-1 iotedged[5587]: 2021-08-20T11:51:03Z [INFO] - Creating and starting edge runtime module edgeAgent Aug 20 11:51:04 tracey-1 iotedged[5587]: 2021-08-20T11:51:04Z [WARN] - Error in watchdog when checking for edge runtime status: Aug 20 11:51:04 tracey-1 iotedged[5587]: 2021-08-20T11:51:04Z [WARN] - A module runtime error occurred. Aug 20 11:51:04 tracey-1 iotedged[5587]: 2021-08-20T11:51:04Z [WARN] - caused by: Could not get identity $edgeAgent Aug 20 11:51:04 tracey-1 iotedged[5587]: 2021-08-20T11:51:04Z [WARN] - caused by: Could not get module $edgeAgent Aug 20 11:51:04 tracey-1 iotedged[5587]: 2021-08-20T11:51:04Z [WARN] - caused by: HTTP request failed: [401 Unauthorized] {"Message":"ErrorCode:IotHubNotFound;Unauthorized","ExceptionMessage":"Tracking ID:f342d45d04c44bc5ab2221a6b91db618-G:3-TimeStamp:08/20/2021 11:51:04"} Aug 20 11:52:03 tracey-1 iotedged[5587]: 2021-08-20T11:52:03Z [INFO] - Checking edge runtime status Aug 20 11:52:03 tracey-1 iotedged[5587]: 2021-08-20T11:52:03Z [INFO] - Creating and starting edge runtime module edgeAgent Aug 20 11:52:04 tracey-1 iotedged[5587]: 2021-08-20T11:52:04Z [WARN] - Error in watchdog when checking for edge runtime status: Aug 20 11:52:04 tracey-1 iotedged[5587]: 2021-08-20T11:52:04Z [WARN] - A module runtime error occurred. Aug 20 11:52:04 tracey-1 iotedged[5587]: 2021-08-20T11:52:04Z [WARN] - caused by: Could not get identity $edgeAgent Aug 20 11:52:04 tracey-1 iotedged[5587]: 2021-08-20T11:52:04Z [WARN] - caused by: Could not get module $edgeAgent Aug 20 11:52:04 tracey-1 iotedged[5587]: 2021-08-20T11:52:04Z [WARN] - caused by: HTTP request failed: [401 Unauthorized] {"Message":"ErrorCode:IotHubNotFound;Unauthorized","ExceptionMessage":"Tracking ID:2011dd76518f4b5c8d62a0ad78154c59-G:3-TimeStamp:08/20/2021 11:52:04"} ```iotedge list
``` root@tracey-1:~# iotedge list NAME STATUS DESCRIPTION CONFIG root@tracey-1:~# ```No edge-agent logs or edge-hub logs as the modules aren't even loaded.
Additional Information
Please provide any additional information that may be helpful in understanding the issue.
The text was updated successfully, but these errors were encountered: