Device certificate does not renew #5787

Closed
curua2008 opened this issue Nov 2, 2021 · 7 comments

Comments

@curua2008

curua2008 commented Nov 2, 2021

Expected Behavior

When using EST certificate providers like GlobalSign or DigiCert, IoT Edge should renew the Device ID certificate when it expires.

This happens with the device certificate settings in config.toml shown below:

[provisioning.attestation]
method = "x509"
registration_id = "xxxx"
identity_cert = { method = "est", common_name = "xxx", url ="https://xxiden.est.edge.dev.globalsign.com:443/.well-known/est/" }

Current Behavior

iotedge was able to obtain a device certificate from the EST server, as shown below:

image

Device certificate expired after 2 days

image

No certificate renewal happens

image

After the certificate expired, I noticed several errors in the iotedge system logs:

Oct 28 20:05:32 xxx aziot-identityd[15390]: 2021-10-28T20:05:32Z [DBUG] - [aziot_hub_client_async] IoTHub response status 401
Oct 28 20:05:32 xxx aziot-identityd[15390]: 2021-10-28T20:05:32Z [DBUG] - [aziot_hub_client_async] IoTHub response headers{"content-length": "161", "content-type": "application/json; charset=utf-8", "server": "Microsoft-HTTPAPI/2.0", "x-ms-request-id": "4f48c905-0844-4bfd-8f20-cd8fc367c4f3", "iothub-errorcode": "IotHubUnauthorizedAccess", "date": "Thu, 28 Oct 2021 20:05:31 GMT"}
Oct 28 20:05:32 xxx aziot-identityd[15390]: 2021-10-28T20:05:32Z [INFO] - !!! Hub client error
Oct 28 20:05:32 xxx aziot-identityd[15390]: 2021-10-28T20:05:32Z [INFO] - !!! caused by: ErrorCode:IotHubUnauthorizedAccess;Unauthorized

Steps to Reproduce


  1. Set up an EST server that issues device certificates which expire after 2 days
  2. Configure IoT Edge using a config file similar to the one below
[provisioning]
source = "dps"
global_endpoint = "https://global.azure-devices-provisioning.net/"
id_scope = "0ne0xxxxA7F0"

[provisioning.attestation]
method = "x509"
registration_id = "xxx-bootstrap-globalsign-vnm2"

identity_cert = { method = "est", common_name = "xxxxootstrap-globalsign-vnm2", url ="https://xxx.est.edge.dev.globalsign.com:443/.well-known/est/" }

[cert_issuance.est]
trusted_certs = [
        "file:///var/secrets/globalsign_root.pem",
]

[cert_issuance.est.auth]
#username = "xxx"
#password = "xx"

# EST ID cert requested via EST bootstrap ID cert
bootstrap_identity_cert = "file:///var/secrets/xxx_bootstrap_gs.pem"

bootstrap_identity_pk = "pkcs11:token=IoTEdgeCert;object=bootstrap-rsa-pair?pin-value=xxx" # PKCS#11 URI

[cert_issuance.est.urls]
#default = "https://xxxiden.est.edge.dev.globalsign.com:443/.well-known/est/"

[aziot_keys]
pkcs11_lib_path = "/usr/local/lib/libtpm2_pkcs11.so"
#pkcs11_base_slot = "pkcs11:token=IoTEdgeCert?pin-value=xxx"

[edge_ca]
method = "est"
#common_name = "xx test Edge CA"
url = "https://xxxdevca.est.edge.dev.globalsign.com:443/.well-known/est/"
  3. Note that iotedge does not renew the device cert automatically

Context (Environment)

Host OS [e.g. Ubuntu 18.04, Windows Server IoT 2019]: Ubuntu 18.04
Architecture [e.g. amd64, arm32, arm64]: amd64
Container OS [e.g. Linux containers, Windows containers]: Linux

Output of iotedge check

Configuration checks (aziot-identity-service)
---------------------------------------------
√ keyd configuration is well-formed - OK
√ certd configuration is well-formed - OK
√ tpmd configuration is well-formed - OK
√ identityd configuration is well-formed - OK
√ daemon configurations up-to-date with config.toml - OK
√ identityd config toml file specifies a valid hostname - OK
‼ aziot-identity-service package is up-to-date - Warning
    Installed aziot-identity-service package has version 1.3.0 but 1.2.3 is the latest stable version available.
    Please see https://aka.ms/aziot-update-runtime for update instructions.
√ host time is close to reference time - OK
× production readiness: identity certificates expiry - Error
    DPS identity 'device-id' expired at 2021-10-30 20:36:10 UTC
× production readiness: EST identity and bootstrap certificates expiry - Error
    x509 identity 'est-id' expired at 2021-10-30 20:36:09 UTC
√ preloaded certificates are valid - OK
√ keyd is running - OK
√ certd is running - OK
√ identityd is running - OK
√ read all preloaded certificates from the Certificates Service - OK
√ read all preloaded key pairs from the Keys Service - OK
√ ensure all preloaded certificates match preloaded private keys with the same ID - OK

Connectivity checks (aziot-identity-service)
--------------------------------------------
‼ host can connect to and perform TLS handshake with iothub AMQP port - Warning
    Could not retrieve iothub_hostname from provisioning file.
    Please specify the backing IoT Hub name using --iothub-hostname switch if you have that information.
    Since no hostname is provided, all hub connectivity tests will be skipped.
‼ host can connect to and perform TLS handshake with iothub HTTPS / WebSockets port - Warning
    Could not retrieve iothub_hostname from provisioning file.
    Please specify the backing IoT Hub name using --iothub-hostname switch if you have that information.
    Since no hostname is provided, all hub connectivity tests will be skipped.
‼ host can connect to and perform TLS handshake with iothub MQTT port - Warning
    Could not retrieve iothub_hostname from provisioning file.
    Please specify the backing IoT Hub name using --iothub-hostname switch if you have that information.
    Since no hostname is provided, all hub connectivity tests will be skipped.
√ host can connect to and perform TLS handshake with DPS endpoint - OK

Configuration checks
--------------------
√ aziot-edged configuration is well-formed - OK
√ configuration up-to-date with config.toml - OK
√ container engine is installed and functional - OK
× configuration has correct URIs for daemon mgmt endpoint - Error
    Unable to find image 'mcr.microsoft.com/azureiotedge-diagnostics:1.2.420211006.4' locally
    docker: Error response from daemon: manifest for mcr.microsoft.com/azureiotedge-diagnostics:1.2.420211006.4 not found: manifest unknown: manifest tagged by "1.2.420211006.4" is not found.
    See 'docker run --help'.
‼ aziot-edge package is up-to-date - Warning
    Installed IoT Edge daemon has version 1.2.420211006.4 but 1.2.4 is the latest stable version available.
    Please see https://aka.ms/iotedge-update-runtime for update instructions.
× container time is close to host time - Error
    Could not query local time inside container
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
√ production readiness: container engine - OK
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
‼ production readiness: Edge Agent's storage directory is persisted on the host filesystem - Warning
    The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
‼ production readiness: Edge Hub's storage directory is persisted on the host filesystem - Warning
    The edgeHub module is not configured to persist its /tmp/edgeHub directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.

Connectivity checks
-------------------
19 check(s) succeeded.
9 check(s) raised warnings. Re-run with --verbose for more details.
4 check(s) raised errors. Re-run with --verbose for more details.
7 check(s) were skipped due to errors from other checks. Re-run with --verbose for more details.

Device Information

Host OS [e.g. Ubuntu 18.04, Windows Server IoT 2019]: Ubuntu 18.04
Architecture [e.g. amd64, arm32, arm64]: amd64
Container OS [e.g. Linux containers, Windows containers]: Linux

Runtime Versions

iotedge 1.2.420211006.4

aziot-edged [run iotedge version]: https://github.com/Azure/iot-identity-service/suites/3964124249/artifacts/99607813
Edge Agent [image tag (e.g. 1.0.0)]:
Edge Hub [image tag (e.g. 1.0.0)]:
Docker/Moby [run docker version]:

Note: when using Windows containers on Windows, run docker -H npipe:////./pipe/iotedge_moby_engine version instead

@onalante-msft
Contributor

Similar to the edge CA, the device ID certificate is not reissued unless requested. This can be manually triggered by POSTing to /device/reprovision on the management socket. The edge agent does call this endpoint, but not in response to the identity certificate nearing its expiry date: source.
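
For anyone who wants to trigger this by hand, here is a minimal Python sketch of the POST to /device/reprovision over a Unix domain socket. Only the endpoint path comes from the comment above. The socket path `/run/iotedge/mgmt.sock` and the `api-version` value are assumptions that may differ on your installation, and the helper names are hypothetical. The caller also needs permission to access the socket.

```python
import http.client
import socket

# ASSUMPTIONS (not stated in this thread): adjust both to your installation.
MGMT_SOCKET = "/run/iotedge/mgmt.sock"
API_VERSION = "2019-10-22"


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=10):
        # The host name is only used for the Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def reprovision_path(api_version=API_VERSION):
    """Build the request target for the reprovision endpoint."""
    return f"/device/reprovision?api-version={api_version}"


def request_reprovision(socket_path=MGMT_SOCKET):
    """POST an empty body to /device/reprovision and return the HTTP status."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("POST", reprovision_path(), body=b"")
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling `request_reprovision()` on the device returns the HTTP status code from the daemon, or raises `OSError` if the socket is missing or not accessible.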

@curua2008
Author

@onalante-msft are you saying that iotedge will not automatically renew the device ID cert after it expires?

@ksaye
Contributor

ksaye commented Nov 5, 2021

@onalante-msft, after some testing, I think it is related to: Azure/iot-identity-service#300.

On my EST Server, I see the error messages:

/libest/src/est/.libs/libest-3.2.0p.so(+0xa520) [0x7f4073906520]
/libest/src/est/.libs/libest-3.2.0p.so(est_server_handle_request+0x29e) [0x7f40739187de]
/libest/example/server/.libs/estserver(+0xea86) [0x560ecc545a86]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f4072f8c6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f4072cb571f]

***EST [INFO][ossl_verify_cb:162]--> enter function: ok=1 cert_error=0
***EST [INFO][ossl_verify_cb:162]--> enter function: ok=0 cert_error=10
***EST [INFO][ossl_verify_cb:169]--> Cert: CN = nobootstrap
***EST [INFO][ossl_verify_cb:176]--> error 10 at 0 depth lookup:certificate has expired

***EST [WARNING][ossl_verify_cb:209]--> Certificate verify failed (reason=10)

/libest/src/est/.libs/libest-3.2.0p.so(+0xa520) [0x7f4073906520]
/libest/src/est/.libs/libest-3.2.0p.so(ossl_verify_cb+0x12e) [0x7f407391ecee]
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1(+0x1f2953) [0x7f4073396953]
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1(+0x1f2b49) [0x7f4073396b49]
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1(+0x1f4776) [0x7f4073398776]
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1(X509_verify_cert+0x96) [0x7f4073398f96]
/usr/lib/x86_64-linux-gnu/libssl.so.1.1(+0x2f337) [0x7f407369e337]
/usr/lib/x86_64-linux-gnu/libssl.so.1.1(+0x5e8eb) [0x7f40736cd8eb]
/usr/lib/x86_64-linux-gnu/libssl.so.1.1(+0x4c48c) [0x7f40736bb48c]
/usr/lib/x86_64-linux-gnu/libssl.so.1.1(SSL_do_handshake+0x54) [0x7f40736a74e4]
/libest/src/est/.libs/libest-3.2.0p.so(est_server_handle_request+0x24d) [0x7f407391878d]
/libest/example/server/.libs/estserver(+0xea86) [0x560ecc545a86]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f4072f8c6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f4072cb571f]

And IoT Edge keeps trying to renew (and failing).

@onalante-msft
Contributor

onalante-msft commented Nov 5, 2021

@curua2008 Yes, unfortunately this is the current state of affairs.

@ksaye You are correct, that is the tracking issue for this feature. certd fails to authenticate in your case because of the code here: if an identity certificate exists, there is no fallback authentication method (or re-bootstrapping) in case the EST call fails.

@github-actions

github-actions bot commented Dec 9, 2021

This issue is being marked as stale because it has been open for 30 days with no activity.

@vjrantal
Contributor

I think it is important to address this issue because, at least based on our testing, the only way to recover is to restart the runtime. Restarting the Edge Agent or Edge Hub modules does not seem to be enough.

I collected some relevant logs here while the system was in a state where the device ID certificate had expired. Interestingly, the Edge Hub was still able to connect to the IoT Hub (I don't understand why), but any module deployment failed. Restarting the Edge Agent from the Azure portal worked but, as mentioned, it did not trigger recovery (renewal of the device ID certificate).

In our case, the EST server credentials were valid (even when the device id certificate had expired) so restarting the runtime triggered reprovision via DPS (we have auto_reprovisioning_mode = "AlwaysOnStartup"). Furthermore, the reprovision triggers a request to renew device id via EST and the system is back to a functional state.

The challenge in the recovery is that if someone doesn't have SSH or other remote access to the devices, it might be difficult to trigger the recovery behavior.
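
One possible stopgap for devices without remote access is a small on-device watchdog that requests recovery shortly before the certificate expires, while the credentials are still valid. The Python sketch below only covers the decision step. It assumes the expiry timestamp is available in the format printed by `iotedge check` earlier in this thread. The function names and the 12-hour margin are hypothetical, and how the expiry is obtained and how recovery is triggered are left to the caller.

```python
from datetime import datetime, timedelta, timezone


def parse_expiry(text):
    """Parse a timestamp such as '2021-10-30 20:36:10 UTC' into an aware datetime."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S UTC")
    return parsed.replace(tzinfo=timezone.utc)


def needs_renewal(not_after, now, margin=timedelta(hours=12)):
    """Return True once `now` is within `margin` of the expiry, or past it.

    The margin is a hypothetical value; pick one well below the
    certificate lifetime (2 days in the report above).
    """
    return now >= not_after - margin


# Example with the expiry reported in this issue.
expiry = parse_expiry("2021-10-30 20:36:10 UTC")
early = datetime(2021, 10, 29, 12, 0, 0, tzinfo=timezone.utc)
late = datetime(2021, 10, 30, 9, 0, 0, tzinfo=timezone.utc)
print(needs_renewal(expiry, early))  # False: more than 12 hours remain
print(needs_renewal(expiry, late))   # True: inside the 12-hour margin
```

Acting before expiry matters here because, as noted earlier in the thread, the EST call has no fallback once the identity certificate itself has expired.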

@gordonwang0
Contributor

We have implemented a configurable auto-renewal of the device ID and EST identity certs.

The option is available here for device identity certs:
https://github.com/Azure/iot-identity-service/blob/e5a0e0bd98c5aadb9fc8b6830e45203fcd9f232d/aziotctl/config/unix/template.toml#L132-L134

And here for EST identity certs:
https://github.com/Azure/iot-identity-service/blob/e5a0e0bd98c5aadb9fc8b6830e45203fcd9f232d/aziotctl/config/unix/template.toml#L182-L184

Note that this feature isn't released yet and will go in a future release. However, you can try it out now by downloading the latest packages from iot-identity-service.
