Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

edgeHub not starting with "Error: failed to deserialize state" error #5152

Open
kvasdopil opened this issue Jun 24, 2021 · 3 comments
Open
Assignees
Labels
1.1.3 Targeted for 1.1.3 release area:mqtt MQTT broker bug Something isn't working customer-reported iotedge no-issue-activity

Comments

@kvasdopil
Copy link

Expected Behaviour

After hard reboot iotedge services start normally.

Current Behaviour

After device start after a hard reboot, the edgeHub module does not start normally.

Steps to Reproduce

Provide a detailed set of steps to reproduce the bug.

  1. MQTT Broker should be enabled
  2. perform hard reboot on the device
  3. start the device

Context (Environment)

Output of iotedge check

Click here

Configuration checks
--------------------
√ config.yaml is well-formed - OK
√ config.yaml has well-formed connection string - OK
√ container engine is installed and functional - OK
× config.yaml has correct hostname - Error
    config.yaml has hostname zsolt-devbook but device reports hostname lexa-rpi.
    Hostname in config.yaml must either be identical to the device hostname or be a fully-qualified domain name that has the device hostname as the first component.
√ config.yaml has correct URIs for daemon mgmt endpoint - OK
√ latest security daemon - OK
√ host time is close to real time - OK
√ container time is close to host time - OK
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
√ production readiness: identity certificates expiry - OK
‼ production readiness: certificates - Warning
    The Edge device is using self-signed automatically-generated development certificates.
    They will expire in 74 days (at 2021-09-06 12:22:20 UTC) causing module-to-module and downstream device communication to fail on an active deployment.
    After the certs have expired, restarting the IoT Edge daemon will trigger it to generate new development certs.
    Please consider using production certificates instead. See https://aka.ms/iotedge-prod-checklist-certs for best practices.
√ production readiness: container engine - OK
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
‼ production readiness: Edge Agent's storage directory is persisted on the host filesystem - Warning
    The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
‼ production readiness: Edge Hub's storage directory is persisted on the host filesystem - Warning
    The edgeHub module is not configured to persist its /tmp/edgeHub directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.

Connectivity checks
-------------------
√ host can connect to and perform TLS handshake with DPS endpoint - OK
√ host can connect to and perform TLS handshake with IoT Hub AMQP port - OK
√ host can connect to and perform TLS handshake with IoT Hub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with IoT Hub MQTT port - OK
√ container on the default network can connect to IoT Hub AMQP port - OK
√ container on the default network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the default network can connect to IoT Hub MQTT port - OK
√ container on the IoT Edge module network can connect to IoT Hub AMQP port - OK
√ container on the IoT Edge module network can connect to IoT Hub HTTPS / WebSockets port - OK
√ container on the IoT Edge module network can connect to IoT Hub MQTT port - OK

19 check(s) succeeded.
5 check(s) raised warnings. Re-run with --verbose for more details.
1 check(s) raised errors. Re-run with --verbose for more details.

Device Information

  • Host OS [e.g. Ubuntu 18.04, Windows Server IoT 2019]: ubuntu hirsute, 5.11.0-1009-raspi
  • Architecture [e.g. amd64, arm32, arm64]: arm64
  • Container OS [e.g. Linux containers, Windows containers]: linux containers

Runtime Versions

  • aziot-edged [run iotedge version]: iotedge 1.1.3
  • Edge Agent [image tag (e.g. 1.0.0)]: 1.2.0
  • Edge Hub [image tag (e.g. 1.0.0)]: 1.2.0
  • Docker/Moby [run docker version]: 20.10.6+azure

Note: when using Windows containers on Windows, run docker -H npipe:////./pipe/iotedge_moby_engine version instead

Logs

edge-hub logs

2021-06-24 08:30:10 +00:00 Starting Edge Hub
Jun 24 08:30:10.422  INFO watchdog: Starting Watchdog
Jun 24 08:30:10.422  INFO watchdog: Registering shutdown signal listener
Jun 24 08:30:10.425  INFO watchdog::child: Launched Edge Hub process with pid 10
Jun 24 08:30:10.429  INFO watchdog::child: Launched MQTT Broker process with pid 12
<6> 2021-06-24 08:30:10.444 +00:00 [INF] [mqttd::app::edgehub] - loading settings from a file /app/mqttd/broker.json
<6> 2021-06-24 08:30:10.448 +00:00 [INF] [mqttd::app::edgehub] - loading state...
<6> 2021-06-24 08:30:10.449 +00:00 [INF] [mqtt_broker::persist] - loading state from file /tmp/mqttd/state.dat.
Error: failed to deserialize state

Caused by:
    io error: failed to fill whole buffer
2021-06-24 08:30:10.626 +00:00 Edge Hub Main()
Jun 24 08:30:11.430  INFO watchdog::child: MQTT Broker process has stopped
Jun 24 08:30:12.426  INFO watchdog::child: Terminating Edge Hub process
Jun 24 08:30:12.426  INFO watchdog::child: Sending SIGTERM signal to Edge Hub process
Jun 24 08:30:13.426  INFO watchdog: Successfully stopped Edge Hub process
Jun 24 08:30:13.426  INFO watchdog: Successfully stopped MQTT Broker process
Jun 24 08:30:13.427  INFO watchdog: Stopped Watchdog process

Additional Information

The issue is gone after edgeHub container is deleted.

@kvasdopil
Copy link
Author

Not sure if the issue has anything to do with edgeHub storage not being persisted, at a first glance it looks like that would have made things worse as the state would be corrupted permanently and removing the container would not help.

@dmolokanov
Copy link
Contributor

Hi @kvasdopil ! Looks like we have a potential bug here. Thanks for sharing this!

@dmolokanov dmolokanov added bug Something isn't working area:mqtt MQTT broker labels Jun 24, 2021
@dmolokanov dmolokanov self-assigned this Jun 25, 2021
@github-actions
Copy link

This issue is being marked as stale because it has been open for 30 days with no activity.

@pmzara pmzara added the 1.1.3 Targeted for 1.1.3 release label Sep 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.1.3 Targeted for 1.1.3 release area:mqtt MQTT broker bug Something isn't working customer-reported iotedge no-issue-activity
Projects
None yet
Development

No branches or pull requests

3 participants