Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sketch stops updating AWS IoT after some period of time #55

Open
eczajk1 opened this issue Sep 7, 2017 · 9 comments
Open

Sketch stops updating AWS IoT after some period of time #55

eczajk1 opened this issue Sep 7, 2017 · 9 comments

Comments

@eczajk1
Copy link

eczajk1 commented Sep 7, 2017

I have been working with a sketch that reads from some different sensors and reports values to AWS using Thing Shadows. Things seem to work up until some point when the sketch seems to lose its ability to do updates. I have had a hard time pinpointing the issue, so I tried to use a slight modification of ThingShadowEcho example, where the loop function looks like this:

void loop() {

  if(success_connect) {

    if(myClient.yield()) {
      #ifdef AWS_IOT_DEBUG
        Serial.println(F("Yield failed."));
      #endif
    }

    char JSON_buf[150];
    String payload = "{\"state\":{\"reported\":{";
    float current_stage = (rand() % 150 + 1) / 10.0;
    payload += "\"stage\": " + String(current_stage, 1);
    payload += "}}}";
    payload.toCharArray(JSON_buf, 150);

    #ifdef AWS_IOT_DEBUG
      Serial.println("stage: " + String(current_stage, 1));
    #endif

    IoT_Error_t shadowUpdateError = myClient.shadow_update(AWS_IOT_MY_THING_NAME, JSON_buf, strlen(JSON_buf), NULL, 5);

    delay(1000);
  }
}

Things mostly work. Over the period of several hours I do get a handful (9) 'Yield failed.` errors. It is hard for me to tell from the logs, but I believe the shadow updates stopped working somewhere in the vicinity shown below.

Can you offer any pointers on why this would be happening? Am I just trying to do updates too often? If so, is there a recommended minimum interval?

DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:AWSIoTPythonSDK.core.protocol.mqttCore:Try to put a publish request 41 in the TCP stack.
DEBUG:AWSIoTPythonSDK.core.protocol.mqttCore:Publish request 41 succeeded.
DEBUG:comm.serialCommunicationServer:Updated serialCommunicationServer internal protocolMessageQueue by inserting a new message. Size: 1
DEBUG:AWSIoTPythonSDK.core.shadow.deviceShadow:shadow message clientToken: centuryeng_headquarters_3_0_37_vnidc
DEBUG:comm.serialCommunicationServer:Send through serial to remote client: SU T Size: 4
DEBUG:AWSIoTPythonSDK.core.shadow.deviceShadow:Token is in the pool. Type: accepted
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: z
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:Updated serialCommunicationServer internal protocolMessageQueue by inserting a new message. Size: 1
DEBUG:comm.serialCommunicationServer:Send through serial to remote client: Z T Size: 3
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: y
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:No more messages for yield. Exiting writeToExternalYield.
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: z
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:Updated serialCommunicationServer internal protocolMessageQueue by inserting a new message. Size: 1
DEBUG:comm.serialCommunicationServer:Send through serial to remote client: Z T Size: 3
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: y
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:No more messages for yield. Exiting writeToExternalYield.
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: z
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:Updated serialCommunicationServer internal protocolMessageQueue by inserting a new message. Size: 1
DEBUG:comm.serialCommunicationServer:Send through serial to remote client: Z T Size: 3
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: y
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:No more messages for yield. Exiting writeToExternalYield.
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: z
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:Updated serialCommunicationServer internal protocolMessageQueue by inserting a new message. Size: 1
DEBUG:comm.serialCommunicationServer:Send through serial to remote client: Z T Size: 3
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: y
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:No more messages for yield. Exiting writeToExternalYield.
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: z
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:Updated serialCommunicationServer internal protocolMessageQueue by inserting a new message. Size: 1
DEBUG:comm.serialCommunicationServer:Send through serial to remote client: Z T Size: 3
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: y
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:No more messages for yield. Exiting writeToExternalYield.
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: z
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:Updated serialCommunicationServer internal protocolMessageQueue by inserting a new message. Size: 1
DEBUG:comm.serialCommunicationServer:Send through serial to remote client: Z T Size: 3
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: y
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:No more messages for yield. Exiting writeToExternalYield.
DEBUG:comm.serialCommunicationServer:Clear internal list. Size: 0
DEBUG:comm.serialCommunicationServer:Accept-timer starts, with acceptTimeout: 10 second(s).
DEBUG:comm.serialCommunicationServer:1 lines to be received. Loop begins.
DEBUG:comm.serialCommunicationServer:Received: 1/1 Message is: z
DEBUG:comm.serialCommunicationServer:Finish reading from remote client. Accept-timer ends.
DEBUG:comm.serialCommunicationServer:Updated serialCommunicationServer internal protocolMessageQueue by inserting a new message. Size: 1
DEBUG:comm.serialCommunicationServer:Send through serial to remote client: Z T Size: 3
@liuszeng
Copy link
Contributor

liuszeng commented Sep 7, 2017

Hi @eczajk1 ,

Thank you very much for providing the information.

From the log you provide, it seems that there is no new incoming messages for the yield call to dispatch, which means there is no incoming messages from the subscription (shadow/non-shadow). Can you print out the received response on 32u4 from AR9331 when you are experiencing the yield failure? I would like to exclude the possibility of a serial1 communication issue.

Thanks,
Liusu

@eczajk1
Copy link
Author

eczajk1 commented Sep 12, 2017

I am not certain what you are asking for here. Here is a snippet from the Serial monitor:

stage: 7.1
stage: 1.9
stage: 1.1
stage: 1.5
stage: 2.8
Yield failed.
stage: 4.4
stage: 4.4
stage: 12.6
stage: 0.8
stage: 3.8

The AR9331 log output pasted previously was logged using this guidance this message.

@liuszeng
Copy link
Contributor

Hi @eczajk1 ,

What I mean is the output from the Serial1 connection from AR9331 to ATmega32u4. You can find the details instructions here: #47

Add a Serail.println(rw_buf) in the SDK after this line and see if there is any extra output in your Arduino IDE Serial monitor.

Basically I would like to know what gets transmitted over the Serial1 connection to ATmega32u4 that results in a yield failure.

Thanks,
Liusu

@eczajk1
Copy link
Author

eczajk1 commented Sep 13, 2017

Thanks! But, I'm not seeing anything extra:

Y F: No messages.
stage: 14.2

Z T

Y F: No messages.
stage: 5.7

Z T

Y F: No messages.
stage: 9.1

Z T

Y F: No messages.
stage: 0.4

Z
Yield failed.
stage: 14.5

Z T

Y F: No messages.
stage: 13.3

Z T

Y F: No messages.
stage: 10.9

Z T

Y F: No messages.
stage: 13.6

Z T

Y F: No messages.
stage: 3.2

@liuszeng
Copy link
Contributor

Hi @eczajk1 ,

Thank you very much for providing the information.

If you take a look at the lines near where "Yield failed" happens, you can find the following:

Z
Yield failed.
stage: 14.5

The first inter-chip protocol seems to be broken, resulting in a yield failure. SDK logic details can be found here:
https://github.com/aws/aws-iot-device-sdk-arduino-yun/blob/master/AWS-IoT-Arduino-Yun-Library/aws_iot_mqtt.cpp#L373

This could be caused by byte loss in serial1 communication. Provided that yield is a frequently invoked call. What we can do here is just simply retry "contacting" the linux chip.

Apart from this, what is more interesting is that, on successful yield attempts, there is no further messages routed back to 32u4. From the logs on Python runtime that you previously attached, there is no explicit incoming messages from the cloud being captured as well. I would suspect that the client might get disconnect or the subscriptions some how get lost.

To further detect the issue, you can utilize the AWS IoT Lifecycle events to detect if the client is connected/disconnected/subscribed. I would like to confirm if there is any silent disconnect happened when you were experiencing the issue.

Thanks,
Liusu

@eczajk1
Copy link
Author

eczajk1 commented Sep 14, 2017

What we can do here is just simply retry "contacting" the linux chip.

Are you saying that this is a technique that could be implemented in the Sketch itself?


I subscribed to the Lifecycle Events using the AWS IoT MQTT client. But, so far, I have not seen any unexpected disconnected events even while Yield failed errors are logged. I also saw this other one:

SU T

Z T

Y 1 0 JSON-267

Y F: No messages.
stage: 1.3





SUFF: Unknown error.

Z T

Y F: No
stage: 7.4
ssages.
u



SU T

Z T

Y F: No messages.
stage: 8.4

@eczajk1
Copy link
Author

eczajk1 commented Sep 20, 2017

I have tried listening on on Lifecycle events several times in the last week. But I am still experiencing the same issue while not seeing ANY lifecycle events showing up. I only see them when the sketch is first started.

Any advice on actions I can take to further diagnose this issue?

@hedeshianaws
Copy link

@eczajk1 Are you still facing this issue?

@eczajk1
Copy link
Author

eczajk1 commented Oct 24, 2017

Yes, I was not able to resolve it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants