-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sequential PLM Scenes May be Ignored #26
Comments
Previously, insteon cleanup messages received by MH, would cause MH to clear the active message withoug confirming if the cleanup message corresponded to the active message. If multiple PLM scenes were sent sequentially, a cleanup message from one scene may have caused MH to clear out the following scenes command. This patch only clears the active message if the cleanup message matches the active message.
Great catch, I think I've seen this too when sending two commands in a row to the same device. Both the existing code and the proposed fix concern me. I'm concerned that the existing behavior tried to remove a message that hasn't been sent. IMO, this should never happen and should be part of the check. At the same time I agree that the response matching logic needs to be stronger (to cover my case). I think this should be fixed in two ways. We could add (if it doesn't already exist) a "sent" flag in the insteon message object that is false initially and set when the message is sent. Then the insteon message object should have a match_ack() function. This function returns true if the ack matches the message and "always" returns false if the "sent" flag is false (to cover your case). The matching logic in the match_ack() function is otherwise very similar to what you wrote. We could then call the match_ack() function with the ack message while scanning the message stack. Would you be willing to try to fix it this way? I don't mind giving it a go otherwise. |
I agree that checking to see if a message has even been sent would be a nice addition. However, at the moment, it is not on the top of my list to add to the code. I am not the world's greatest perl programmer and I am struggling as it is to figure out the structure of greg's design. Even without checking if a message has been sent, I think my proposed code still works. In order for a message to be cleared, it has to be from the same device, for the same group, with the same command. A message which has not been sent which matches these criteria would be a duplicate and unnecessary anyways. A quick solution I can think of, is to check the send_attempts. However, send_attempts iterates when the PLM reports a busy code "15". So a number greater than 0 is not necessarily a sign that a message was sent. We could alter this behavior, but it could result in an infinite loop on the off chance that the PLM continually fails and responds only with busy codes. mstovenour, if you have the ability to add your suggested code, please by all means do so. Otherwise, I will mark this as an issue for me to come back to in the future. |
I'm still wading through the code to get a better feel for how it works and haven not come to a well formed conclusion for this issue. I will admit that the code you introduced is worse case benign; I don't see any harm in the changes. However I still think maybe something more interesting is going on. The 2412U PLM dev guide from 2007 (i.e. covers i1 and i2 but not i2CS) uses the term "all-link group" for a mh "scene". Apparently there are a number of things that happen when we ask the PLM to send an all-link group command. 1) PLM sends an all-link group broadcast towards the responders. 2) the PLM then sends individual clean up messages to each linked responder. 3a) each linked device that gets the all-link clean up will send an all-link clean up ack (this arrives at MH as an insteon message plm command 0x50). 3b) for each responder that doesn't respond to the PLM's all link clean up, the PLM will send a All-link clean up failure report (0x56) to MH. 4) Finally the PLM will send an all-link cleanup status report (0x58) to MH with either ACK(06) or NACK(15). ACK indicates that the clean up completed. NACK indicates that the clean up was aborted due to other insteon traffic. The interesting part is that the code to process this last PLM command has a semantic error. The author tried to use == to compare a string. I'm not sure if this is part of your issue or not. I'll come back to this once I get scenes working with my I2CS devices. Right now I can't test any of the PLM all-link logic. |
Good catch Michael, I hadn't noticed the attempt to compare a string with ==. I tested it out and this by itself doesn't solve the above issue. This is because, best I can tell the message that I found to be the cause of the problem is interpreted as a "standard insteon message" and is handled prior to reaching this logic. Best I can tell the short all_link_clean_status message is simply dumped in the current code. Correcting the logic to use "eq" doesn't seem to have any effect because the current code doesn't really do much with the cleanup messages. I think Greg intended on coming back to this issue and never did. |
…df74f1fb Fix for Issue #26. Mered after krkeegan has been running this fix for 3 weeks without issues.
This is a weird one, and I am still diagnosing it, but this is what I believe is happening so far.
Issue:
If two PLM scene commands are issued sequentially, the second command may never be sent. In this scenario, each Scene controls only one device. Specifically, they are surrogate scenes for KPL buttons. The basic print log shows something as follows:
PLM_Scene_1->set('on')
Received acknowledgement from PLM_SCENE_1
PLM_Scene_2->set('on')
Received acknowledgement from PLM_SCENE_1 (This is not a typo, the print log shows the first scene as acknowledged twice.)
Current Theory:
Setting the insteon debug level to 3 provides some additional detail. If left at the default settings, it looks like the PLM_Scene_1 command is issued and an acknowledgement is received. MH then tries to send the PLM_Scene_2 command but receives a "PLM is extremely busy" error 15 and claims that the command will be retried in 1 second.
While waiting to retry the command, MH receives a command from the PLM. The PLM reports this as being a second acknowledgement from PLM_SCENE_1. (I suspect at this point that MH erroneously deletes the pending command for PLM_Scene_2 that is waiting, believing that this command has been acknowledged).
If I up the Insteon_xmit setting to .7 seconds. Everything seems to work fine. The Insteon:3 debug log shows the following general sequence:
PLM_Scene_1->set('on')
Received acknowledgement from PLM_SCENE_1
Received All_Link Cleanup success from PLM_SCENE_1
PLM_Scene_2->set('on')
Received acknowledgement from PLM_SCENE_2
Received All_Link Cleanup success from PLM_SCENE_2
So I am not an Insteon guru, but it looks like we should be expecting two acknowledgments to PLM Scenes (at least this specific type of scene). While the first acknowledgment is specific to the one linked device. I believe the second acknowledgement is basically saying "All of the linked devices in this scene have reported in successfully."
I suspect that with the Insteon_xmit setting at the default level, that the All-Link Cleanup success message is why the PLM is busy when we attempt to send the PLM_Scene_2 command.
As noted above, I further suspect that at the default setting, the All-Link Cleanup success message causes MH to delete the pending PLM_SCENE_2 command.
Solutions:
Status:
I am still working on diagnosing where in the code this all happens, if anyone smarter than me knows where to look or has some suggestion, please let me know.
The text was updated successfully, but these errors were encountered: