The following sequence of events appears to occur while scanning an ALDB_i2 link table:
1. MH issues a read command for link address 0FE7
2. ACK received
3. Link Data for 0FE7 is received
4. MH issues a read command for link address 0FDF
5. ACK received
6. Link Data for 0FE7 is received <-- duplicate out of order packet
6a. Because of the long delay, this packet is not caught as a duplicate packet
7. This message is passed to the ALDB_i2 Link parser, which perceives it as a corrupt response.
8. The parser then tries to queue a request to read link address 0FDF again
9. MH catches this as an attempt to queue a command already in the queue and ignores it
10. Another ACK is received
10a. It is unclear what this is in response to
11. The ALDB_i2 Link parser ignores the ACK because it can't correlate it to a sent message
12. Link Data for 0FDF is finally received
13. The ALDB_i2 Link parser ignores the link data, claiming that an ACK was not received.
14. At this point, the queue for the device stalls and nothing else happens.
Quick Diagnosis:
At step 8, I don't think the parser should be trying to queue a new message request. Instead, the parser should simply not acknowledge that anything was received. The message handler would then retry the message in its normal course of action.
At steps 11 and 13, it is unclear to me why the parser first claims that it cannot correlate the ACK to any sent message, but then claims that an ACK it was expecting was never received.
It is also unclear why the queue timer is being cleared and never reset. This is what causes the entire process to stall.
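The behaviour suggested in the diagnosis can be pictured with the following Python sketch. It is not MisterHouse code (which is written in Perl). `PendingQueue`, `on_reply`, and `on_timeout` are hypothetical names, and the retry limit of 3 is an assumption made only for illustration. The point is that a rejected reply leaves the pending command in place, so the ordinary timeout path resends it.

```python
class PendingQueue:
    """Toy stand-in for a per-device command queue (hypothetical names)."""

    def __init__(self, max_retries=3):  # retry limit is an assumed value
        self.pending = None       # command currently awaiting a reply
        self.retries_left = 0
        self.max_retries = max_retries
        self.sent = []            # log of every transmission, for inspection

    def send(self, command):
        self.pending = command
        self.retries_left = self.max_retries
        self.sent.append(command)

    def on_reply(self, reply_ok):
        # A good reply clears the pending command. A bad reply is simply
        # not acknowledged, so the pending command stays armed for a retry.
        if reply_ok:
            self.pending = None

    def on_timeout(self):
        # Called when the queue timer expires. Returns True if a retry
        # was sent, False if nothing was pending or retries ran out.
        if self.pending is None:
            return False
        if self.retries_left == 0:
            self.pending = None
            return False
        self.retries_left -= 1
        self.sent.append(self.pending)
        return True


queue = PendingQueue()
queue.send("read 0FDF")
queue.on_reply(reply_ok=False)  # stale 0FE7 data arrives and is rejected
retried = queue.on_timeout()    # timer fires and the command is resent
```

In this model the parser never queues anything itself, so the duplicate check at step 9 is never triggered.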
on_read_write_aldb now returns 1 or 0 to indicate whether the current message should be cleared.
When a bad message arrived, on_read_write_aldb attempted to requeue the message that was currently pending. However, _process_message did not clear the pending message until after this routine had run. As a result, the new message was rejected as a duplicate and never queued, and then the current message was cleared. This left nothing pending and nothing queued, which stalled the message queue.
Fixes bug hollie#258
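The ordering problem described in this commit message can be modeled with a short Python sketch. This is a simplified illustration, not the actual Perl code: `MessageQueue`, `buggy_process_message`, `fixed_process_message`, and `should_clear_pending` are hypothetical names standing in for the queue, `_process_message`, and `on_read_write_aldb`.

```python
class MessageQueue:
    """Toy per-device queue that rejects duplicate commands (hypothetical)."""

    def __init__(self):
        self.pending = None   # command currently awaiting a reply
        self.waiting = []     # commands not yet sent

    def queue_message(self, command):
        # Refuse anything that is already pending or already waiting.
        if command == self.pending or command in self.waiting:
            return False
        self.waiting.append(command)
        return True

    def send_next(self):
        if self.pending is None and self.waiting:
            self.pending = self.waiting.pop(0)
        return self.pending

    def clear_pending(self):
        self.pending = None


def buggy_process_message(queue, reply_is_valid):
    # Old behaviour: the handler requeues the pending command on a bad
    # reply, but the requeue is rejected as a duplicate...
    if not reply_is_valid:
        queue.queue_message(queue.pending)
    # ...and the pending command is then cleared unconditionally.
    queue.clear_pending()


def should_clear_pending(reply_is_valid):
    # New behaviour: the handler only reports (1/0) whether to clear.
    return 1 if reply_is_valid else 0


def fixed_process_message(queue, reply_is_valid):
    if should_clear_pending(reply_is_valid):
        queue.clear_pending()


def run(process):
    queue = MessageQueue()
    queue.queue_message("read 0FDF")
    queue.send_next()
    process(queue, reply_is_valid=False)  # a corrupt reply arrives
    return queue.pending, queue.waiting


buggy_state = run(buggy_process_message)  # (None, []): nothing left to retry
fixed_state = run(fixed_process_message)  # ("read 0FDF", []): still pending
```

With the old ordering the queue ends up with no pending and no waiting command, which corresponds to the stall. With the return-value approach the command stays pending and can be retried.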
krkeegan added a commit to krkeegan/misterhouse that referenced this issue on Sep 28, 2013:
Missed one instance in which the queued message should not be cleared.
Should not be cleared on an unhandled mem action either.
Further Fix to hollie#258
This issue was discovered by @pmatis.