boards/cc2538/radio: networking has high losses #5786
Comments
Hi @PeterKietzmann, I will try to reproduce, thanks for testing! Mind pointing me to the steps to reproduce?
A quick sweep & repeat (5-7 times each step) without too much difference:
Strangely, packet sizes of 369-380 always yielded 100% packet loss (after 10+ runs; note that for 368 the packet loss is 0%), whereas larger packet sizes yield some packet loss, but not 0%. I will keep testing, and also try another CC2538-based platform.
Wow... bizarre. I can reproduce this too (well, 99.9% loss). Do we know if the same thing occurs on other platforms?
It seems that introducing a slight delay in the transmission improves things significantly. The delay only needs to be very slight - enabling debug output is already enough. Why this helps, though, I have no idea.
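A minimal sketch of the kind of artificial delay being discussed here (send_one_frame() and the ~1 ms value are placeholders, not anything from the actual driver; only xtimer_usleep() is existing RIOT API):

```c
#include <stddef.h>
#include "xtimer.h"

/* Hypothetical TX helper standing in for the real send path. */
extern void send_one_frame(const void *frame, size_t len);

/* Sketch only: even a very small gap between consecutive frames appears to
 * hide the losses; the ~1 ms value is a guess, not a tuned figure. */
static void send_burst_with_gap(const void *frames[], const size_t lens[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        send_one_frame(frames[i], lens[i]);
        xtimer_usleep(1000);   /* ~1 ms pause between frames */
    }
}
```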
Have you tried #5804 with
I made the same observations; see #5840 for some debug output. I saw 100% packet loss with ping6 and 256B payload from
For those testing with large payloads over a layer 3 protocol, please also see #5803.
@alignan no, I haven't tested this yet. One thing I can't currently test, but would like to, is whether pinging the cc2538 from another platform exhibits the same problem, i.e. do other drivers send their packets as quickly as the cc2538 does? I know that RIOT's cc2538 driver is capable of sending consecutive packets very quickly - it doesn't wait for transmission of the first packet to complete before starting to process the next one, because it doesn't need to. So if other drivers are not capable of sending packets this fast, and therefore not provoking the problem, one workaround might be to wait for transmission to complete before starting to process the next packet. However, the real underlying issue here seems to be the cc2538 driver not handling multiple packets in the RX FIFO (I think the cc2538 hardware is quite unique in its ability to do this?). To me, this seems like a non-trivial problem to solve, and I would welcome any ideas on how to handle it properly rather than introducing artificial delays.
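One possible direction for handling multiple frames in the RX FIFO, sketched with hypothetical helper names (rxfifo_byte_count(), rxfifo_read_byte(), rxfifo_flush() and deliver_frame() are stand-ins, not the actual RIOT cc2538 register macros): walk the FIFO frame by frame using the length byte the radio stores in front of each frame, and only flush when the contents look corrupt.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical low-level accessors; the real driver would go through its
 * own RFCORE register macros and the ISFLUSHRX command strobe. */
extern uint8_t rxfifo_byte_count(void);   /* bytes currently in the RX FIFO */
extern uint8_t rxfifo_read_byte(void);    /* pop one byte from the RX FIFO  */
extern void    rxfifo_flush(void);        /* discard the whole RX FIFO      */
extern void    deliver_frame(const uint8_t *buf, size_t len);

/* Sketch: drain every complete frame instead of flushing after the first
 * one. Each received frame is prefixed by its PHY length byte, so the FIFO
 * can be walked frame by frame. */
static void drain_rx_fifo(void)
{
    while (rxfifo_byte_count() > 0) {
        uint8_t len = rxfifo_read_byte();
        if (len == 0 || len > 127 || len > rxfifo_byte_count()) {
            rxfifo_flush();               /* truncated or corrupt frame */
            return;
        }
        uint8_t buf[128];
        for (uint8_t i = 0; i < len; i++) {
            buf[i] = rxfifo_read_byte();
        }
        deliver_frame(buf, len);
    }
}
```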
Another observation, which causes a kernel panic: I have a
@aeneby Why is the RX FIFO unconditionally flushed? At the end of the "_recv" function, when the packet has been copied into the buffer, there could be another packet in the FIFO.
Could this perhaps be caused by the Linux kernel on the RPi dropping the packets? RIOT actually sends out incorrectly constructed packets for intra-PAN communication, and my observation has been that Linux will drop these. If that's the cause, however, it seems strange that it would work for the
I'd be interested to know if the packets sent out from the

@LucaZulberti, you are 100% correct. But the reason it hasn't been fixed is that I don't think the solution is quite as obvious as it first seems. If we let the RX FIFO overflow before we flush it, then we potentially lose a packet anyway, right? The one which didn't fit into the buffer, because we didn't flush it earlier. So there are some corner cases to consider. Unfortunately I will not have much time personally to look at this for the next several weeks, but I would be happy to review any proposed solutions.
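For the overflow corner case, a rough illustration (the FIFO/FIFOP semantics are taken from the CC2420/CC2538 data sheet family and the accessor names are hypothetical; drain_rx_fifo() refers to the sketch above): data is only unavoidably lost once the FIFO has actually overflowed, so that is the one case where a flush is mandatory.

```c
#include <stdbool.h>

/* Hypothetical accessors; the real driver would read the radio's FSM
 * status register. drain_rx_fifo() is the sketch shown earlier. */
extern bool radio_fifop_set(void);
extern bool radio_fifo_set(void);
extern void rxfifo_flush(void);
extern void drain_rx_fifo(void);

/* On these radios an RX FIFO overflow is signalled by FIFOP being asserted
 * while FIFO is deasserted; only then has a frame already been lost and a
 * flush is unavoidable. */
static void rx_event(void)
{
    if (radio_fifop_set() && !radio_fifo_set()) {
        rxfifo_flush();
    }
    else {
        drain_rx_fifo();
    }
}
```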
@alignan, @aeneby in #5869 we completely disabled ACK interrupts, and in that context I realized that the cc2538 does not handle ACKs and retransmissions in hardware. Is that correct?
If my assumption above is correct and we do end up implementing ACK handling as well as retransmission handling in software, it might be reasonable to slow down the sender by waiting for an ACK (even if this is not optimal performance-wise).
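If ACK handling does move into software, a bare-bones sketch of what waiting for the ACK could look like (send_frame_with_ack_req() and ack_received() are hypothetical, and the 864 µs figure is roughly macAckWaitDuration for 2.4 GHz 802.15.4, not a tuned value):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include "xtimer.h"

/* Hypothetical helpers; neither exists in the current driver. */
extern void send_frame_with_ack_req(const uint8_t *frame, size_t len, uint8_t seq);
extern bool ack_received(uint8_t seq);   /* set from the RX path on a matching ACK */

#define ACK_WAIT_US   (864U)   /* ~macAckWaitDuration at 250 kbit/s */
#define MAX_RETRIES   (3U)

/* Sketch: send, wait a bounded time for the ACK, retransmit a few times,
 * give up after MAX_RETRIES. Waiting here also throttles the sender, as
 * suggested above. */
static bool send_reliable(const uint8_t *frame, size_t len, uint8_t seq)
{
    for (unsigned i = 0; i <= MAX_RETRIES; i++) {
        send_frame_with_ack_req(frame, len, seq);
        xtimer_usleep(ACK_WAIT_US);
        if (ack_received(seq)) {
            return true;
        }
    }
    return false;
}
```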
@PeterKietzmann yes that is correct (although automatic sending of acknowledgements is supported in hardware)
Hmm, but that would imply that we always need to have the ACK_REQ (acknowledgement request) flag set in the frame header of every sent packet, or risk running into the same problem again? I notice that the cc2538 driver in a certain other IoT platform (hint: starts with C, rhymes with "non-sticky") waits for transmission to complete before returning from the send function. But as far as I'm concerned this should not be necessary, which is why I did not implement it. So the question is: are we really too fast at sending, or just too slow at receiving? [edit] Clarification of the ACK_REQ bit.
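For reference, the acknowledgement request bit lives in the first octet of the IEEE 802.15.4 frame control field; a standalone illustration (the define mirrors the standard, not any particular RIOT header):

```c
#include <stdint.h>

/* IEEE 802.15.4: the Acknowledgment Request (AR) bit is bit 5 of the first
 * frame control field octet. Setting it asks the receiver to answer the
 * frame with an ACK (which the cc2538 can generate automatically). */
#define FCF_ACK_REQ   (1U << 5)

static inline void fcf_set_ack_req(uint8_t *mhr)
{
    mhr[0] |= FCF_ACK_REQ;   /* mhr points at the start of the MAC header */
}
```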
@aeneby from what I saw during my (interop) tests with other boards, i.e.

When I enabled debugging in the

So your suggestion to wait until transmission of one frame/fragment returns might be a good idea, to solve this in general. Nevertheless, I think @PeterKietzmann is right too: if the device cannot handle ACKs and retransmissions, the driver should implement those in software. Currently, with #5869 in place, we can only send unfragmented frames successfully; everything else is a mess - this needs to be fixed. On the other hand, we should also look into the receive functions of the other boards; maybe we can speed them up a bit as well?!
I don't know about other boards/MCUs unfortunately, since all I have are ones based on the cc2538. I suppose, in the interests of interoperability, we could introduce a slight delay between sending packets; at this point optimization seems somewhat premature. Are you able to confirm that something like this (untested) patch resolves the issue? As for handling ACKs and CSMA etc., wouldn't doing this in the driver be a duplication of this effort (and similar others)? Or am I misunderstanding something?
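A sketch of the workaround under discussion, i.e. blocking until the radio has finished transmitting before accepting the next frame (radio_tx_active() and radio_load_and_transmit() are hypothetical stand-ins for the driver's register accesses, not the actual patch):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessors for the radio's TX status and TX path. */
extern bool radio_tx_active(void);
extern int  radio_load_and_transmit(const uint8_t *frame, size_t len);

/* Sketch: block until the previous frame has left the radio before
 * returning, so back-to-back sends can no longer outrun the receiver. */
static int send_blocking(const uint8_t *frame, size_t len)
{
    int res = radio_load_and_transmit(frame, len);
    while (radio_tx_active()) {
        /* busy-wait; a real implementation would rather yield or sleep */
    }
    return res;
}
```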
@aeneby your untested patch has now been tested by me 😄 and it works! I can now send fragments, that is, PINGs with large payloads, successfully between

However, as @PeterKietzmann said: ACKs and retransmissions are currently missing and have to be implemented in/by the driver.
@PeterKietzmann fine with me to generalize such functionality, so we should move forward there, too. However, @aeneby, will you provide a PR with your patch or shall I do so? Further, I'd like to get those two
Unfortunately, I have very little bandwidth to do any tests/changes until next week, so any help on those is welcome, thanks!
WRT ACK/CSMA, I don't know how far off a MAC protocol capable of doing this is, but it would certainly save duplication of effort if it were possible for every driver to utilise the same code. Having said that, however, the
@aeneby and @PeterKietzmann, I tested and merged #5915 - I think we can close this one then?
I will close it and set the memo label because there is still room for improvement!
@PeterKietzmann, yes you're right, but that's the case with (m)any solutions/PRs - we might never close any issue then 😬
Pinging several times between two remote revision A nodes (remote-reva) with bigger payloads (e.g. 200, 500, 1000 bytes) leads to high losses.