Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: bt: fix 6lowpan over BLE regression #31075

Merged
merged 5 commits into from
Jan 14, 2021

Conversation

fabiobaltieri
Copy link
Member

@fabiobaltieri fabiobaltieri commented Jan 2, 2021

6lowpan over BLE stopped working since multicast filtering was implemented. Seems like Address Registration Option implementation is required for this to work without solicit node multicast, but as that is not currently implemented, nodes are unresponsive.

Copy link
Member

@jukkar jukkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Could you add a test case for this change so that we will not hit this issue again in the future if the code is changed. So a simple test in tests/net/ipv6/src/main.c, that would check that we do not drop solicited node mcast packet, would be enough.

@fabiobaltieri fabiobaltieri marked this pull request as draft January 4, 2021 11:54
@fabiobaltieri
Copy link
Member Author

fabiobaltieri commented Jan 4, 2021

Actually after further investigation I realized that the current behavior has been set explicitly in

66244a0 net: if: No need to always join solicit node mcast group

the multicast stack handles solicited node address correctly when initialized to do so with join_mcast_solicit_node().

Dropping NET_L2_MULTICAST_SKIP_JOIN_SOLICIT_NODE from net_bt_flags() seems to make the ipsp example work again, so that would be an option, but that should not be needed according to https://tools.ietf.org/html/rfc7668#section-3.2.2 (mentioned in the patch above). What I don't understand now is why the Linux host is using solicited node for neighbor discovery and if there is a way of setting up the link to make this work correctly.

Marked the PR as draft while investigating further.

@fabiobaltieri
Copy link
Member Author

fabiobaltieri commented Jan 4, 2021

I think this is what's happening: according to rfc7668, 6lowpan over BLE does not need solicited node frames, and neighbor discovery happens with NS messages with Address Registration Option (described in rfc6775), which supposedly keeps the neighbor cache hot on the border router:

Registration:
The process during which a LoWPAN node sends a Neighbor
Solicitation message with an Address Registration Option to a
router creating a Neighbor Cache Entry (NCE) for the LoWPAN node
with a specific timeout. Thus, for 6LoWPAN routers, the Neighbor
Cache doesn't behave like a cache. Instead, it behaves as a
registry of all the host addresses that are attached to the
router.

Now the problem is that there's no implementation for this in the zephyr ipv6 stack as far as I can tell, and it seems like ARO is not supported in Linux either (that would have to be defined here as option 33).

So since the node does not send NS ARO messages, Linux tries to do normal host initiated neighbor discovery, which fails because the node stack is discarding those frames.

rfc7668 also states that link local addresses do not have to be registered

A Bluetooth LE 6LN MUST NOT register its link-local address.

so I don't quite get the full story of how link-local address are supposed to be resolved.

Either way, this does not work with the current code base without address registration plus solicited node filtering, and I think that even if we had address registration, it would not work without explicit handling in Linux.

@jukkar: would it make sense to remove NET_L2_MULTICAST_SKIP_JOIN_SOLICIT_NODE from the 6lowpan bluetooth stack as a workaround until rfc6775 is fully implemented? I'll update the PR with that in the meantime, just in case someone else wants to test it.

@fabiobaltieri fabiobaltieri changed the title net: ipv6: do not drop solicited node addressed frames net: bt: fix 6lowpan over BLE regression Jan 4, 2021
@carlescufi carlescufi requested a review from Vudentz January 5, 2021 17:44
@fabiobaltieri fabiobaltieri marked this pull request as ready for review January 5, 2021 20:00
@jukkar
Copy link
Member

jukkar commented Jan 8, 2021

would it make sense to remove NET_L2_MULTICAST_SKIP_JOIN_SOLICIT_NODE from the 6lowpan bluetooth stack as a workaround until rfc6775 is fully implemented? I'll update the PR with that in the meantime, just in case someone else wants to test it.

Sure, that seems to be the only usable workaround for this issue atm.

@jukkar
Copy link
Member

jukkar commented Jan 11, 2021

In your sample commit message, the example list of multicast addresses are

IPv6 multicast addresses (max 8):
        ff02::1
        ff02::1:ff01:c41e
        ff02::1:ff00:1
        ff02::1

Here ff02::1 is mentioned twice. Is this a typo or was it really like this? The address should be there once and it is a bug if mentioned multiple times.

@fabiobaltieri
Copy link
Member Author

Not a typo, it does show up twice, seems like init_app() of samples/bluetooth/ipsp/src/main.c is calling net_if_ipv6_maddr_add() on that address directly, which bypasses the duplicate check (that's in net_ipv6_mld_join).

Removing it or changing it to net_ipv6_mld_join makes the duplicate go away. Don't see why that address should be registered by the application anyway, I'll add a third patch to drop if entirely from the sample you agree with that.

As for the API, should the duplicate detection be moved from net_ipv6_mld_join to net_if_ipv6_maddr_add?

@jukkar
Copy link
Member

jukkar commented Jan 11, 2021

Removing it or changing it to net_ipv6_mld_join makes the duplicate go away. Don't see why that address should be registered by the application anyway, I'll add a third patch to drop if entirely from the sample you agree with that.

Yes, please add a 3rd patch for that.

As for the API, should the duplicate detection be moved from net_ipv6_mld_join to net_if_ipv6_maddr_add?

Sounds like an error if the same address is there multiple times. So please move the check to when the address is added.

@github-actions github-actions bot added the area: Tests Issues related to a particular existing or missing test label Jan 12, 2021
@fabiobaltieri fabiobaltieri requested a review from nashif as a code owner January 12, 2021 21:35
@fabiobaltieri
Copy link
Member Author

Alright, I've updated the PR to add a patch to check for duplicate addresses to net_if_ipv6_maddr_add(), removed the call from the examples and while I was on it I updated the sample documentation to mention the debug config.

Tested the whole thing again, seems to be working fine, the address list looks like this now:

IPv6 multicast addresses (max 4):
ff02::1
ff02::1:ff01:c41e
ff02::1:ff00:1

Copy link
Member

@jukkar jukkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Fabio! Minor nit, looks good otherwise.

samples/bluetooth/ipsp/README.rst Show resolved Hide resolved
6lowpan over BLE should work without solicit node multicast messages
according to RFC7668[1], but that requires Neighbor Solicitation with
Address Registration Option, which is currently not implemented in
either Zephyr or Linux. This is causing the router to fallback to normal
neighbor solicitation based discovery, but the NS frames are being
discarded in the host stack because the solicit node multicast groups
are not registered.

This drops the NET_L2_MULTICAST_SKIP_JOIN_SOLICIT_NODE as a workaround
and adds a TODO about it.

[1] https://tools.ietf.org/html/rfc7668#section-3.2.3

Signed-off-by: Fabio Baltieri <[email protected]>
This sample is trying to register four multicast addesses:

uart:~$ net iface
...
IPv6 multicast addresses (max 8):
        ff02::1
        ff02::1:ff01:c41e
        ff02::1:ff00:1
        ff02::1

Increasing CONFIG_NET_IF_MCAST_IPV6_ADDR_COUNT to 4 to make neighbor
discovery work correctly.

Signed-off-by: Fabio Baltieri <[email protected]>
Add a check to stop a multicast address to be registered multiple times.
This can happen if the application is using net_if_ipv6_maddr_add()
directly.

Tested on the existing bluetooth/ipsp sample:

<wrn> net_if: Multicast address ff02::1 is is already registered.

Signed-off-by: Fabio Baltieri <[email protected]>
The IPSP example code is explicitly registering the all local-link nodes
address (ff02::1). This is currently already registered by the normal
IPv6 stack at interface initialization, so doing it in the application
is redundant.

Signed-off-by: Fabio Baltieri <[email protected]>
Add an example referring to the (already existing) prj_dbg.conf config.

Signed-off-by: Fabio Baltieri <[email protected]>
@fabiobaltieri
Copy link
Member Author

Updated with a couple of checkpatch fixes.

Copy link
Member

@jukkar jukkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jhedberg jhedberg merged commit fe746f9 into zephyrproject-rtos:master Jan 14, 2021
@fabiobaltieri fabiobaltieri deleted the ipsp-fix branch January 14, 2021 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Bluetooth area: Documentation area: Networking area: Samples Samples area: Tests Issues related to a particular existing or missing test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants