Unclear how Router Solicitations are (or should be) handled #15926

Closed
tinstructor opened this issue Feb 4, 2021 · 17 comments
Labels
Area: network Area: Networking Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: question The issue poses a question regarding usage of RIOT

Comments

@tinstructor

tinstructor commented Feb 4, 2021

@miri64 I was scrolling through RFC 6775 alongside the RIOT codebase again when I started wondering how a router handles an incoming RS whose IP source address is already cached (i.e., there's a pre-existing NCE) but for which the NCE stores a link-layer address that differs from the one supplied in the RS's SLLAO. According to RFC 6775, in that case the router must not update the pre-existing NCE (whereas RFC 4861 would). However, after looking at the RIOT codebase, it seems to me that the router would then "respond" with an RA using the pre-existing NCE, and hence send the RA to the wrong link-layer destination. That said, the code flow took me through so many different files that I could easily have overlooked something. If so, I'll immediately close this issue.

The most important files and lines seem to be:

#if IS_ACTIVE(CONFIG_GNRC_IPV6_NIB_ARSM)
    if ((nce != NULL) && (nce->mode & _NC) &&
        ((nce->l2addr_len != l2addr_len) ||
         (memcmp(nce->l2addr, sl2ao + 1, nce->l2addr_len) != 0)) &&
        /* a 6LR MUST NOT modify an existing NCE based on an SL2AO in an RS
         * see https://tools.ietf.org/html/rfc6775#section-6.3 */
        !_rtr_sol_on_6lr(netif, icmpv6)) {
        DEBUG("nib: L2 address differs. Setting STALE\n");
        evtimer_del(&_nib_evtimer, &nce->nud_timeout.event);
        _set_nud_state(netif, nce, GNRC_IPV6_NIB_NC_INFO_NUD_STATE_STALE);
    }

_send_unicast(pkt, prep_hdr, netif, ipv6_hdr, netif_hdr_flags);

if (gnrc_ipv6_nib_get_next_hop_l2addr(&ipv6_hdr->dst, netif, pkt,

_nib_onl_entry_t *node = _nib_onl_get(dst,
                                      (netif == NULL) ? 0 : netif->pid);
/* consider neighbor cache entries first */
unsigned iface = (node == NULL) ? 0 : _nib_onl_get_if(node);
if ((node != NULL) || _on_link(dst, &iface)) {
    DEBUG("nib: %s is on-link or in NC, start address resolution\n",
          ipv6_addr_to_str(addr_str, dst, sizeof(addr_str)));
    /* on-link prefixes return their interface */
    if (!ipv6_addr_is_link_local(dst) && (iface != 0)) {
        /* release preassumed interface */
        gnrc_netif_release(netif);
        netif = gnrc_netif_get_by_pid(iface);
        gnrc_netif_acquire(netif);
    }
    if ((netif == NULL) ||
        !_resolve_addr(dst, netif, pkt, nce, node)) {

if ((node->mode != _EMPTY) &&
    /* either requested or current interface undefined or
     * interfaces equal */
    ((_nib_onl_get_if(node) == 0) || (iface == 0) ||
     (_nib_onl_get_if(node) == iface)) &&
    ipv6_addr_equal(&node->ipv6, addr)) {
    DEBUG(" Found %p\n", (void *)node);
    return node;

if ((entry != NULL) && (entry->mode & _NC) && _is_reachable(entry)) {
    if (_get_nud_state(entry) == GNRC_IPV6_NIB_NC_INFO_NUD_STATE_STALE) {
        _set_nud_state(netif, entry, GNRC_IPV6_NIB_NC_INFO_NUD_STATE_DELAY);
        _evtimer_add(entry, GNRC_IPV6_NIB_DELAY_TIMEOUT,
                     &entry->nud_timeout, NDP_DELAY_FIRST_PROBE_MS);
    }
    DEBUG("nib: resolve address %s%%%u from neighbor cache\n",
          ipv6_addr_to_str(addr_str, &entry->ipv6, sizeof(addr_str)),
          _nib_onl_get_if(entry));
    _nib_nc_get(entry, nce);
    res = true;
}

So am I missing something, or should the RA instead be sent to the RS's source using the link-layer address supplied in its SLLAO, rather than by looking for a matching NCE in the neighbor cache?
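
To make the two behaviors concrete, here is a minimal C sketch of the choice being asked about. It is not RIOT code: `nce_sketch_t`, `ra_dst_t`, `ra_dst_via_cache()` and `ra_dst_prefer_sllao()` are hypothetical names invented for illustration. The sketch only models which link-layer address an RA reply would use, and it never modifies the cached entry, in line with the RFC 6775 rule quoted in the code comment above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define L2ADDR_MAX_LEN 8

/* Hypothetical, simplified neighbor cache entry (NOT RIOT's _nib_onl_entry_t).
 * l2addr_len is assumed to be <= L2ADDR_MAX_LEN. */
typedef struct {
    uint8_t l2addr[L2ADDR_MAX_LEN];
    size_t l2addr_len;
} nce_sketch_t;

/* Where the link-layer destination of the RA reply is taken from */
typedef enum {
    RA_DST_NONE,       /* neither an NCE nor an SLLAO is available */
    RA_DST_FROM_NCE,   /* use the cached link-layer address */
    RA_DST_FROM_SLLAO  /* use the address carried in the RS */
} ra_dst_t;

/* True if the SLLAO address differs from the cached one */
static bool l2addr_differs(const nce_sketch_t *nce,
                           const uint8_t *sllao, size_t sllao_len)
{
    return (nce->l2addr_len != sllao_len) ||
           (memcmp(nce->l2addr, sllao, sllao_len) != 0);
}

/* Behavior suspected in this issue: the reply is always resolved through
 * the neighbor cache, so a mismatching SLLAO is ignored. */
ra_dst_t ra_dst_via_cache(const nce_sketch_t *nce)
{
    return (nce != NULL) ? RA_DST_FROM_NCE : RA_DST_NONE;
}

/* Alternative asked about: if the RS carries an SLLAO that is unknown or
 * differs from the cache, send this one reply to the SLLAO address.
 * The NCE itself is left untouched. */
ra_dst_t ra_dst_prefer_sllao(const nce_sketch_t *nce,
                             const uint8_t *sllao, size_t sllao_len)
{
    if ((sllao != NULL) && (sllao_len > 0) &&
        ((nce == NULL) || l2addr_differs(nce, sllao, sllao_len))) {
        return RA_DST_FROM_SLLAO;
    }
    return (nce != NULL) ? RA_DST_FROM_NCE : RA_DST_NONE;
}
```

With a cached address of `00:01` and an RS carrying `00:02`, the first function picks the cached address while the second picks the SLLAO; whether the second is actually desirable is exactly the open question here.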

@miri64 miri64 self-assigned this Feb 4, 2021
@miri64 miri64 added Area: network Area: Networking Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: question The issue poses a question regarding usage of RIOT labels Feb 4, 2021
@miri64
Member

miri64 commented Feb 4, 2021

Mmhhh did you check, if this is actually what is happening? One could "easily" add a test case to tests/gnrc_ipv6_nib_6lo for the described scenario, to check if the result is as you described.

@miri64
Member

miri64 commented Feb 4, 2021

If it is the case, IMHO the most straightforward solution would be not to reply to the RS. However, that would mean that the downstream node would "lose" the router at some point, potentially dropping out of the network (until the address registration timed out). Maybe this is desired, as the same IP from a different link-layer address could also hint at a duplicate address, i.e. we need to wait for address registration to confirm that the address changed. This could be done by the offending node by sending out an NS with the new LL address in its SLLAO, IIRC.

@miri64
Member

miri64 commented Feb 4, 2021

Another question this issue raises for me: According to RFC 4861 an RS MUST carry a link-local address as source address. However, for 6Lo-ND, link-local addresses do not fall under the "jurisdiction" of the address registration mechanism (they are just assumed to be unique and to be based on the EUI-64 of the device). So any NC entries on link-local addresses should be garbage collectible on the router. So why does the router handle the RS's SLLAO under 6Lo-ND conditions anyway?

@tinstructor
Author

tinstructor commented Feb 4, 2021

Another question this issue raises for me: According to RFC 4861 an RS MUST carry a link-local address as source address.

I think you confused this requirement with the RA's source address. More specifically, for an RS, RFC 4861 states:

IP Fields:
Source Address:
An IP address assigned to the sending interface, or
the unspecified address if no address is assigned
to the sending interface.
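
In code terms, the quoted rule could be checked roughly as follows. This is only a sketch under assumptions: `ipv6_addr_sketch_t` and the three helper functions are hypothetical names, not RIOT's API. It also encodes the related RFC 4861 validation rule that an RS sent from the unspecified address must not carry an SLLAO.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit address container (NOT RIOT's ipv6_addr_t) */
typedef struct {
    uint8_t u8[16];
} ipv6_addr_sketch_t;

/* True for the unspecified address "::" */
bool addr_is_unspecified(const ipv6_addr_sketch_t *a)
{
    static const uint8_t zero[16] = { 0 };
    return memcmp(a->u8, zero, sizeof(zero)) == 0;
}

/* True for addresses in fe80::/10 */
bool addr_is_link_local(const ipv6_addr_sketch_t *a)
{
    return (a->u8[0] == 0xfe) && ((a->u8[1] & 0xc0) == 0x80);
}

/* An RS may come from any address assigned to the sending interface
 * (link-local is NOT required) or from the unspecified address.
 * With the unspecified address, no SLLAO may be present. */
bool rs_src_is_valid(const ipv6_addr_sketch_t *src, bool has_sllao)
{
    if (addr_is_unspecified(src)) {
        return !has_sllao;
    }
    return true;
}
```

So an RS from a global address such as `2001:db8::1` passes this check even though it is not link-local, which is the point being made above.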

@tinstructor
Author

Mmhhh did you check, if this is actually what is happening? One could "easily" add a test case to tests/gnrc_ipv6_nib_6lo for the described scenario, to check if the result is as you described.

I have not. I'm embarrassed to even ask, but could you do this? I'm actually new to RIOT (coming from a different OS) and I still have much to learn. Hence, if I try to "easily" do this, it would still be WIP in a month since time is a bit short right now.

@miri64
Member

miri64 commented Feb 4, 2021

Another question this issue raises for me: According to RFC 4861 an RS MUST carry a link-local address as source address.

I think you confused this requirement with the RA's source address.

Arghs, yes. Sorry for the noise. From memory I confused it with the RA's source address and then skimmed too fast over the RFC and confused "Source link-layer address option" with "source link-local address" 😅

@miri64
Member

miri64 commented Feb 4, 2021

I have not. I'm embarrassed to even ask, but could you do this? I'm actually new to RIOT (coming from a different OS) and I still have much to learn. Hence, if I try to "easily" do this, it would still be WIP in a month since time is a bit short right now.

Same here [that's why I was asking in the first place instead of just doing it], but I'll try to squeeze in some time for this.

@tinstructor
Author

tinstructor commented Feb 4, 2021

If it is the case, IMHO the most straightforward solution would be not to reply to the RS. However, that would mean that the downstream node would "lose" the router at some point, potentially dropping out of the network (until the address registration timed out). Maybe this is desired, as the same IP from a different link-layer address could also hint at a duplicate address, i.e. we need to wait for address registration to confirm that the address changed. This could be done by the offending node by sending out an NS with the new LL address in its SLLAO, IIRC.

But let's say all link-layer addresses are short 16-bit addresses and even link-local unicast addresses (formed based on the link-layer address) could therefore be duplicates (still very unlikely). Link-local addresses are typically not registered with an NS, but nowhere in RFC 6775 is this implicitly or explicitly forbidden. So if someone on the link did indeed register a link-local address with a router, and another node forms the same link-local address on startup and subsequently tries to solicit an RA using its newly formed (and unregistered) link-local address as the RS's source IP address, then what? Even if the other node didn't register its link-local address, the router may still hold a Tentative NCE (albeit for a short time). I've not even begun to contemplate how this issue affects global addresses though.
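
To illustrate why a shared short address directly yields a shared link-local address, here is a small sketch. It assumes the interface identifier form `0000:00ff:fe00:XXXX` that RFC 6282 uses for 16-bit short addresses; `link_local_from_short()` is a hypothetical helper written for this example, not a RIOT function.

```c
#include <stdint.h>
#include <string.h>

/* Build fe80::ff:fe00:XXXX from a 16-bit short address.
 * Assumption: IID = 0000:00ff:fe00:XXXX (the form described in RFC 6282). */
void link_local_from_short(uint16_t short_addr, uint8_t out[16])
{
    memset(out, 0, 16);
    out[0] = 0xfe;                           /* fe80::/64 prefix */
    out[1] = 0x80;
    out[11] = 0xff;                          /* IID bytes 8..15:  */
    out[12] = 0xfe;                          /* 00 00 00 ff fe 00 XX XX */
    out[14] = (uint8_t)(short_addr >> 8);
    out[15] = (uint8_t)(short_addr & 0xff);
}
```

Two nodes that both use short address `0x1234` end up with the identical address `fe80::ff:fe00:1234`, so a router that already holds an NCE for one of them cannot tell them apart by IP source address alone.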

@miri64
Member

miri64 commented Feb 5, 2021

But let's say all link-layer addresses are short 16-bit addresses and even link-local unicast addresses (formed based on the link-layer address) could therefore be duplicates (still very unlikely). Link-local addresses are typically not registered with an NS, but nowhere in RFC 6775 is this implicitly or explicitly forbidden.

First of all, see also #11033 on general issues with short addresses in 6LoWPAN with RIOT, btw. But mhhh... indeed

o In order to preserve the uniqueness of addresses (see Section 5.4
of [RFC4862]) not derived from an EUI-64, they must be either
assigned or checked for duplicates in the same way throughout the
LoWPAN. This can be done using DHCPv6 for assignment and/or using
the Duplicate Address Detection mechanism specified in Section 8.2
(or any other protocols developed for that purpose).

So we are back at square one. Without a known router, the mechanism from 8.2 (which refers to the address registration mechanism) can't be done, but the router has to assume first come, first served (from the interpretation of RFC 6775). Maybe a good compromise here would be that the node should assume, if it does not find a router within X RS resends, that its address is a duplicate, and should change its link-layer address (or renegotiate it with the PAN controller, if one exists) and try again.
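
A minimal sketch of that compromise on the host side could look like this. Everything here is an assumption for illustration: `RS_MAX_RESENDS` stands in for the "X" above, and `rs_state_t`, `rs_action_t` and `rs_on_timeout()` are hypothetical names, not part of RIOT.

```c
#include <stdbool.h>

/* Hypothetical value for the "X" RS resends mentioned above */
#define RS_MAX_RESENDS 3

typedef enum {
    RS_ACTION_DONE,          /* a router was found, stop soliciting */
    RS_ACTION_RESEND,        /* send another RS */
    RS_ACTION_CHANGE_L2ADDR  /* assume a duplicate: pick a new L2 address */
} rs_action_t;

typedef struct {
    unsigned resends;   /* RS resends since the last address change */
    bool router_found;  /* set once a usable RA has been received */
} rs_state_t;

/* Called whenever the RS retransmission timer fires */
rs_action_t rs_on_timeout(rs_state_t *st)
{
    if (st->router_found) {
        return RS_ACTION_DONE;
    }
    if (st->resends < RS_MAX_RESENDS) {
        st->resends++;
        return RS_ACTION_RESEND;
    }
    /* No router after X resends: treat the address as a duplicate and
     * start over with a fresh link-layer address. */
    st->resends = 0;
    return RS_ACTION_CHANGE_L2ADDR;
}
```

The caller would react to `RS_ACTION_CHANGE_L2ADDR` by choosing (or renegotiating) a new link-layer address and soliciting again; how that address is obtained is outside the scope of this sketch.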

@miri64
Member

miri64 commented Feb 5, 2021

How or whether the router replies to such multicast RSs is then not that important anymore (but I guess it should not reply, to prevent spamming the network).

@tinstructor
Author

tinstructor commented Feb 8, 2021

So we are back at square one. Without a known router, the mechanism from 8.2 (which refers to the address registration mechanism) can't be done, but the router has to assume first come, first served (from the interpretation of RFC 6775). Maybe a good compromise here would be that the node should assume, if it does not find a router within X RS resends, that its address is a duplicate, and should change its link-layer address (or renegotiate it with the PAN controller, if one exists) and try again.

Either that or you go the route of 6TiSCH and derive a router's link-local address from the L2 beacon to register a node's link-local address (formed @ boot) before the node multicasts RSs using its link-local address as the IP source of the RS. Multicasting RSs is still useful in that case IMHO to have more entries in the default router list.

@miri64
Member

miri64 commented Feb 8, 2021

Multicasting RSs is still useful in that case IMHO to have more entries in the default router list.

For that the array of default routers must be increased in size though ;-)

@tinstructor
Author

And it would also result in multi-hop DAD traffic overhead, even though uniqueness of link-local addresses would only be required within a single hop anyway. This is mentioned to some extent in RFC 8505. However, I think it's weird they mention it there (as a downside of RFC 6775, that is) since there is literally zero reference to explicitly registering link-local addresses in RFC 6775.

@tinstructor
Author

tinstructor commented Feb 8, 2021

For that the array of default routers must be increased in size though ;-)

I've not looked into this yet but I guess RIOT keeps only 1 default router (initially based on the first RA received) and (later on) changes its default router based on the outcome of RPL preferred parent selection? If so, how does it keep track of the candidate parent set? Besides, I figure the candidate neighbor set is simply the neighbor cache (or at least, all entries for routable addresses), except the entries that are in an UNREACHABLE NUD state?

@miri64
Member

miri64 commented Feb 8, 2021

I've not looked into this yet but I guess RIOT keeps only 1 default router (initially based on the first RA received) and (later on) changes its default router based on the outcome of RPL preferred parent selection?

Yepp, basically the new "primary default router" is the only one. However, using CONFIG_GNRC_IPV6_NIB_DEFAULT_ROUTER_NUMOF one can easily have more default routers. The primary one will either be the one determined by RPL or by the algorithm described in RFC 4861 (if no routing protocol is present).
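
For the case without a routing protocol, the RFC 4861 rule (prefer routers that are reachable or probably reachable, otherwise go round-robin) can be sketched as below. This is an illustration only, not RIOT's implementation: `DEFAULT_ROUTER_NUMOF` is a local stand-in for `CONFIG_GNRC_IPV6_NIB_DEFAULT_ROUTER_NUMOF`, and `dr_sketch_t` and `dr_select()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for CONFIG_GNRC_IPV6_NIB_DEFAULT_ROUTER_NUMOF */
#define DEFAULT_ROUTER_NUMOF 3

/* Hypothetical default router list entry */
typedef struct {
    bool used;       /* slot holds a router */
    bool reachable;  /* reachable or probably reachable */
} dr_sketch_t;

/* Returns the index of the selected default router, or -1 if the list is
 * empty. *rr_next keeps the round-robin position between calls. */
int dr_select(const dr_sketch_t list[DEFAULT_ROUTER_NUMOF], unsigned *rr_next)
{
    /* 1. Prefer any router that is (probably) reachable */
    for (size_t i = 0; i < DEFAULT_ROUTER_NUMOF; i++) {
        if (list[i].used && list[i].reachable) {
            return (int)i;
        }
    }
    /* 2. Otherwise rotate through the remaining routers */
    for (unsigned n = 0; n < DEFAULT_ROUTER_NUMOF; n++) {
        unsigned i = (*rr_next + n) % DEFAULT_ROUTER_NUMOF;
        if (list[i].used) {
            *rr_next = (i + 1) % DEFAULT_ROUTER_NUMOF;
            return (int)i;
        }
    }
    return -1;
}
```

With only one slot, both loops degenerate to "use the single router if present", which matches the single primary default router described above.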

@miri64
Member

miri64 commented Feb 8, 2021

If so, how does it keep track of the candidate parent set?

I think for that you need to increase the maximum number of default routers.

Besides, I figure the candidate neighbor set is simply the neighbor cache (or at least, all entries for routable addresses), except the entries that are in an UNREACHABLE NUD state?

I'm not that deep into the details of the RPL implementation, but I believe this could also refer to what @benpicco is providing in #14448.

@MrKevinWeiss MrKevinWeiss added this to the Release 2021.07 milestone Jun 22, 2021
@MrKevinWeiss MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
@maribu
Member

maribu commented Jan 5, 2023

This has been stale for ages, I'm closing this now. If there still is more to it, please reopen :)

Btw: in the meantime (since this issue was opened) we got a forum. I haven't read the whole discussion, but this seems to have been opened to ask a question rather than to report an issue. Now that we have the forum, I'd say that would be the more appropriate place to continue the discussion (unless I misjudge the nature of this issue by only briefly looking into it).

@maribu maribu closed this as completed Jan 5, 2023