Add package gluon-radv-priorityd #718

Closed
wants to merge 1 commit into from

jplitza
Member

@jplitza jplitza commented Apr 3, 2016

This package tries to prioritize the router advertisements of the gateway selected by the B.A.T.M.A.N. advanced gateway selection. It does this by inserting rules into the firewall to hand all router advertisements via the NFQUEUE mechanism to a userspace daemon, which then examines them and changes the preference field to "high" if appropriate.
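To illustrate the rewriting step described above, here is a minimal Python sketch. It is not the daemon from this PR and leaves out the NFQUEUE plumbing entirely. It sets the router preference bits of an ICMPv6 router advertisement to "high" and recomputes the checksum. The bit layout follows RFC 4191; the function names are made up for this example.

```python
import ipaddress
import struct

ND_ROUTER_ADVERT = 134  # ICMPv6 type of a router advertisement
PRF_MASK = 0x18         # the two Prf bits in the RA flags byte (RFC 4191)
PRF_HIGH = 0x08         # Prf = 01 means "high"


def icmpv6_checksum(src: str, dst: str, icmp: bytes) -> int:
    """Internet checksum over the IPv6 pseudo-header plus the ICMPv6 message."""
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", len(icmp), 58)  # length, 3 zero bytes, next header
    )
    data = pseudo + icmp
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def set_ra_preference_high(src: str, dst: str, icmp: bytes) -> bytes:
    """Return the ICMPv6 message with Prf set to "high" if it is an RA.

    Anything that is not a complete RA header is returned unchanged.
    """
    if len(icmp) < 16 or icmp[0] != ND_ROUTER_ADVERT:
        return icmp
    buf = bytearray(icmp)
    buf[5] = (buf[5] & ~PRF_MASK & 0xFF) | PRF_HIGH
    buf[2:4] = b"\x00\x00"  # the checksum field must be zero while summing
    struct.pack_into("!H", buf, 2, icmpv6_checksum(src, dst, bytes(buf)))
    return bytes(buf)
```

Whether a client then actually prefers the router is still up to the client, which is the reason for the "tries to" in the description.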

@BarbarossaTM

Hi @jplitza

Awesome idea!

It might have one problem though:

Consider the setup gw <-> n1 <-> n2 <-> n3 <-> client or

gw <-> n1 <-> n2 <-> n3 <-> n4 -> gw2
                     |
                   Client

To my understanding of BATMAN's magic, it would be possible in both examples for the client to receive multiple high-preference RAs, since every node decides on its own which gateway it considers best. So your goal would be circumvented.

How about setting the ebtables filter to only match and handle packets leaving for br-client? That way you would only fiddle around with RAs on the last node in any possible chain.

Cool stuff though!

Kind regards
Max (Barbarossa in IRC)

@jplitza
Member Author

jplitza commented Apr 3, 2016

That's not a problem, as the packet is forwarded from node to node inside batman-adv. Sadly, neither ebtables nor ip6tables can modify it until it comes out of bat0 and enters br-client - which only happens at the last node, the one to which the client is connected.

@BarbarossaTM

That is an absolutely fair point and I rest my case :-)

Thanks!

@jplitza
Member Author

jplitza commented Apr 4, 2016

Whoops, just noticed that there was one debugging change left that I forgot to remove (the interval in which the gateways file is re-read). Done that now and made it a constant.

@jplitza jplitza added the 0. type: enhancement The changeset is an enhancement label Apr 4, 2016
@neocturne neocturne added this to the 2016.2 milestone Apr 4, 2016
@jplitza jplitza force-pushed the radv-priorityd branch 2 times, most recently from 8bc76be to 800eb93 Compare April 4, 2016 14:20
@jplitza
Member Author

jplitza commented Apr 4, 2016

And another round of fixes for init script and firewall configuration. Apparently I never tested if this really works if installed from scratch, only with manual adjustments.

One caveat, especially during testing: the daemon caches the chosen gateway for 60 seconds, so it takes roughly that long before it modifies the first packet. Apparently batman-adv needs a few moments to choose a gateway, long enough for the first RA to arrive and trigger the lookup before a gateway has been selected. It took me half an hour or so to figure that out, as I thought the daemon didn't work at all.
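The effect can be reproduced with a small time-based cache. The following Python sketch only illustrates the behaviour described above and is not the daemon's code; the class name, the `lookup` callback and the injectable clock are assumptions made for this example.

```python
import time
from typing import Callable, Optional


class CachedGateway:
    """Cache the result of a gateway lookup for a fixed number of seconds."""

    def __init__(
        self,
        lookup: Callable[[], Optional[str]],
        ttl: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup    # hypothetical: returns the selected gateway or None
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[str] = None
        self._expires: Optional[float] = None

    def get(self) -> Optional[str]:
        now = self._clock()
        if self._expires is None or now >= self._expires:
            # A "no gateway yet" answer (None) is cached like any other result.
            self._value = self._lookup()
            self._expires = now + self._ttl
        return self._value
```

If the first lookup happens before batman-adv has selected a gateway, the empty answer stays cached until the interval expires. That matches the delay of roughly one minute before the first RA is modified.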

Also, I'm most nervous about the fact that, due to net.bridge.bridge-nf-call-ip6tables=1, every bridged IPv6 packet now runs through the whole firewall. I tried to avoid too many problems by excluding such traffic from the FORWARD chain in the filter table, but maybe problems are lurking elsewhere. If somebody with a better understanding of the interactions between bridges and iptables could verify that I didn't break anything (or open a security hole), that would be great.

@neocturne
Member

Regarding the possible interactions between the firewall and bridges I have similar concerns. It might make sense to move this code into the batman-adv kernel module; this would probably not only improve robustness and performance, but also decrease code complexity.

@jplitza
Member Author

jplitza commented Apr 7, 2016

I have never written kernel code before and am unsure whether I have the time to try.

By the way, the firewall setup causes a huge problem when removing the package: the sysctl is still set to 1, but the firewall file is no longer present, meaning that all IPv6 traffic is blocked. Setting the sysctl explicitly to 0 in gluon-core's upgrade script or something like that would work around that issue.

gluon-radv-priorityd
====================

This package tries to prioritize the router advertisements of the gateway

Can we drop that "tries to" or is it really that unreliable?

Member Author

Well, it is only prioritized if the client honors the preference field.

@ohrensessel
Contributor

Independent of whether this method will be implemented within the batman-adv kernel module or in a userspace daemon: Wouldn't it be more appropriate to completely drop the RAs of the gateways that are not selected by batman-adv? This would circumvent the methodology relying upon the clients, but obviously would introduce a problem when the selected gateway does not send out RAs.

In my opinion this would be more similar to the approach batman-adv uses for DHCP.

@jplitza
Member Author

jplitza commented May 15, 2016

I see two disadvantages with dropping RAs:

  • It is more intrusive, depriving the user of his choice of gateway. (Disobeying the preference value might be the user's choice!)
  • Addresses announced by non-preferred gateways cannot be used for incoming traffic either. You might have a static "official Freifunk XY prefix", that users can safely use as their public addresses, as well as some private, maybe even dynamic ones that are only announced by certain gateways. Although, we already have an "official" prefix, that is entered into the site.conf and announced by the node itself, so this argument is probably void.

But it also has one big advantage, almost necessity:

  • If everybody shared their native IPv6 connection (because Störerhaftung is finally gone… maybe), we would have hundreds of prefixes flying around the network. No client would handle that correctly.

@jplitza
Member Author

jplitza commented May 21, 2016

Actually, if implemented in userspace (still what I would prefer), dropping can be realized much more easily: simply add an ebtables rule that drops RA packets from all but a given MAC address, and update that rule with a userspace daemon. That daemon would listen for RAs from the network (the ebtables rule would only block forwarding) and could determine for each incoming RA whether it is the new best source.
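As a rough illustration of such a rule, the following Python sketch builds the argument list for one ebtables call without executing it. The option spelling is taken from the legacy ebtables man page as an assumption and should be checked against the installed version; the function name and the choice of the FORWARD chain are likewise only assumptions for this example.

```python
import re
from typing import List

_MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def ra_drop_rule(allowed_mac: str, chain: str = "FORWARD") -> List[str]:
    """Build an ebtables command that drops RAs from every source MAC but one.

    The command is only constructed here, never run.
    """
    mac = allowed_mac.lower()
    if not _MAC_RE.match(mac):
        raise ValueError("not a MAC address: %r" % allowed_mac)
    return [
        "ebtables", "-A", chain,
        "-p", "IPv6",
        "--ip6-protocol", "ipv6-icmp",
        "--ip6-icmp-type", "router-advertisement",
        "-s", "!", mac,  # assumed negation syntax: match every source except this one
        "-j", "DROP",
    ]
```

A daemon following this idea would replace the rule whenever it decides that a different RA source is the best one.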

@RalfJung
Contributor

I implemented something like this "dropping in userspace" as a very crude shell script that runs as a cron job: https://github.com/freifunk-saar/gluon-ffsaar/tree/master/gluon-filter-ra. It's not perfect and it's not flexible, but it seems to do the job here (just a few nodes with the experimental firmware have it so far). I also had the problem of initial RAs coming through before a selection was made, so I added a default ebtables rule that blocks all RAs and that the script removes as soon as a decision has been made.

How common is it for clients to (not) obey the preference of the RAs (I'm thinking about the usual OSes here)?

I can see your point of a static prefix that can be used by services that want a static address, actually I am struggling right now to find a good solution for exactly this case, since my hacky script drops all "other" announcements. But I think what I'd like to try is to just have all gateways announce that static prefix, but at a low priority; that would then ensure that all clients get both the static, low-priority prefix and the dynamic (as in, GW-dependent) high-priority prefix, and hopefully use the high-priority prefix for outgoing traffic. In such a configuration, dropping all foreign RAs would be the desired behavior.

@neocturne
Member

Having a userspace daemon update ebtables rules to drop the RAs would be a very nice solution.

I don't think there are many RA clients around that actually respect the preference, so dropping would be preferable IMO.

@RalfJung
Contributor

Having a userspace daemon update ebtables rules to drop the RAs would be a very nice solution.

We're now using the hack mentioned in my previous post in our Freifunk community, and it seems to be very effective -- the traffic between gateways dropped from "more than 20% of the overall traffic" to "almost nothing".

What's currently blocking this hack from being more generally usable is that it somehow has to translate the MAC address of the mesh connections (i.e., the one shown by batctl gwl) to the MAC address within the network (i.e., the sender MAC of the RAs). Is there a nice general way to do that?

@jplitza
Member Author

jplitza commented Jul 27, 2016

This feature must not rely on the selected gateway providing IPv6 uplink. Imagine a setup where the selected gateway is only an IPv4 gateway and doesn't send RAs, then you would drop all RAs and not have IPv6 connectivity at all. (My proposal didn't drop anything, so it wasn't that much of a problem, though not the cleanest solution anyway)

Instead, the daemon should listen for RAs, rank them and add ebtables drop rules for all but the closest (in terms of network topology) gateway. I had once started to integrate this into gluon-radvd. The problem was the dependence on batman. Then, during an ffnordcon, @NeoRaider and I agreed that it would be best to have a mesh-protocol-agnostic way to sort different hosts (IP addresses) by proximity in the mesh. (Although that was during the time when reading the batman-adv originators file caused massive memory problems)
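The ranking step could look roughly like the following Python sketch. It assumes that some proximity metric is available per RA sender (lower meaning closer); where that metric comes from is exactly the open question above, so the `seen` mapping and the function name are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def select_ra_source(seen: Dict[str, int]) -> Tuple[Optional[str], List[str]]:
    """Pick the closest RA sender and list the senders to be blocked.

    `seen` maps the source MAC of each router that actually sent an RA to a
    hypothetical distance metric (lower is closer). Ties go to the lowest MAC.
    """
    if not seen:
        # No RA observed yet: nothing is selected and nothing is blocked.
        return None, []
    best = min(sorted(seen), key=lambda mac: seen[mac])
    blocked = sorted(mac for mac in seen if mac != best)
    return best, blocked
```

Because only routers that were seen sending RAs are candidates, a selected gateway that is IPv4-only and sends no RAs never leads to all RAs being dropped.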

I still have this on my todo list and hope to have time for it in September.

@RalfJung
Contributor

All right!

Yeah, it's true we are relying on all GWs sending RAs. However, they don't have to have an uplink themselves; they could be routing traffic through other GWs.

@neocturne neocturne modified the milestones: next, 2016.2 Jul 27, 2016
@jplitza
Member Author

jplitza commented Jul 29, 2016

Implemented what was discussed here, closing this PR (as that new program has barely anything in common with this one).

@jplitza jplitza closed this Jul 29, 2016
@jplitza jplitza deleted the radv-priorityd branch August 9, 2016 15:38
@neocturne neocturne modified the milestone: next Aug 27, 2016
@jjsarton

My point of view is that the approach of radv-priorityd is correct but has a small problem. In real life there will be a lot of advertisements: each client joining the Freifunk network will first send a router solicitation. If the routers send the RA themselves and don't pass the solicitation on to the gateways, RA traffic will decrease dramatically. Furthermore, the routers will be less burdened by incoming RAs and by processing them in the filter.
I have checked how different clients handle RAs with different priorities; this works well.
One problem with FF using multiple gateways is that we will have IPv6 traffic between the gateways attached to the FFRL AS if we use a single IPv6 prefix. Having one prefix per gateway will solve this problem. On the other hand, this means that the clients will have more addresses, but this is not a problem.

@jplitza
Member Author

jplitza commented Jan 11, 2017

@jjsarton As @NeoRaider wrote earlier, our experiments showed that far from all clients respect the field. And if I understand you correctly, you are proposing that the node intercepts the RS and only answers with its own RA to reduce spam. This was an idea to implement in gluon-radvd (now uradvd), but was dismissed, basically in favor of advancing the implementation of a Layer 3 mesh (using Babel).

Labels
0. type: enhancement The changeset is an enhancement
7 participants