Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Ipv6Addr::is_loopback to include IPv4-mapped loopback addresses #85655

Closed
wants to merge 3 commits into from

Conversation

CDirkx
Copy link
Contributor

@CDirkx CDirkx commented May 25, 2021

This PR addresses one of the issues that came up during the larger efforts of trying to stabilize the ip feature, see also the most recent stabilization PR #85585 and issue #85609 where I have tried to summarize all the relevant discussion so far.

Fixes #69772

Edit: for an alternative to this PR, see #86335 which commits to not supporting IPv4-in-IPv6 addresses (so leaving is_loopback unchanged).

Overview

There are two ways of representing an IPv4 address in a IPv6 address:

  • As an IPv4-compatible IPv6 address: ::1.2.3.4
  • As an IPv4-mapped IPv6 address: ::ffff:1.2.3.4

#69772 raised the issue that other languages take IPv4-mapped addresses into account in their equivalent of is_loopback, for example ::ffff:127.0.0.1. This raised the broader question about which (if any) methods of Ipv6Addr should consider IPv4-mapped or IPv4-compatible addresses.

While we don't yet have a clear answer in general, for is_loopback is seems pretty clear that supporting addresses like ::ffff:127.0.0.1 is a good idea: .NET, Java, Python and Go all follow this behaviour (#69772, #76098 (comment)). This behaviour is also implemented by many real world networking hardware/software (try ping ::ffff::127.0.0.1). In short, it is surprising for users that the Rust implementation is different. Regarding IPv4-compatible addresses, those are deprecated and of the tested other languages none took them into account (try also ping ::127.0.0.1).

Implementation

This PR changes the behaviour of is_loopback from

/// Returns [`true`] if this is a loopback address (::1).

to

/// Returns [`true`] if this is either:
/// - the [loopback address] as defined in [IETF RFC 4291 section 2.5.3] (`::1`).
/// - an [IPv4-mapped] address representing an IPv4 loopback address as defined in [IETF RFC 1122] (`::ffff:127.0.0.0/104`).
///
/// Note that this does not return [`true`] for an [IPv4-compatible] address representing an IPv4 loopback address (`::127.0.0.0/104`).

The documentation of Ipv6Addr is expanded to include a section explaining IPv4-compatible and IPv4-mapped addresses:

  • IPv4-compatible addresses are called out as officially deprecated and that only conversion to and from such an address is supported (to_ipv6_compatible, to_ipv4). It is also noted that no special meaning is ascribed to these addresses, so the embedded IPv4 address will never be taken into account.
  • IPv4-mapped addresses are explained and it is noted that the embedded IPv4 addresses may be taken into account, is_loopback is given as an example. is_loopback is specifically mentioned as currently the only method that takes IPv4-mapped addresses into account, and a small explanation is included that this is in line with other languages and real-world hardware.

The documentation of the methods to_ipv6_compatible, to_ipv6_mapped, to_ipv4 and to_ipv4_mapped is updated to refer to this section.

(Also included are some small improvements to the style consistency of the documentation (pulled out into #85676) and readability of some examples)

Unresolved Questions

Stability

This PR changes the behaviour of the stable is_loopback from a simple check for equality to Ipv6Addr::LOCALHOST to something more complex, although arguable the change makes the behaviour more in line in what users would expect. To what extent is such a change in behaviour covered by stability guarantees; is just a change in documentation enough or should it be called out in compatibility notes?

Making is_loopback special

It is expected that in the future some more methods of Ipv6Addr will take IPv4-mapped addresses into account, but this is currently blocked on nailing down the exact semantics (#85609). This PR is written to be largely agnostic to such further changes, is_loopback is called out as currently the only method that has this behaviour. However, if eventually the decision is made to not apply this behaviour to other methods, is it worth it to have only is_loopback have this special behaviour? Do the arguments in favour of supporting IPv4-mapped addresses pull their weight if only is_loopback will ever support it.

@rustbot rustbot added A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels May 25, 2021
@CDirkx
Copy link
Contributor Author

CDirkx commented May 25, 2021

Regarding making is_loopback special, I believe it is worth it to make this change, even if none of the other methods will ever take IPv4-mapped addresses into account. Most other languages don't offer as many utility methods other than is_localhost anyway, so for those Rust is pretty much free to decide what semantics it wants to offer. However for is_localhost every other major language we checked has the proposed behaviour, it seems inconsistent with real-world usage to not also implement this behaviour.

@CDirkx
Copy link
Contributor Author

CDirkx commented May 25, 2021

Regarding stability, would this PR require an FCP?

/// ```
#[rustc_const_stable(feature = "const_ipv6", since = "1.50.0")]
#[stable(since = "1.7.0", feature = "ip_17")]
#[rustc_allow_const_fn_unstable(const_ipv6)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ipv6Addr::to_ipv4_mapped is unstable and unstable const.

if let Some(ipv4) = self.to_ipv4_mapped() {
ipv4.is_loopback()
} else {
u128::from_be_bytes(self.octets()) == u128::from_be_bytes(Ipv6Addr::LOCALHOST.octets())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not changed in this PR, but historical background for anyone wondering why the comparison is written like this:

  • self == Ipv6Addr::LOCALHOST relies on Eq/PartialEq, which is not possible in a const function.
  • self.octets() == Ipv6Addr::LOCALHOST.octets() leads to worse code generation, a problem with comparing arrays.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CDirkx

self.octets() == Ipv6Addr::LOCALHOST.octets() leads to worse code generation, a problem with comparing arrays.

#85828 might fix that.

But also, does u128::from(self) generate reasonable code?

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 25, 2021
@the8472
Copy link
Member

the8472 commented May 25, 2021

Does this actually work with socket APIs?

E.g. is this guaranteed to connect even if one is the ipv4-mapped and the other is ipv6?

assert!(src.is_loopback() && dst.is_loopback());
socket.bind(src);
socket.connect(dst);

@CDirkx
Copy link
Contributor Author

CDirkx commented May 25, 2021

Mmm good question, that is a point that I have not seen before in this discussion. The answer is no, at least on my system (Windows) binding and connecting to mixed addresses give various errors. I can not even bind to ::ffff:127.0.0.1: "The requested address is not valid in its context."

The original issue mentions:

Once I bind my server to [::] to work on IPv4 AND IPv6 at the same time and then connect to it via v4 to 127.0.0.1 the is_loopback call returns false.

Only when handling both IPv4 and IPv6 addresses (and thus IPv4 addresses getting mapped to IPv6 addresses) are IPv4-mapped addresses expected to be taken into account. Maybe this is related to the fact that all of the other tested languages do not have separate types representing IPv4 or IPv6 addresses, like Rust has Ipv4Addr and Ipv6Addr. So the fact that those other languages all have a certain behaviour might have less relevance for Ipv6Addr, and makes me less certain that this PR is the right approach.

Not sure what to do about this, we could leave Ipv6Addr::is_loopback as is and explicitly document that neither IPv4-mapped and IPv4-compatible addresses are taken into account, along with an example of how to do this yourself with to_ipv4_mapped (which is currently unstable). Another option would be to instead implement this behaviour in IpAddr::is_loopback, as IpAddr kind-of represents handling "both IPv4 and IPv6" addresses. However then it could be confusing that IpAddr doesn't strictly delegate to either Ipv4Addr or Ipv6Addr.

@the8472
Copy link
Member

the8472 commented May 25, 2021

Maybe this is related to the fact that all of the other tested languages do not have separate types representing IPv4 or IPv6 addresses

Java does. And Inet6Address::isLoopbackAddress only checks for IPv6 loopback addresses, not ipv4-mapped ones.

https://github.com/openjdk/jdk/blob/0b7735938407fad5c2dbfb509d2d47bf172305e9/src/java.base/share/classes/java/net/Inet6Address.java#L318-L324

But all the public parser methods try to automatically parse v4-mapped ones to Inet4Address. If I recall correctly you have to do some gymnastics to create an Inet6Address holding a v4-mapped address.

@CDirkx
Copy link
Contributor Author

CDirkx commented May 25, 2021

Oh you're right, I forgot about Inet6Address.

It appears that in that case OpenJDK and Oracle Java have different behaviour: in Oracle Java both InetAddress and Inet6Address consider ::ffff:127.0.0.1 a loopback address (https://ideone.com/yAWa8b).

@the8472
Copy link
Member

the8472 commented May 25, 2021

You're using getByName there which is implemented by the superclass InetAddress and has a polymorphic return type, calling it on Inet6Address still invokes the superclass method. So you're actually getting an Inet4Address there on each call.

Afaik the only way to actually construct an ipv4-mapped one is to use Inet6Address::getByAddress and pass it as byte array.

@joshtriplett joshtriplett added the relnotes Marks issues that should be documented in the release notes of the next release. label May 31, 2021
@joshtriplett
Copy link
Member

This seems entirely reasonable to me, and has plenty of precedent from the standard libraries of other languages.

@rfcbot merge

@rfcbot
Copy link

rfcbot commented May 31, 2021

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels May 31, 2021
@the8472
Copy link
Member

the8472 commented May 31, 2021

No concerns currently listed.

I think there still are some unsettled questions in #85655 (comment) and #85655 (comment)

Imo it would be cleaner to have a method like to_canonical(self) -> IpAddr which turns an ipv4-mapped ipv6 address into IpAddr::V4 and all other ip6 addresses into IpAddr::V6 and then add IpAddr::is_localhost() which delegates to the inner.

This way we would have version-specific methods that don't have to know about the other version and the enum that delegates to the version-specific logic.

@joshtriplett
Copy link
Member

@the8472 I agree that that needs consideration. It's not obvious if all addresses that return true from .is_loopback() need to be routable, or if they should just semantically be loopback addresses. I can see an argument for either.

On the one hand, we could keep IPv6 and IPv4 addresses completely separate and people can always check for v4mapped addresses and convert/check those in their own code. On the other hand, we could make methods on IPv6 addresses check for the corresponding property on v4mapped addresses they're holding, and if someone wants to know that they don't have a v4mapped address they can check that. Either is justifiable; the question is which is more useful.

@the8472
Copy link
Member

the8472 commented May 31, 2021

can always check for v4mapped addresses and convert/check those in their own code

Yeah, but currently that's slightly inconvenient or noisy since you have to go through to_ipv4_mapped and do some extra matching, which is why I suggested adding to_canonical(). With that it would be as simple as

  • version-agnostic check: addr6.to_canonical().is_loopback()
  • version-specific check: addr6.is_loopback()

@joshtriplett
Copy link
Member

@rfcbot concern expected-behavior

Registering a concern to evaluate whether this is the expected behavior and semantics, or whether a different behavior would work better in practice.

@CDirkx
Copy link
Contributor Author

CDirkx commented May 31, 2021

The comment about Java's behaviour prompted me to look a bit deeper into Java and the other languages.

About IPv4-mapped addresses the Java docs state "Java will never return an IPv4-mapped address. These classes can take an IPv4-mapped address as input, both in byte array and text representation. However, it will be converted into an IPv4 address." So an Inet6Address representing ::ffff:127.0.0.1 is actually not constructable, it will always be converted to an Inet4Address. So in Java isLoopbackAddress on an Inet6Address is only true for ::1.

In Python there is also the different types IPv4Address and IPv6Address, and is_loopback on an IPv6Address is also only true for ::1: https://github.com/python/cpython/blob/46b16d0bdbb1722daed10389e27226a2370f1635/Lib/ipaddress.py#L2033-L2041

So my previous statement that all other tested languages implement this behaviour was wrong, I'm sorry for not being more careful there. What appears to be the case instead is that languages that don't have separate types for IPv4 and IPv6 addresses consider ::ffff:127.0.0.1 a loopback address (Go, .NET) and languages that have a separate type for IPv6 addresses do not (Java, Python).

In that case it would seem more consistent for Rust (which has separate Ipv4Addr and Ipv6Addr) to not accept this PR and keep is_loopback a simple check against Ipv6Addr::LOCALHOST (::1). That would mean we can also stick closer to just the specification, which mentions ::1 as the only IPv6 loopback address and doesn't describe any special behaviour for IPv4-mapped addresses.

JohnTitor added a commit to JohnTitor/rust that referenced this pull request Jun 9, 2021
Fix documentation style inconsistencies for IP addresses

Pulled out of rust-lang#85655 as it is unrelated. Fixes some inconsistencies in the docs for IP addresses:
- Currently some addresses are backticked, some are not, this PR backticks everything consistently. (looks better imo)
- Lowercase hex-literals are used when writing addresses.
@bors
Copy link
Contributor

bors commented Jun 9, 2021

☔ The latest upstream changes (presumably #86160) made this pull request unmergeable. Please resolve the merge conflicts.

@yaahc
Copy link
Member

yaahc commented Jun 14, 2021

In that case it would seem more consistent for Rust (which has separate Ipv4Addr and Ipv6Addr) to not accept this PR and keep is_loopback a simple check against Ipv6Addr::LOCALHOST (::1). That would mean we can also stick closer to just the specification, which mentions ::1 as the only IPv6 loopback address and doesn't describe any special behaviour for IPv4-mapped addresses.

This seems compelling to me, though sad given how much care you've clearly put into this PR. Would you like me to go ahead and FCP close this or did you want to close it yourself?

@CDirkx
Copy link
Contributor Author

CDirkx commented Jun 15, 2021

I can open up another PR that also documents IPv4-mapped and IPv4-compatible addresses and explicitly states that Rust will not assign any special meaning to those addresses, so IPv4-in-IPv6 are not directly supported. Even though that would only be a change in documentation, a FCP on that PR could finally settle this question, so it also doesn't need to be discussed again for other methods like is_documentation and is_global.

@CDirkx
Copy link
Contributor Author

CDirkx commented Jun 15, 2021

I have opened #86335 as an alternative to this PR, which commits to not supporting IPv4-in-IPv6 addresses (so leaving is_loopback unchanged).

@CDirkx
Copy link
Contributor Author

CDirkx commented Jul 1, 2021

One additional reason I thought of against this and in favour of #86335: If we were to unconditionally support IPv4-in-IPv6 addresses, I don't see any way to opt-out and get back the current behaviour. However with #86335 it will still be possible to explicitly opt-in to supporting IPv4-in-IPv6 addresses using to_ipv4_mapped or other convenience methods, and thus able to accommodate strictly more use cases.

@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 18, 2021
@joshtriplett
Copy link
Member

@rfcbot cancel

Closing this in favor of #86335.

@rfcbot
Copy link

rfcbot commented Jul 21, 2021

@joshtriplett proposal cancelled.

@rfcbot rfcbot removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jul 21, 2021
@yaahc
Copy link
Member

yaahc commented Jul 21, 2021

Checking off sfackler's checkbox since he's left the libs team.

@CDirkx CDirkx deleted the is-loopback branch July 22, 2021 13:54
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 3, 2021
Commit to not supporting IPv4-in-IPv6 addresses

Stabilization of the `ip` feature has for a long time been blocked on the question of whether Rust should support handling "IPv4-in-IPv6" addresses: should the various `Ipv6Address` property methods take IPv4-mapped or IPv4-compatible addresses into account. See also the IPv4-in-IPv6 Address Support issue rust-lang#85609 and rust-lang#69772 which originally asked the question.

# Overview

In the recent PR rust-lang#85655 I proposed changing `is_loopback` to take IPv4-mapped addresses into account, so `::ffff:127.0.0.1` would be recognized as a looback address. However, due to the points that came up in that PR, I alternatively propose the following: Keeping the current behaviour and commit to not assigning any special meaning for IPv4-in-IPv6 addresses, other than what the standards prescribe. This would apply to the stable method `is_loopback`, but also to currently unstable methods like `is_global` and `is_documentation` and any future methods. This is implemented in this PR as a change in documentation, specifically the following section:

> Both types of addresses are not assigned any special meaning by this implementation, other than what the relevant standards prescribe. This means that an address like `::ffff:127.0.0.1`, while representing an IPv4 loopback address, is not itself an IPv6 loopback address; only `::1` is. To handle these so called "IPv4-in-IPv6" addresses, they have to first be converted to their canonical IPv4 address.

# Discussion

In the discussion for or against supporting IPv4-in-IPv6 addresses the question what would be least surprising for users of other languages has come up several times. At first it seemed most big other languages supported IPv4-in-IPv6 addresses (or at least considered `::ffff:127.0.0.1` a loopback address). However after further investigation it appears that supporting IPv4-in-IPv6 addresses comes down to how a language represents addresses. .Net and Go do not have a separate type for IPv4 or IPv6 addresses, and do consider `::ffff:127.0.0.1` a loopback address. Java and Python, which do have separate types, do not consider `::ffff:127.0.0.1` a loopback address. Seeing as Rust has the separate `Ipv6Addr` type, it would make sense to also not support IPv4-in-IPv6 addresses. Note that this focuses on IPv4-mapped addresses, no other language handles IPv4-compatible addresses.

Another issue that was raised is how useful supporting these IPv4-in-IPv6 addresses would be in practice. Again with the example of `::ffff:127.0.0.1`, considering it a loopback address isn't too useful as to use it with most of the socket APIs it has to be converted to an IPv4 address anyway. From that perspective it would be better to instead provide better ways for doing this conversion like stabilizing `to_ipv4_mapped` or introducing a `to_canonical` method.

A point in favour of not supporting IPv4-in-IPv6 addresses is that that is the behaviour Rust has always had, and that supporting it would require changing already stable functions like `is_loopback`. This also keeps the documentation of these functions simpler, as we only have to refer to the relevant definitions in the IPv6 specification.

# Decision

To make progress on the `ip` feature, a decision needs to be made on whether or not to support IPv4-in-IPv6 addresses.
There are several options:

- Keep the current implementation and commit to never supporting IPv4-in-IPv6 addresses (accept this PR).
- Support IPv4-in-IPv6 addresses in some/all `IPv6Addr` methods (accept PR rust-lang#85655).
- Keep the current implementation and but not commit to anything yet (reject both this PR and PR rust-lang#85655), this entire issue will however come up again in the stabilization of several methods under the `ip` feature.

There are more options, like supporting IPv4-in-IPv6 addresses in `IpAddr` methods instead, but to my knowledge those haven't been seriously argued for by anyone.

There is currently an FCP ongoing on PR rust-lang#85655. I would ask the libs team for an alternative FCP on this PR as well, which if completed means the rejection of PR rust-lang#85655, and the decision to commit to not supporting IPv4-in-IPv6 addresses.

If anyone feels there is not enough evidence yet to make the decision for or against supporting IPv4-in-IPv6 addresses, let me know and I'll do whatever I can to resolve it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

std::net::IpAddr: is_loopback failing on ipv4-in-ipv6 addresses
10 participants