Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: IP Address Version UDF #13269

Closed
hariharsubramanyam opened this issue Aug 22, 2019 · 14 comments
Closed

Feature Request: IP Address Version UDF #13269

hariharsubramanyam opened this issue Aug 22, 2019 · 14 comments

Comments

@hariharsubramanyam
Copy link

hariharsubramanyam commented Aug 22, 2019

EDIT: Resolved - no need for a UDF. Do this:

SELECT
    IF(
        CAST(ip_address AS IPADDRESS) BETWEEN IPADDRESS '0.0.0.0'
            AND IPADDRESS '255.255.255.255',
        'v4',
        'v6'
    )

This will not work for IPv6 addresses that take the form <80 zeroes>:ffff:<some stuff>, but you probably won't see those in practice.

===========
Name: IP_VERSION(ip_address)

Arguments:

  • ip_address (type IPADDRESS)

Return type: VARCHAR
Return value: Returns 'v4' if the IP address is an IPv4. Returns 'v6' if the IP address is IPv6. Returns NULL if the IPADDRESS is invalid (I don't think this is actually possible).
Implementation:

I think Java's InetAddress type (and related types) could let us do this.

https://docs.oracle.com/javase/7/docs/api/java/net/InetAddress.html

It's probably easy to implement ourselves too.

@rongrong
Copy link
Contributor

Isn't this possible as if(ip >= ..., 'v6', 'v4')?

@hariharsubramanyam
Copy link
Author

I don't think that's possible. While it's true that IPv6 has more bits (128) than IPv4 (32), it is possible for the two addresses to still take on the same value.

@hariharsubramanyam
Copy link
Author

hariharsubramanyam commented Aug 22, 2019

For example:

SELECT CAST('::1' AS IPADDRESS), CAST('0.0.0.1' AS IPADDRESS)

The former is IPv6 and the latter is IPv4. Numerically, they both take on the value 1.

@hariharsubramanyam
Copy link
Author

The two types appear to be incomparable as well:

SELECT CAST('::1' AS IPADDRESS) > CAST('255.255.255.255' AS IPADDRESS),
    CAST('::1' AS IPADDRESS) > CAST('0.0.0.0' AS IPADDRESS)

@hariharsubramanyam
Copy link
Author

hariharsubramanyam commented Aug 22, 2019

Is there a way I can see how many bits were allocated to store the IPADDRESS?

Edit: Just checked, I don't think we can do that.

@rongrong
Copy link
Contributor

For example:

SELECT CAST('::1' AS IPADDRESS), CAST('0.0.0.1' AS IPADDRESS)

The former is IPv6 and the latter is IPv4. Numerically, they both take on the value 1.

Hmm, if you want to distinguish whether an IP is v6 or v4 by format, that's not possible. Internally all ip addresses are represented in v6. ::1 is an IP address represented as v6 format in v4 range. What's the need to distinguish ::1 as v6 and 0.0.0.1 as v4?

@rongrong
Copy link
Contributor

Actually 0.0.0.1 in IPv6 is ::ffff:0:1

@rongrong
Copy link
Contributor

The two types appear to be incomparable as well:

SELECT CAST('::1' AS IPADDRESS) > CAST('255.255.255.255' AS IPADDRESS),
    CAST('::1' AS IPADDRESS) > CAST('0.0.0.0' AS IPADDRESS)

They are comparable, otherwise the above query would fail.

@hariharsubramanyam
Copy link
Author

Ah, I see. I assumed the IPv4 addresses were prefixed with 0s instead of 1s.

The context is that we have a table with IP addresses in string form. The table does not indicate whether the IP address is v4 or v6. We want to compute the version of the IP address, based on the string. This is an important signal we need for other parts of our system.

I was thinking we could just cast the string to IPADDRESS and then call a IP_VERSION function on it. That seems to be difficult, given that IPv4 and IPv6 addresses use the same datatype for storage.

So, perhaps we could instead have a function of the form:

IP_VERSION(ip_address) where ip_address is a VARCHAR rather than a string?

@rongrong
Copy link
Contributor

You can do IF(CAST(ip_address as IPADDRESS) BETWEEN IPADDRESS'0.0.0.0' AND IPADDRESS'255.255.255.255', 'v4', 'v6')

@hariharsubramanyam
Copy link
Author

That looks awesome! It will be amazing if we can do this without a UDF! Let me try it out and verify that it works for our use-case.

@rongrong
Copy link
Contributor

Once we ship SQL function support we can introduce this as a SQL function (#9613). We are targeting end of H2 for that.

@hariharsubramanyam
Copy link
Author

I did find one case that breaks:

SELECT
    IF(
        CAST('0000:0000:0000:0000:0000:ffff:1234:5678' AS IPADDRESS) BETWEEN IPADDRESS '0.0.0.0'
            AND IPADDRESS '255.255.255.255',
        'v4',
        'v6'
    )

I was hoping it would be classified v6, but the return value is v4.

This is not a surprise given that IPv4 addresses are mapped to IPv6 addresses as you described (https://tools.ietf.org/html/rfc4291.html#section-2.5.5.2).

That being said, I don't believe IPv6 addresses of the form <80 zeros>:ffff: correspond to legitimate IPv6 addresses, so we may be able to ignore this case.

@hariharsubramanyam
Copy link
Author

So, I'm going to close this task because the solution described here is good enough for our use-case, and probably many others.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants