Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Geoip not able to use IPv6 address with zone ID #37107

Closed
jsoriano opened this issue Jan 3, 2019 · 21 comments
Closed

Geoip not able to use IPv6 address with zone ID #37107

jsoriano opened this issue Jan 3, 2019 · 21 comments
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Data Management Meta label for data/management team

Comments

@jsoriano
Copy link
Member

jsoriano commented Jan 3, 2019

IP grok pattern correctly parses IPv6 addresses with zone ids, but if these addresses are used with the geoip processor it fails with errors like '::1%0' is not an IP string literal..

To avoid confusion I think that geoip module should accept zone ids, even if they are not relevant in this case and ignored.

Seen in elastic/beats#9836

@jsoriano jsoriano added discuss :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP labels Jan 3, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features

@ghost
Copy link

ghost commented Jan 8, 2019

@jsoriano I would love to start on this issue , but I'm newbie to this.. how can I contribute?

@jsoriano
Copy link
Member Author

jsoriano commented Jan 8, 2019

Hi @Asitha26, this is great 🙂

To start please take a look to the contributing guidelines.

You can find the code of the geoip module using these addresses in modules/ingest-geoip/src/main/java/org/elasticsearch/ingest/geoip/GeoIpProcessor.java and the code parsing the address in server/src/main/java/org/elasticsearch/common/network/InetAddresses.java.

I have been taking a quick look and the java class for IPV6 addresses (https://docs.oracle.com/javase/10/docs/api/java/net/Inet6Address.html) supports zone ids (as scope ids), we could add support for that on InetAddresses when parsing the address and when instantiating the InetAddress.

@ghost
Copy link

ghost commented Jan 8, 2019

thanks, I'll have a look

@jakelandis
Copy link
Contributor

Also please note that this change should constrained to the geo-ip plugin and not a generic change to Elasticsearch. (see #22400 (comment))

@ghost
Copy link

ghost commented Jan 9, 2019

@jakelandis sure

@ghost
Copy link

ghost commented Jan 9, 2019

I'm unable to build on my machine i have switched to jdk 11 and have set path variables but when i try importing project as a gradle on eclipse it throws an exception
capture

@cshtdd
Copy link

cshtdd commented Jan 15, 2019

@jsoriano Can this issue be closed since there are several merged pull requests that reference it?

@jsoriano
Copy link
Member Author

@camilin87 merged PRs referencing this issue are workarounds mitigating this issue. The issue is still present in ES.

@jsoriano
Copy link
Member Author

@Asitha26 were you able to progress on this issue?

@ghost
Copy link

ghost commented Jan 16, 2019

@jsoriano no actually I'm stuck with environment issue , my gradle build wasn't success and hence couldent continue can you help me getting environment fixed

cshtdd added a commit to cshtdd/elasticsearch that referenced this issue Jan 16, 2019
@cshtdd
Copy link

cshtdd commented Jan 16, 2019

I'm able to reproduce the issue.

Here's a failing test cshtdd@2a8913f

So far I've found the problem lies in the InetAddresses.ipStringToBytes it expects the ipString to contain only digits, colons or dots. Everything else will produce an invalid address.

The method textToNumericFormatV6 will also need some modification.

The IPv6 RFC clearly states

`%' is a delimiter character to distinguish between

and <zone_id>

I'll continue working on this.
Any feedback is appreciated.

@cshtdd
Copy link

cshtdd commented Jan 16, 2019

I created a pull request to ignore the zone ID in IPv6 addresses #37547

This will be a first step to address this issue and elastic/beats#9836

I'll continue working on building IPv6 addresses that have the correct zone ID

@jasontedor
Copy link
Member

jasontedor commented Jan 16, 2019

@jakelandis Can you elaborate on the thinking behind your comment? Why would we add support in ingest-geoip but not in core Elasticsearch given that we previously decided against supporting it within Elasticsearch? I think that either:

  • we support it in both places (so revisiting the previous decision)
  • we support it in neither place

@jakelandis jakelandis removed the good first issue low hanging fruit label Jan 16, 2019
@jakelandis
Copy link
Contributor

jakelandis commented Jan 16, 2019

My thought process was mostly around the scope of the change (both technical and functional). Supporting this fully requires more consideration and potentially impacts things lower level items like the IPData type. Supporting it only for the ingest-geoip, I see less about supporting zoneID's and more about being more lenient in what is what is accepted for a very specific case. However, in hindsight, supporting this only for that one use case would likely only change where the error happens.

It looks like @camilin87 PR #37547 started down the path of a more core change with the implicit decision to always ignore the zoneId. Always ignoring it is one way to support it, but not sure that is the direction we want to go.

Can we re-visit the previous decision ?

I don't think an error is the correct behavior since this is a valid IP address and will be encountered. But I also don't think we can simply ignore the zoneId for the core index case. (please correct me if I am wrong) We don't have any precedence (nor want) for mutating data by default. Perhaps a new ingest processor to explicitly handle this ? (let me know if I should open a new issue for discussion).

@camilin87 - We appreciate the work here, but probably best to get this sorted out before committing too much more time here.

@cshtdd
Copy link

cshtdd commented Jan 16, 2019

Always ignoring it is one way to support it, but not sure that is the direction we want to go.

Makes sense.

We appreciate the work here, but probably best to get this sorted out before committing too much more time here.

Thanks. I'm almost done with the part of the change where the zoneId is read and printed

https://github.com/camilin87/elasticsearch/commits/issue_37107_part2

Are the changes in this area still relevant?
Or you want to table the issue altogether?

@cshtdd
Copy link

cshtdd commented Jan 17, 2019

I know this issue may need some discussion.
I updated the PR #37547 either way in case it adds some clarity.

That being said. I understand if after your discussions this issue is no longer relevant

@jakelandis
Copy link
Contributor

We discussed this internally and have landed on implementing an ingest node processor to help handle IPv6 zone_ids. I have logged #38064.

@jsoriano
Copy link
Member Author

@jakelandis the ip processor idea sounds great.

I think that any processor accepting an IP as input should accept the IPs parsed by IP grok pattern, but having a processor that transparently handles IPs sound good too.

At the end I think that one of the following pipelines should work no matter the type of IP or if it has a zone id, so the pipelines don't have to handle specifics of IP addresses formats.

[
{ "grok": { "field": "message", "patterns":["...%{IP:source.ip}..."]} },
{ "geoip": { "field": "source.ip", target_field:.... } },
]
[
{ "grok": { "field": "message", "patterns":["...%{IP:source.ip}..."]} },
{ "ip": { "field": "source.ip" ... } },
{ "geoip": { "field": "source.ip", target_field:.... } },
]

@iquirino
Copy link

+1 on latest version (installation this weekend)

@rjernst rjernst added the Team:Data Management Meta label for data/management team label May 4, 2020
@dakrone
Copy link
Member

dakrone commented May 17, 2024

This has been open for quite a while, and we haven't made much progress on this due to focus in other areas. For now I'm going to close this as something we aren't planning on implementing. We can re-open it later if needed.

@dakrone dakrone closed this as not planned Won't fix, can't repro, duplicate, stale May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Data Management Meta label for data/management team
Projects
None yet
Development

No branches or pull requests

9 participants