Added top_level_domain field to identify connections to a domain suffix in log data. #542
Added a top_level_domain field so that Elasticsearch users with security use cases can identify logs where the domain has a particular suffix. For example, if a company is experiencing a spear-phishing campaign, they may want to identify all connections to .xyz or .ru domains if the attackers are using these domains in their links. Having a top_level_domain field will allow users to find these connections without having to index domains in text fields, so they won't need to run expensive wildcard queries such as "*.ru". Instead they can do a fast keyword search in their analytics. Users will be able to create a unique list of top_level_domains by using if statements in Logstash pipelines/filters.
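To make the difference in query shape concrete, here is a minimal Python sketch that builds both request bodies as plain dictionaries. The field names `url.domain` and `url.top_level_domain` and the helper function names are assumptions chosen for illustration; substitute whichever field set applies to your data.

```python
# Sketch only: builds Elasticsearch query bodies as plain dicts.
# Field names and function names below are illustrative assumptions.

def wildcard_suffix_query(suffix, field="url.domain"):
    """Wildcard match on the full domain, e.g. "*.ru" (the expensive approach)."""
    return {"query": {"wildcard": {field: {"value": f"*.{suffix}"}}}}


def tld_terms_query(suffixes, field="url.top_level_domain"):
    """Exact keyword match on a dedicated top level domain field."""
    return {"query": {"terms": {field: list(suffixes)}}}


if __name__ == "__main__":
    print(wildcard_suffix_query("ru"))
    print(tld_terms_query(["xyz", "ru"]))
```

The second body relies on the value already being extracted at ingest time, which is what the new field is for.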
@mbudge I think this is a good addition to ECS! I see that you've also added it consistently to everywhere that the |
Hi Mike, We'll mainly use it in the other schemas, so maybe DNS is not necessary. I'm happy for it to be left out. Thanks.
The top level domain field might not be required in the DNS group.
Thanks for submitting this, I like this addition as well.
Could you add a changelog entry in CHANGELOG.next.md, please?
@MikePaquette I'm curious, did you actually want the field to be removed from the DNS field set? I think adding the tld field makes sense in precisely all of the places where we have registered_domain. Or am I missing something?
Note: only actual requested change for now is for the changelog. We can decide whether we re-add under DNS once the discussion has concluded :-)
@webmat I was just questioning whether it was necessary. If it is useful, then I agree with your suggestion to add it back. In a spec like ECS consistency is goodness :-).
@elasticmachine, run elasticsearch-ci/docs
One more thing popped into mind, which will be important for this to align well with the definition of .registered_domain.
The description of the .top_level_domain fields should state that we're looking for the effective top level domain. For example we don't want "uk", we want "co.uk". No need to change the field name, just adjust the description. See review comments for the details.
Recap of requested changes, including prior ones
- CHANGELOG.next.md entry
- Add back dns.question.top_level_domain
- Adjust the short definition
- Adjust the definition to mention Mozilla's PSL instead
- name: top_level_domain
  level: extended
  type: keyword
  short: The top level domain is the last part of the domain (com, net, org).
More concise, perhaps:
The effective top level domain (com, org, net, co.uk).
Note the inclusion of co.uk.
  type: keyword
  short: The top level domain is the last part of the domain (com, net, org).
  description: >
    The top level domain (TLD) also known as the domain suffix is the last part of the domain name.
Just adding the word "effective", and some commas, should be enough here IMO :-)
The top level domain (TLD) also known as the domain suffix is the last part of the domain name.
The effective top level domain (eTLD), also known as the domain suffix, is the last part of the domain name.
    The top level domain (TLD) also known as the domain suffix is the last part of the domain name.
    For example, the top level domain for google.com is "com".

    The following groups of top level domain are maintained by the Internet Assigned Numbers Authority (IANA).
I think this sentence and the subsequent list should be replaced by a mention of Mozilla's public suffix list. You can take inspiration from the registered_domain definition for wording.
Of course both approaches can be used to figure out the eTLD, but Mozilla's list is the more direct method. One file, and you're able to determine the eTLD. Pointing users to a slew of lists like this will discourage them from even trying, IMO.
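As a rough illustration of the "one file" approach, the Python sketch below finds the effective top level domain by matching the longest known suffix. It is a simplification under stated assumptions: `SAMPLE_SUFFIXES` is a tiny hand-picked stand-in for the real public suffix list, the function name is hypothetical, and the list's wildcard and exception rules are not handled.

```python
# Sketch only: longest-suffix match against a small, hard-coded sample.
# A real implementation would load Mozilla's full public suffix list and
# also honour its wildcard ("*.") and exception ("!") rules.
SAMPLE_SUFFIXES = frozenset({"com", "org", "net", "uk", "co.uk", "ru", "xyz"})


def effective_tld(domain, suffixes=SAMPLE_SUFFIXES):
    """Return the longest suffix of `domain` found in `suffixes`, or None."""
    labels = domain.lower().rstrip(".").split(".")
    # Try candidates from longest to shortest so "co.uk" wins over "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return candidate
    return None


if __name__ == "__main__":
    for name in ("google.com", "www.example.co.uk", "host.internal"):
        print(name, "->", effective_tld(name))
```

Trying the longest candidate first is what distinguishes the effective TLD from the plain last label.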
Here's a tip to preview the rendered docs, btw. New CI feature. You can preview how the docs will look after this PR here: http://ecs_542.docs-preview.app.elstc.co/diff. Clicking on any link in there will take you to the relevant field set, including your changes :-) Example for this PR's changes to URL: http://ecs_542.docs-preview.app.elstc.co/guide/en/ecs/master/ecs-url.html |