Added top_level_domain field to identify connections to a domain suffix in log data. #542

Closed
wants to merge 2 commits into from

Contributor

@mbudge mbudge commented Sep 6, 2019

Added top_level_domain field so Elasticsearch users with security use cases can identify logs where the domain has a particular suffix. For example, if a company is experiencing a spear-phishing campaign, they may want to identify all connections to .xyz or .ru domains if the attackers are using these domains in their links.

Having a top_level_domain field will allow users to find these connections without having to index domains in text fields, so they won't need to run expensive wildcard queries such as "*.ru". Instead, they can do a fast keyword search in their analytics.

Users will be able to create a unique list of top_level_domains by using if statements in Logstash pipelines/filters.
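
To make the difference concrete, here is a minimal sketch of the two query shapes written as Python dicts in Elasticsearch query DSL. The choice of `url.domain` and `url.top_level_domain` as target fields, and the helper names `wildcard_query` and `tld_query`, are assumptions for illustration and are not part of this PR.

```python
# Hypothetical helpers that only build query bodies; nothing is sent to a cluster.

def wildcard_query(suffix):
    """Wildcard match on the full domain; the leading wildcard makes it costly."""
    return {"query": {"wildcard": {"url.domain": {"value": f"*.{suffix}"}}}}


def tld_query(suffixes):
    """Exact keyword match on the dedicated top_level_domain field."""
    return {"query": {"terms": {"url.top_level_domain": list(suffixes)}}}


print(wildcard_query("ru"))
print(tld_query(["xyz", "ru"]))
```

The second form matches whole keyword values, so several suffixes can be checked in one `terms` query without any pattern matching.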

@MikePaquette
Contributor

@mbudge I think this is a good addition to ECS! I see that you've also added it consistently everywhere that the *.registered_domain field is used. Just wondering if you think you'd use it in the dns field set as dns.question.top_level_domain?

@mbudge
Contributor Author

mbudge commented Sep 12, 2019

Hi Mike,

We'll mainly use it in the other schemas, so maybe DNS is not necessary.

I'm happy for it to be left out.

Thanks.

The top level domain field might not be required in the DNS group.
Contributor

@webmat webmat left a comment

Thanks for submitting this, I like this addition as well.

Could you add a changelog entry in CHANGELOG.next.md, please?

@MikePaquette I'm curious, did you actually want the field to be removed from the DNS field set? I think adding the tld field makes sense in precisely all of the places where we have registered_domain. Or am I missing something?

@webmat
Contributor

webmat commented Sep 23, 2019

Note: only actual requested change for now is for the changelog.

We can decide whether we re-add under DNS once the discussion has concluded :-)

@MikePaquette
Contributor

@webmat I was just questioning whether it was necessary. If it is useful, then I agree with your suggestion to add it back. In a spec like ECS consistency is goodness :-).

@webmat
Contributor

webmat commented Sep 27, 2019

@elasticmachine, run elasticsearch-ci/docs

Contributor

@webmat webmat left a comment

One more thing popped into mind, which will be important for this to align well with the definition of .registered_domain.

The description of the .top_level_domain fields should state that we're looking for the effective top level domain. For example we don't want "uk", we want "co.uk". No need to change the field name, just adjust the description. See review comments for the details.

Recap of requested changes, including prior ones

  • CHANGELOG.next.md entry
  • Add back dns.question.top_level_domain
  • Adjust the short definition
  • Adjust the definition to mention Mozilla's PSL instead

- name: top_level_domain
  level: extended
  type: keyword
  short: The top level domain is the last part of the domain (com, net, org).
Contributor

More concise, perhaps

The effective top level domain (com, org, net, co.uk).

Note the inclusion of co.uk

type: keyword
short: The top level domain is the last part of the domain (com, net, org).
description: >
  The top level domain (TLD) also known as the domain suffix is the last part of the domain name.
Contributor

Just adding the word "effective", and some commas, should be enough here IMO :-)

Suggested change
- The top level domain (TLD) also known as the domain suffix is the last part of the domain name.
+ The effective top level domain (eTLD), also known as the domain suffix, is the last part of the domain name.

The top level domain (TLD) also known as the domain suffix is the last part of the domain name.
For example, the top level domain for google.com is "com".

The following groups of top level domain are maintained by the Internet Assigned Numbers Authority (IANA).
Contributor

I think this sentence and the subsequent list should be replaced by a mention of Mozilla's public suffix list. You can take inspiration from the registered_domain definition for wording.

Of course both approaches can be used to figure out the eTLD, but Mozilla's list is the more direct method. One file, and you're able to determine the eTLD. Pointing users to a slew of lists like this will discourage them from even trying, IMO.
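
As a rough illustration of the suffix-list approach, the sketch below picks the longest matching suffix from a small hand-written set. `SAMPLE_SUFFIXES`, `effective_tld`, and `registered_domain` are hypothetical names, and the set is only a stand-in for the real list: an actual implementation would load the full public suffix list and handle its wildcard and exception rules, which this sketch ignores.

```python
# Tiny hand-picked sample of suffix rules; an assumption for illustration only.
SAMPLE_SUFFIXES = {"com", "org", "net", "uk", "co.uk"}


def effective_tld(domain, suffixes=SAMPLE_SUFFIXES):
    """Return the longest suffix of `domain` found in `suffixes`, or None."""
    labels = domain.lower().rstrip(".").split(".")
    # Try the longest candidate first so that "co.uk" wins over "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return candidate
    return None


def registered_domain(domain, suffixes=SAMPLE_SUFFIXES):
    """Return the eTLD plus one label to its left, or None if unavailable."""
    etld = effective_tld(domain, suffixes)
    if etld is None:
        return None
    labels = domain.lower().rstrip(".").split(".")
    n = len(etld.split("."))
    if len(labels) <= n:
        return None  # the domain is itself a suffix, nothing is registered under it
    return ".".join(labels[-(n + 1):])


print(effective_tld("www.google.co.uk"))
print(registered_domain("www.google.co.uk"))
```

Trying the longest candidate first is what keeps the result aligned with registered_domain: the registered domain is always the effective top level domain plus one more label.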

@webmat
Contributor

webmat commented Sep 27, 2019

Here's a tip to preview the rendered docs, btw. New CI feature.

You can preview how the docs will look after this PR here: http://ecs_542.docs-preview.app.elstc.co/diff. Clicking on any link in there will take you to the relevant field set, including your changes :-)

Example for this PR's changes to URL: http://ecs_542.docs-preview.app.elstc.co/guide/en/ecs/master/ecs-url.html

webmat pushed a commit to webmat/ecs that referenced this pull request Oct 1, 2019
@webmat webmat mentioned this pull request Oct 1, 2019
@webmat webmat closed this in #572 Oct 3, 2019
webmat pushed a commit that referenced this pull request Oct 3, 2019
Added at:

- client.top_level_domain
- destination.top_level_domain
- dns.question.top_level_domain
- server.top_level_domain
- source.top_level_domain
- url.top_level_domain
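
For illustration only, an event carrying the new field next to its sibling fields might be assembled as in the sketch below. The helper `expand_dotted` and all sample values are hypothetical; only the field names come from ECS.

```python
def expand_dotted(flat):
    """Turn {"url.domain": "x"} into {"url": {"domain": "x"}}."""
    nested = {}
    for dotted, value in flat.items():
        node = nested
        *parents, leaf = dotted.split(".")
        for key in parents:
            # Create intermediate objects on demand.
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


# Sample values are made up for this sketch.
event = expand_dotted({
    "url.domain": "www.example.co.uk",
    "url.registered_domain": "example.co.uk",
    "url.top_level_domain": "co.uk",
    "dns.question.name": "www.example.co.uk",
    "dns.question.top_level_domain": "co.uk",
})
print(event)
```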
@webmat webmat mentioned this pull request Oct 3, 2019