Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert the Filebeat auditd module to ECS #10192

Merged
merged 22 commits into from
Jan 30, 2019
Merged

Conversation

webmat
Copy link
Contributor

@webmat webmat commented Jan 19, 2019

Caveats

Newly defined fields

  • user.audit.group.id
  • user.audit.id
  • user.effective.group.id
  • user.effective.id
  • user.filesystem.group.id
  • user.filesystem.id
  • user.owner.group.id
  • user.owner.id
  • user.saved.group.id
  • user.saved.id

Renames

  • auditd.log.acct => user.name
  • auditd.log.addr => source.address
  • auditd.log.agid => user.audit.group.id
  • auditd.log.arch => host.architecture
  • auditd.log.auid => user.audit.id
  • auditd.log.cmd => process.args (went from cmdline to args array, so no alias)
  • auditd.log.comm => process.name
  • auditd.log.dst => destination.address
  • auditd.log.egid => user.effective.group.id
  • auditd.log.euid => user.effective.id
  • auditd.log.exe => process.executable
  • auditd.log.fsgid => user.filesystem.group.id
  • auditd.log.fsuid => user.filesystem.id
  • auditd.log.geoip => source.geo.*
  • auditd.log.gid => user.group.id
  • auditd.log.laddr => destination.address
  • auditd.log.lport => destination.port
  • auditd.log.msg => message
  • auditd.log.ogid => user.owner.group.id
  • auditd.log.ouid => user.owner.id
  • auditd.log.pid => process.pid (long)
  • auditd.log.ppid => process.ppid (long)
  • auditd.log.record_type => event.action (lowercased)
  • auditd.log.res => event.outcome
  • auditd.log.rport => source.port
  • auditd.log.sgid => user.saved.group.id
  • auditd.log.src => source.address
  • auditd.log.suid => user.saved.id
  • auditd.log.terminal => user.terminal
  • auditd.log.tty => user.terminal
  • auditd.log.uid => user.id

TODO

  • Add more entries in the default log, to see more situations
  • Document in ecs-migration.yml
  • Add missing field definitions for fields that are migrating
  • Alias renamed fields to their ECS counterpart
  • Add field definitions for all of these nested user/group fields under user.
  • Changelog
  • Confirm if .tty and .terminal should really all go to process.terminal. Perhaps one is meant for user.terminal?

@webmat webmat requested a review from a team as a code owner January 19, 2019 03:34
@webmat webmat self-assigned this Jan 19, 2019
@webmat webmat added in progress Pull request is currently in progress. module Filebeat Filebeat ecs labels Jan 19, 2019
@ruflin ruflin mentioned this pull request Jan 19, 2019
Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on the idea

filebeat/module/auditd/log/ingest/pipeline.json Outdated Show resolved Hide resolved
@cwurm
Copy link
Contributor

cwurm commented Jan 21, 2019

Keeping user and group separate complicates the structure, I think:

  1. There's a lot of fields with two mentions of "user", e.g. user.audit.user.id. The additional, second user doesn't really add any new information though, we already know this is user information from the first. It could just be user.audit.id and still be unique and understandable.
  2. Not nesting the top-level group under user creates an inconsistency: Top-level group.id is its own object, but then all other group.id fields are nested under user. - begging the question of why the top-level one is not nested as well, as user.group.id. The other disadvantage is readability and searchability, e.g. in the JSON output user.id and user.group.id will be just a few rows apart and are clearly connected since they are part of the same top-level object, whereas user.id and group.id are in completely different parts of the event often dozens of lines apart and it's no longer easily possible to find them or know that they are connected.
    I think we don't need to dismiss the top-level group object completely though. It could be used to hold group information when the group is a standalone object, e.g. when a group has been changed. In this case though, all groups incl. the top-level group are just attributes of the user ("the user's group`, "the effective users's group", etc.).

In essence, what I'm proposing is that the user object be defined like:

"user": {
  "id": <real UID or primary SID>,
  "name": <id name>,
  
  "group": {
    "id": <real GID or primary group SID>
    "name": <id name>
  }

  "effective": <user object>,
  "audit": <user object>,
  etc.
}

@webmat
Copy link
Contributor Author

webmat commented Jan 21, 2019

@cwurm Yes, I see what you mean. And to complete the part related to the group a little more, you're proposing this?

"user": {
  "id": <real UID or primary SID>,
  "name": <id name>,
  
  "group": {
    "id": <real GID or primary group SID>
    "name": <id name>
    "audit": { "id": "...", "name": "..." }
    "effective": { "id": "...", "name": "..." }
    ...
  }
  "effective": <user object>,
  "audit": <user object>,
  etc.
}

@cwurm
Copy link
Contributor

cwurm commented Jan 21, 2019

@webmat I saw group as a part of the user object, so all of them would have one:

"user": {
  "id": <real UID or primary SID>,
  "name": <id name>,
  
  "group": {
    "id": <real GID or primary group SID>
    "name": <id name>
  }

  "effective": {
    "id": "...",  // effective user ID
    "name": "..." // effective user name
    "group": { "id": "...", "name": "..." } // effective user group ID and name
  },

  "audit": <user object>, // same as "effective", incl. a group object
  etc.
}

@cwurm
Copy link
Contributor

cwurm commented Jan 21, 2019

@webmat
Copy link
Contributor Author

webmat commented Jan 21, 2019

Ah, gotcha. Yes, this makes sense. I'll be going with this as well.

@webmat
Copy link
Contributor Author

webmat commented Jan 21, 2019

@MikePaquette What do you think of this way of nesting user/group data, when reporting on the entities at play to determine effective rights? (See Christoph's comment above)

@webmat
Copy link
Contributor Author

webmat commented Jan 21, 2019

When I try to move to this structure, I'm getting a mapping exception, because currently ECS defines user.group as a string.

I think we should actually consider making this change to ECS for 1.0.0 GA, and have user.group become a place where the group field set can be reused. The current user.group string field actually goes against our principle of not reusing the same name in a different way elsewhere.

@webmat
Copy link
Contributor Author

webmat commented Jan 21, 2019

To clarify, I would not go ahead and add all of these nested values for user/group IDs and names part of ECS for GA just yet. I'm proposing changing only this one field, user.group.

@webmat
Copy link
Contributor Author

webmat commented Jan 21, 2019

I created elastic/ecs#304 to discuss this. We could get this changed and re-imported this week.

In the meantime, I'll work around this and take this PR as far as it can go.

@webmat webmat requested a review from a team as a code owner January 21, 2019 20:44
@webmat
Copy link
Contributor Author

webmat commented Jan 21, 2019

Rebased, and temporarily introduced field name user.group_.id, for the purpose of the review.

Let's see if we decide to go ahead with elastic/ecs#304, to see what we do next.

@webmat webmat requested review from a team as code owners January 23, 2019 21:34
@webmat
Copy link
Contributor Author

webmat commented Jan 23, 2019

Rebased over #10275. Once #10275 is merged, this will need a final rebase.

@webmat
Copy link
Contributor Author

webmat commented Jan 24, 2019

Blocked by #10306.

@webmat webmat added the blocked label Jan 24, 2019
@webmat webmat changed the title WIP Convert the Filebeat auditd module to ECS Convert the Filebeat auditd module to ECS Jan 24, 2019
@webmat webmat added the review label Jan 24, 2019
@webmat
Copy link
Contributor Author

webmat commented Jan 29, 2019

Ready for final review.

New tonight:

  • migrate a few more fields that map to address and port (source and destination)
  • migrate cmd (command line string) to process.args (array)
  • map tty and terminal to user.terminal, not process.terminal
  • [BUG] fix a mistake with the module's main fields.yml file: it must end with the opening of the field group for each fileset. I had added global fields after it, which meant the fileset fields were nested completely in the wrong place.

@ycombinator ycombinator removed the request for review from a team January 29, 2019 05:16
Copy link
Contributor

@cwurm cwurm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just two questions on mapping multiple fields into one, and a fix needed for the module header in the reference files.

beat: filebeat

- from: auditd.log.tty
to: user.terminal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure terminal and tty will never be both filled? What would happen if they were?

Copy link
Contributor Author

@webmat webmat Jan 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree it's not perfect. Here's what I considered:

  • Based on the logs samples we have, only one of them is used at a time.
  • Based on the docs I've seen, there isn't a big difference between the two (one's the "terminal", the other's the "terminal device"). I get the sense different auditd event sources use one of the two interchangeably. Even if you look for terminal= in the rhel7 log file, you'll see "/dev/pts/0" and "pts/0" being used interchangeably.

Another thing we could do is append it instead. So if both are set at the same time, you'll get `user.terminal: ["pts0", "/dev/pts/0"] instead of one of the two being overwritten.

Or perhaps we can leave tty where it is for now, only map terminal to user.terminal for now and revisit this later.

The last option seems the most straightforward in that it's making sure we don't introduce a bug (the overwrite), while still taking a step towards normalization (at least mapping terminal).

Finally, the same potential overwrite is happening with the IP addresses. It's hard to tell from the PR body because the fields are sorted alphabetically. But in the pipeline it's more visible, as they're grouped by concern. Address manipulation is starting here

@@ -69,7 +69,7 @@ filebeat.modules:
# can be added under this section.
#input:

#-------------------------------- Auditd Module --------------------------------
#--------------------------------- User Module ---------------------------------
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this changed because the title of the first key in fields.yml is now User (see https://github.com/elastic/beats/blob/master/dev-tools/mage/config.go#L245). You'll have to change the order in that file.

@@ -1,3 +1,120 @@
- key: user
title: "User"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That this comes first causes filebeat.reference.yml to change Auditd Module to User Module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you're right, I'll fix. This file is tricky.

@@ -69,7 +69,7 @@ filebeat.modules:
# can be added under this section.
#input:

#-------------------------------- Auditd Module --------------------------------
#--------------------------------- User Module ---------------------------------
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@webmat
Copy link
Contributor Author

webmat commented Jan 29, 2019

Ok, here's the latest development. The idea is to move forward a bit without introducing potential bugs. We can keep making things better later (even after 7.0, I would think).

  • I fixed the issue with fields.yml causing weirdness in the docs
  • I undid the tty => user.terminal transition. Only terminal is mapped to user.terminal
  • I undid a network address/port mapping that was bugging me anyway. What bugged me: assuming "remote" was the "source". So I got rid of the mapping of the pair: addr, rport, laddr, lport
  • All fields for which I undid the migration are now documented, however.

I think it's good enough like this. The rest is incremental improvements that may require more in depth tought.

Copy link
Contributor

@cwurm cwurm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, and LGTM.

@webmat webmat merged commit f28cf30 into elastic:master Jan 30, 2019
@webmat webmat deleted the ecs-auditd-fb branch January 30, 2019 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ecs Filebeat Filebeat in progress Pull request is currently in progress. module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants