Heartbeat on Agent failed to start due to permission denied #33292

Closed
aleksmaus opened this issue Oct 10, 2022 · 8 comments · Fixed by #33584
Labels
Agent bug Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team

Comments

@aleksmaus
Member

This issue is created for SDH https://github.com/elastic/sdh-beats/issues/2585.
Confirmed for the beats version 8.3.3.

Heartbeat intermittently fails to start with:

`Exiting: error setting default paths: failed to create data path /appl/cep/bin/data/Elastic/Agent/data/elastic-agent-0ffbed/run/84e76000-3e38-11ed-88e6-afd537e012ac/heartbeat--8.3.3: mkdir /appl/cep/bin/data/Elastic: permission denied\n`

The latest theory is that it has something to do with when and how the heartbeat seccomp filter is applied. Depending on the timing, heartbeat might try to create its paths while it is still restricted from calling the mkdirat syscall.

Here is the latest comment from the SDH:

Here is another piece of the puzzle: the file system audit log captured when the problem happens.
You can see that the process uid is root (0),
and that the mkdirat syscall fails with EACCES on /home/amaus/work/elastic-agent-8.3.3-linux-x86_64444/data/elastic-agent-0ffbed/run

type=UNKNOWN[1420] msg=audit(10/08/2022 18:14:32.425:3573) : subj_apparmor=unconfined 
type=PROCTITLE msg=audit(10/08/2022 18:14:32.425:3573) : proctitle=/home/amaus/work/elastic-agent-8.3.3-linux-x86_64444/data/elastic-agent-0ffbed/install/heartbeat-8.3.3-linux-x86_64/heartbeat -E 
type=PATH msg=audit(10/08/2022 18:14:32.425:3573) : item=1 name=/home/amaus/work/elastic-agent-8.3.3-linux-x86_64444/data/elastic-agent-0ffbed/run obj=? nametype=CREATE cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 cap_frootid=0 
type=UNKNOWN[1421] msg=audit(10/08/2022 18:14:32.425:3573) :

type=PATH msg=audit(10/08/2022 18:14:32.425:3573) : item=0 name=/home/amaus/work/elastic-agent-8.3.3-linux-x86_64444/data/elastic-agent-0ffbed/ inode=7607811 dev=08:05 mode=dir,775 ouid=amaus ogid=amaus rdev=00:00 obj=? nametype=PARENT cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 cap_frootid=0 
type=UNKNOWN[1421] msg=audit(10/08/2022 18:14:32.425:3573) :

type=CWD msg=audit(10/08/2022 18:14:32.425:3573) : cwd=/home/amaus/work/elastic-agent-8.3.3-linux-x86_64444/data/elastic-agent-0ffbed/install/heartbeat-8.3.3-linux-x86_64 
type=SYSCALL msg=audit(10/08/2022 18:14:32.425:3573) : arch=x86_64 syscall=mkdirat success=no exit=EACCES(Permission denied) a0=0xffffffffffffff9c a1=0xc00010e300 a2=0750 a3=0x7f1610425cc8 items=2 ppid=259214 pid=259245 auid=amaus uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=pts3 ses=3 comm=heartbeat exe=/home/amaus/work/elastic-agent-8.3.3-linux-x86_64444/data/elastic-agent-0ffbed/install/heartbeat-8.3.3-linux-x86_64/heartbeat subj=? key=elagent 
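To make records like the SYSCALL line above easier to compare across runs, the key=value pairs can be pulled into a map. This is only a rough sketch: `parseAuditFields` is a hypothetical helper (not part of beats or auditd), the sample record is abbreviated from the one above, and it assumes the interpreted format shown here, where a value such as `EACCES(Permission denied)` may contain a space.

```go
package main

import (
	"fmt"
	"strings"
)

// parseAuditFields is a hypothetical helper that splits the key=value pairs
// following the "msg=audit(...) : " prefix of an audit record into a map.
// A token without "=" is treated as the continuation of the previous value,
// which handles values like "EACCES(Permission denied)".
func parseAuditFields(record string) map[string]string {
	fields := make(map[string]string)
	// Drop the "type=... msg=audit(...) : " prefix if present.
	if i := strings.Index(record, ") : "); i >= 0 {
		record = record[i+len(") : "):]
	}
	lastKey := ""
	for _, tok := range strings.Fields(record) {
		parts := strings.SplitN(tok, "=", 2)
		if len(parts) == 2 {
			lastKey = parts[0]
			fields[lastKey] = parts[1]
		} else if lastKey != "" {
			fields[lastKey] += " " + tok
		}
	}
	return fields
}

func main() {
	// Abbreviated copy of the SYSCALL record from the audit log above.
	rec := "type=SYSCALL msg=audit(10/08/2022 18:14:32.425:3573) : " +
		"arch=x86_64 syscall=mkdirat success=no exit=EACCES(Permission denied) " +
		"a2=0750 uid=root euid=root comm=heartbeat key=elagent"
	f := parseAuditFields(rec)
	fmt.Println(f["syscall"], f["success"], f["exit"], f["euid"])
}
```

The fields that matter for this issue are `syscall`, `exit`, and `euid`: the call is made as root and still fails with EACCES.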

I also noticed that heartbeat defines the seccomp filter in:
https://github.com/elastic/beats/blob/main/heartbeat/security/policy_linux_amd64.go
that includes mkdirat.

I'm not familiar enough with the libbeat initialization order, with heartbeat, or with how seccomp is used there, but one possible concern is that, depending on the timing, the paths initialization in heartbeat could call mkdirat before this syscall has been allow-listed by the seccomp filter.

Thoughts? Maybe you, or anybody who has worked more closely on heartbeat, might know about this.

@fearful-symmetry or @pierrehilbert I could dig further, or maybe you have somebody on the beats team who knows more about this area. I'll assign this ticket to you both for now, feel free to reassign as needed.

@botelastic bot added the needs_team label (indicates that the issue/PR needs a Team:* label) on Oct 10, 2022
@fearful-symmetry
Contributor

So, I found the issue: commenting out this line seems to fix it (or at least the instance of the issue I can easily reproduce):

_ = setCapabilities()

Considering that I only seem to hit this when running as root, I assume the issue is the result of us dropping capabilities that allow us to read/write from files/directories not owned by the user. Will investigate more tomorrow, although that line has been there for a while, so I wonder if the issue is a tad more subtle.
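One way to check whether a running process still holds a given capability is to read the `CapEff` mask from `/proc/<pid>/status` and test the relevant bit. The sketch below is an illustration only, not heartbeat code: `hasCapability` and `effectiveCaps` are hypothetical helper names, and it assumes Linux, where CAP_DAC_OVERRIDE is capability number 1.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capDACOverride is the Linux capability number of CAP_DAC_OVERRIDE.
const capDACOverride = 1

// hasCapability reports whether bit capNum is set in a hexadecimal
// capability mask as printed in /proc/<pid>/status.
func hasCapability(hexMask string, capNum uint) (bool, error) {
	mask, err := strconv.ParseUint(strings.TrimSpace(hexMask), 16, 64)
	if err != nil {
		return false, err
	}
	return mask&(1<<capNum) != 0, nil
}

// effectiveCaps returns the CapEff value of the current process (Linux only).
func effectiveCaps() (string, error) {
	f, err := os.Open("/proc/self/status")
	if err != nil {
		return "", err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "CapEff:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "CapEff:")), nil
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return "", errors.New("CapEff line not found")
}

func main() {
	mask, err := effectiveCaps()
	if err != nil {
		fmt.Println("could not read capabilities:", err)
		return
	}
	ok, err := hasCapability(mask, capDACOverride)
	if err != nil {
		fmt.Println("could not parse mask:", err)
		return
	}
	fmt.Printf("CapEff=%s CAP_DAC_OVERRIDE=%v\n", mask, ok)
}
```

Running a check like this inside the process before and after capabilities are dropped would show whether the bit disappears, which is what the theory above predicts.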

@emilioalvap
Collaborator

@fearful-symmetry We shouldn't really be relying on capabilities to have access to the underlying file system afaik, that should be enabled by file permission levels. We had to do some changes back in 8.2 to fix those, this problem could be related to that.

I'll have a look at it too.

@emilioalvap
Collaborator

@fearful-symmetry @aleksmaus Going back on my previous answer: it does seem that dropping CAP_DAC_OVERRIDE prevents us from creating files in user-defined directories where the user (root, in this case) does not have access by uid/gid. This traces back to 7.16.

This will be an issue on agent deployments if heartbeat is selected as the first beat to initialize data directories.

@andrewvc We could add CAP_DAC_OVERRIDE to our required capabilities, but we probably should change elastic-agent docker template to add it to the extracted beat. Wdyt?
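To illustrate why root is refused here, the following sketch evaluates only the classic owner/group/other permission bits, which is roughly what the kernel falls back to once CAP_DAC_OVERRIDE is gone. It is a simplification under stated assumptions: `canWriteByMode` is a hypothetical helper, supplementary groups, ACLs, and LSMs are ignored, and uid/gid 1000 is a made-up stand-in for the directory owner `amaus` from the audit log.

```go
package main

import (
	"fmt"
	"os"
)

// canWriteByMode reports whether a process with the given uid/gid could
// create entries in a directory based on the permission bits alone, that is,
// without CAP_DAC_OVERRIDE. Creating an entry needs both write and search
// (execute) permission on the parent directory.
func canWriteByMode(mode os.FileMode, ownerUID, ownerGID, uid, gid uint32) bool {
	perm := mode.Perm()
	const wx = 0o3 // write + search bits within one rwx triplet
	switch {
	case uid == ownerUID:
		return perm>>6&wx == wx // owner triplet
	case gid == ownerGID:
		return perm>>3&wx == wx // group triplet
	default:
		return perm&wx == wx // other triplet
	}
}

func main() {
	// Parent directory from the audit log: mode 775, owned by amaus:amaus.
	// 1000 is an assumed uid/gid for amaus, used only for illustration.
	const amaus = 1000
	fmt.Println("root without CAP_DAC_OVERRIDE:", canWriteByMode(0o775, amaus, amaus, 0, 0))
	fmt.Println("amaus:", canWriteByMode(0o775, amaus, amaus, amaus, amaus))
}
```

With mode 775 the "other" triplet is r-x, so a root process that is neither the owner nor in the owning group has no write bit to rely on, which matches the EACCES in the audit record.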

@fearful-symmetry
Contributor

> This will be an issue on agent deployments if heartbeat is selected as the first beat to initialize data directories.

Yep, this is where I ran into this.

> but we probably should change elastic-agent docker template to add it to the extracted beat.

Seconding this, although it needs to happen natively as well, and not just docker. I assume there isn't really any use case for running heartbeat as root, so we should look at some solution in elastic-agent itself and not just docker for making sure heartbeat has a coherent set of permissions.

@cmacknz added the Team:obs-ds-hosted-services label (label for the Observability Hosted Services team) on Oct 11, 2022
@elasticmachine
Collaborator

Pinging @elastic/uptime (Team:Uptime)

@botelastic bot removed the needs_team label (indicates that the issue/PR needs a Team:* label) on Oct 11, 2022
@andrewvc
Contributor

As a heads-up, @emilioalvap and I are going to discuss this in detail next week; there are a lot of moving pieces here.

@fearful-symmetry
Contributor

@emilioalvap / @andrewvc is there an update on when a fix for this might get rolled out?

@emilioalvap
Collaborator

@fearful-symmetry should have more info by EOD, we are going to discuss this one today


7 participants