
Kibana: Set default hardened security context #8086

Merged: 15 commits merged into elastic:main from naemono:kibana-default-security-context on Nov 5, 2024

Conversation

@naemono (Contributor) commented Oct 8, 2024

Closes: #7787

What is this change?

This sets the security context for Kibana to be more secure/hardened by default.
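For reference, a hardened container securityContext along these lines typically looks like the sketch below. The field values are illustrative of common Kubernetes hardening defaults (runAsNonRoot and a read-only root filesystem come up later in this thread), not necessarily the exact defaults applied by this change:

  securityContext:
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    allowPrivilegeEscalation: false
    privileged: false
    capabilities:
      drop:
        - ALL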

Testing/Todo

  • Test v8 Kibana in e2e tests (passed here)
  • Test v7 Kibana in e2e tests (passed here)
  • Test in OCP e2e tests (passed here)

@botelastic botelastic bot added the triage label Oct 8, 2024
@naemono (Contributor Author) commented Oct 8, 2024

buildkite test this -f p=gke -m s=7.17.8,s=8.15.2,s=8.16.0-SNAPSHOT

@thbkrkr thbkrkr added the >enhancement Enhancement of existing functionality label Oct 9, 2024
@botelastic botelastic bot removed the triage label Oct 9, 2024
@thbkrkr (Contributor) commented Oct 9, 2024

buildkite test this -f s=8.16.0-SNAPSHOT,E2E_TAGS=kb -m p=eks,p=aks,p=ocp

(I stopped the build before the end.) Result: a failure in TestKBStackMonitoring on OCP.

{
    "Time": "2024-10-09T09:20:54.911004408Z",
    "Action": "output",
    "Package": "github.com/elastic/cloud-on-k8s/v2/test/e2e/kb",
    "Test": "TestKBStackMonitoring/Kibana_Pods_should_eventually_be_ready",
    "Output": "=== RUN   TestKBStackMonitoring/Kibana_Pods_should_eventually_be_ready"
}
> k get po test-kb-mon-a-jcrc-kb-7f456d5f6-5fqpm
NAME                                    READY   STATUS                       RESTARTS   AGE
test-kb-mon-a-jcrc-kb-7f456d5f6-5fqpm   1/3     CreateContainerConfigError   0          30m
> k get po test-kb-mon-a-jcrc-kb-7f456d5f6-5fqpm  -o yaml | yq '.status.containerStatuses[] | select(.started == false)'
image: docker.elastic.co/beats/filebeat:8.16.0-SNAPSHOT
imageID: ""
lastState: {}
name: filebeat
ready: false
restartCount: 0
started: false
state:
  waiting:
    message: 'container has runAsNonRoot and image has non-numeric user (filebeat), cannot verify user is non-root (pod: "test-kb-mon-a-jcrc-kb-7f456d5f6-5fqpm_e2e-3ijnz-mercury(75bccf29-bf65-4b6c-8409-cbb333a25076)", container: filebeat)'
    reason: CreateContainerConfigError
image: docker.elastic.co/beats/metricbeat:8.16.0-SNAPSHOT
imageID: ""
lastState: {}
name: metricbeat
ready: false
restartCount: 0
started: false
state:
  waiting:
    message: 'container has runAsNonRoot and image has non-numeric user (metricbeat), cannot verify user is non-root (pod: "test-kb-mon-a-jcrc-kb-7f456d5f6-5fqpm_e2e-3ijnz-mercury(75bccf29-bf65-4b6c-8409-cbb333a25076)", container: metricbeat)'
    reason: CreateContainerConfigError

I don't yet understand the link between the change in this PR and this error.

@naemono (Contributor Author) commented Oct 10, 2024

> I don't yet understand the link between the change in this PR and this error.

Thanks @thbkrkr I will investigate tomorrow and get to the bottom of it.

@naemono (Contributor Author) commented Oct 10, 2024

> I don't yet understand the link between the change in this PR and this error.
>
> Thanks @thbkrkr I will investigate tomorrow and get to the bottom of it.

I've duplicated this issue:

  - image: docker.elastic.co/beats/metricbeat:8.16.0-SNAPSHOT
    imageID: ""
    lastState: {}
    name: metricbeat
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        message: 'container has runAsNonRoot and image has non-numeric user (metricbeat),
          cannot verify user is non-root (pod: "test-kb-mon-a-6gpk-kb-7b98f59fff-dvw7f_e2e-mercury(289f153f-cfc8-454a-be52-a238b3258640)",
          container: metricbeat)'
        reason: CreateContainerConfigError

Investigating...

@naemono (Contributor Author) commented Oct 10, 2024

> Investigating...

I suspect the recent Wolfi image changes broke this: elastic/beats@b06f7ce#diff-fee1e8c36ba46c30289f0ce62904e6cf695d7f1ac327d92588b0a73b188e2f0f

This was predicted here

Verifying, and I'll work with the Beats team to get it fixed.
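For context on the error itself: when runAsNonRoot is set and the image declares its USER by name rather than by numeric UID, the kubelet cannot verify the user is non-root and fails the container with CreateContainerConfigError. A rough podTemplate-level workaround would be to pin a numeric UID on the sidecar; the UID below is only an example, and this assumes the operator merges container overrides by name. The real fix is on the Beats image side, as noted above.

  spec:
    podTemplate:
      spec:
        containers:
          - name: filebeat
            securityContext:
              runAsNonRoot: true
              runAsUser: 1000  # illustrative numeric UID so the kubelet can verify non-root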

@naemono (Contributor Author) commented Oct 16, 2024

buildkite test this -f s=8.16.0-SNAPSHOT,E2E_TAGS=kb -m p=ocp

@pebrc (Collaborator) commented Oct 17, 2024

buildkite test this -f s=8.16.0-SNAPSHOT,E2E_TAGS=kb -m p=ocp

@naemono (Contributor Author) commented Oct 22, 2024

buildkite test this -f s=8.16.0-SNAPSHOT,E2E_TAGS=kb -m p=ocp

@naemono naemono marked this pull request as ready for review October 24, 2024 12:56
@pebrc (Collaborator) commented Oct 24, 2024

I think you need to version gate this to 7.x or greater. We still need to support 6.x, and Kibana does not work with the restrictive security context in 6.x because it tries to write to the root file system.

fs.js:114
    throw err;
    ^

Error: EROFS: read-only file system, open '/usr/share/kibana/optimize/.babelcache.json'
    at Object.openSync (fs.js:443:3)
    at Object.writeFileSync (fs.js:1194:35)
    at save (/usr/share/kibana/node_modules/babel-register/lib/cache.js:48:16)
    at process._tickCallback (internal/process/next_tick.js:61:11)
    at Function.Module.runMain (internal/modules/cjs/loader.js:834:11)
    at startup (internal/bootstrap/node.js:283:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:623:3)

@pebrc (Collaborator) commented Oct 24, 2024

This also affects more recent versions. I am testing 7.13 here, and it seems that while Kibana comes up with the security context, the reporting feature is broken:

Error: EROFS: read-only file system, mkdtemp '/tmp/chromium-XXXXXX'
    at Object.mkdtempSync (fs.js:1954:3)
    at new HeadlessChromiumDriverFactory (/usr/share/kibana/x-pack/plugins/reporting/server/browsers/chromium/driver_factory/index.js:131:36)
    at Object.createDriverFactory (/usr/share/kibana/x-pack/plugins/reporting/server/browsers/chromium/index.js:29:63)
    at initializeBrowserDriverFactory (/usr/share/kibana/x-pack/plugins/reporting/server/browsers/index.js:49:29)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at /usr/share/kibana/x-pack/plugins/reporting/server/plugin.js:113:36 {
  errno: -30,
  syscall: 'mkdtemp',
  code: 'EROFS',
  path: '/tmp/chromium-XXXXXX'
}
{"type":"log","@timestamp":"2024-10-24T13:46:06+00:00","tags":["info","plugins","securitySolution"],"pid":951,"message":"Dependent plugin setup complete - Starting ManifestTask"}

So this probably requires an emptyDir volume mounted at /tmp in order to work.
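Roughly, that would look like the following podTemplate override on the Kibana resource (the volume name is illustrative and this is a sketch; the operator may end up wiring this differently):

  spec:
    podTemplate:
      spec:
        containers:
          - name: kibana
            volumeMounts:
              - name: tmp
                mountPath: /tmp
        volumes:
          - name: tmp
            emptyDir: {}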

Oddly, the status API says reporting is healthy; I'm not sure whether this is trustworthy though:

{
  "id": "plugin:[email protected]",
  "message": "All dependencies are available",
  "since": "2024-10-24T13:51:51.188Z",
  "state": "green",
  "icon": "success",
  "uiColor": "secondary"
},

@pebrc (Collaborator) commented Oct 24, 2024

> I think you need to version gate this to 7.x or greater. We still need to support 6.x, and Kibana does not work with the restrictive security context in 6.x because it tries to write to the root file system.

fs.js:114
    throw err;
    ^

Error: EROFS: read-only file system, open '/usr/share/kibana/optimize/.babelcache.json'
    at Object.openSync (fs.js:443:3)
    at Object.writeFileSync (fs.js:1194:35)
    at save (/usr/share/kibana/node_modules/babel-register/lib/cache.js:48:16)
    at process._tickCallback (internal/process/next_tick.js:61:11)
    at Function.Module.runMain (internal/modules/cjs/loader.js:834:11)
    at startup (internal/bootstrap/node.js:283:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:623:3)

This is also still the case for 7.0. So I think you have to run an upgrade test all the way up to the 7.x version you tested in CI, to figure out from which version onward Kibana stops writing into random places in the container.

@naemono (Contributor Author) commented Oct 25, 2024

From my testing, @pebrc, 7.5.0 is the first version that seems to work fully with the restricted security context and a /tmp volume, although I'm going to continue testing upgrades throughout the 7.5+ versions to make sure of this.

@naemono (Contributor Author) commented Oct 25, 2024

I've tested every minor version from 7.0 -> 7.17.0 and these are my findings:

From 7.6.0 -> 7.9.x we see:

Babel could not write cache to file: /usr/share/kibana/optimize/.babel_register_cache.json
because it resides in a readonly filesystem. Cache is disabled.

I will reach out about what this means, and whether it's advisable to enable the hardened context in this version range.

This message stops appearing in 7.10.

At 7.15 I see this:

{"type":"log","@timestamp":"2024-10-25T16:13:29+00:00","tags":["info","plugins","ruleRegistry"],"pid":1215,"message":"Write is disabled; not installing common resources shared between all indices"}
{"type":"log","@timestamp":"2024-10-25T16:13:30+00:00","tags":["info","plugins","ruleRegistry"],"pid":1215,"message":"Write is disabled; not installing resources for index .alerts-observability.uptime.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:13:30+00:00","tags":["info","plugins","ruleRegistry"],"pid":1215,"message":"Write is disabled; not installing resources for index .alerts-observability.logs.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:13:30+00:00","tags":["info","plugins","ruleRegistry"],"pid":1215,"message":"Write is disabled; not installing resources for index .alerts-observability.metrics.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:13:30+00:00","tags":["info","plugins","ruleRegistry"],"pid":1215,"message":"Write is disabled; not installing resources for index .alerts-observability.apm.alerts"}

I'm unsure what effects this may have; I will reach out. I don't see this in 7.16.x/7.17.x; instead I see this:

{"type":"log","@timestamp":"2024-10-25T16:18:44+00:00","tags":["info","plugins","ruleRegistry"],"pid":7,"message":"Installed common resources shared between all indices"}
{"type":"log","@timestamp":"2024-10-25T16:18:44+00:00","tags":["info","plugins","ruleRegistry"],"pid":7,"message":"Installing resources for index .alerts-observability.uptime.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:18:44+00:00","tags":["info","plugins","ruleRegistry"],"pid":7,"message":"Installing resources for index .alerts-observability.logs.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:18:44+00:00","tags":["info","plugins","ruleRegistry"],"pid":7,"message":"Installing resources for index .alerts-observability.metrics.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:18:44+00:00","tags":["info","plugins","ruleRegistry"],"pid":7,"message":"Installing resources for index .alerts-observability.apm.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:18:46+00:00","tags":["info","plugins","ruleRegistry"],"pid":7,"message":"Installed resources for index .alerts-observability.apm.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:18:46+00:00","tags":["info","plugins","ruleRegistry"],"pid":7,"message":"Installed resources for index .alerts-observability.uptime.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:18:46+00:00","tags":["info","plugins","ruleRegistry"],"pid":7,"message":"Installed resources for index .alerts-observability.metrics.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:18:46+00:00","tags":["info","plugins","ruleRegistry"],"pid":7,"message":"Installed resources for index .alerts-observability.logs.alerts"}
{"type":"log","@timestamp":"2024-10-25T16:18:47+00:00","tags":["info","plugins","securitySolution"],"pid":7,"message":"Dependent plugin setup complete - Starting ManifestTask"}

Update logic issue with default builder.

@naemono (Contributor Author) commented Oct 25, 2024

> From 7.6.0 -> 7.9.x we see:

Babel could not write cache to file: /usr/share/kibana/optimize/.babel_register_cache.json
because it resides in a readonly filesystem. Cache is disabled.

We're told that this has no runtime impact; the only effect is slightly slower startup times when the pod is restarted.

@naemono (Contributor Author) commented Oct 25, 2024

> At 7.15 I see this:

{"type":"log","@timestamp":"2024-10-25T16:13:29+00:00","tags":["info","plugins","ruleRegistry"],"pid":1215,"message":"Write is disabled; not installing common resources shared between all indices"}

We've been instructed that these messages can be ignored.

@naemono (Contributor Author) commented Oct 29, 2024

@pebrc there's a final piece to this from working with the Kibana team. We can't set the root filesystem read-only without additionally having a 'plugins' emptyDir set up. From what I see, we don't test any plugin installations in either Elasticsearch or Kibana, so our e2e tests didn't catch this. I'm going to see what it would take to test a simple plugin installation in our e2e tests.
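For illustration, the extra volume amounts to something like the sketch below. The kibana-plugins name matches the volume that shows up in the pod spec further down this thread; the /usr/share/kibana/plugins mount path is an assumption based on the default Kibana image layout, not necessarily the exact path the operator uses.

  spec:
    podTemplate:
      spec:
        containers:
          - name: kibana
            volumeMounts:
              - name: kibana-plugins
                mountPath: /usr/share/kibana/plugins
        volumes:
          - name: kibana-plugins
            emptyDir: {}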

@naemono (Contributor Author) commented Oct 30, 2024

> I'm going to see what it would take to test a simple plugin installation in our e2e tests.

This likely isn't going to happen in this PR, as there's really no plugin that we'd want to enable that is released for every version of Kibana we want to support. While manually testing the plugin installation, I've run into a bit of an issue that I'm working through and will update shortly.

@naemono (Contributor Author) commented Oct 30, 2024

@pebrc Ok, I tested the most recent changes manually, since the e2e approach wasn't going to be supportable across many versions of Kibana + plugin.

Tested using Kibana version 8.11.4

❯ kc get kb -n elastic testing -o yaml | yq '.spec.podTemplate.spec.initContainers'
- command:
    - sh
    - -c
    - |
      bin/kibana-plugin install https://github.com/fbaligand/kibana-enhanced-table/releases/download/v1.14.0/enhanced-table-1.14.0_8.11.4.zip
  name: install-plugins
  resources: {}

❯ kc logs -n elastic testing-kb-7b597b5bc9-nf96k -c install-plugins
Attempting to transfer from https://github.com/fbaligand/kibana-enhanced-table/releases/download/v1.14.0/enhanced-table-1.14.0_8.11.4.zip
Transferring 1041964 bytes....................
Transfer complete
Retrieving metadata from plugin archive
Extracting plugin archive
Extraction complete
Plugin installation complete

❯ kc get pod -n elastic
NAME                          READY   STATUS    RESTARTS   AGE
testing-es-masters-0          1/1     Running   0          21h
testing-es-masters-1          1/1     Running   0          21h
testing-es-masters-2          1/1     Running   0          21h
testing-kb-7b597b5bc9-nf96k   1/1     Running   0          4m41s

❯ kc get pod -n elastic -l common.k8s.elastic.co/type=kibana -o yaml | yq '.items[0].spec.volumes[]|select(.name == "kibana-plugins")'
emptyDir: {}
name: kibana-plugins

Pods come online without issues.

I also tested this on 7.10.2, and all went without issues.

@pebrc pebrc added the v2.16.0 label Oct 31, 2024
Adjust comments.

@naemono naemono merged commit 46b5630 into elastic:main Nov 5, 2024
5 checks passed
@naemono naemono deleted the kibana-default-security-context branch November 5, 2024 16:24
Labels
>enhancement Enhancement of existing functionality v2.16.0
Development

Successfully merging this pull request may close these issues.

Hardened Security Context for Kibana
3 participants