Enable heap snapshots for all our distributables #181363

Merged
merged 2 commits into from
Apr 23, 2024

Conversation

rudolf
Contributor

@rudolf rudolf commented Apr 22, 2024

Summary

Fixes #167955

To test:

# Build Kibana
node scripts/build --skip-os-packages --skip-cdn-assets --skip-canvas-shareable-runtime --skip-docker-ubi --skip-docker-ubuntu --skip-docker-fips --skip-node-download
# Run Kibana
cd build/default/kibana-8.15.0-SNAPSHOT-darwin-aarch64
./bin/kibana
# In a different terminal find the pid and send the signal
ps -e | grep bin/node
kill -s SIGUSR2 PID # use pid from ^
# verify heap snapshot is created
ls ./data
# expect to see a file like Heap.20240423.132053.37884.0.001.heapsnapshot
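The final verification step above can also be scripted. The following is a minimal sketch, assuming the snapshot is written into a directory you pass in (`./data` in the steps above). `latest_heap_snapshot` is a hypothetical helper name used for illustration, not something that exists in the Kibana repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kibana): print the most recently modified
# *.heapsnapshot file in a directory, or return non-zero if there is none.
latest_heap_snapshot() {
  local dir="${1:-./data}"
  local newest
  # ls -t sorts by modification time, newest first; errors from an
  # unmatched glob are discarded so an empty directory yields no output.
  newest="$(ls -t "$dir"/*.heapsnapshot 2>/dev/null | head -n 1)"
  [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

After sending the signal, something like `latest_heap_snapshot ./data` should print the path of the new snapshot, and its non-zero exit status can be used to fail a smoke test when no file was written.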


@rudolf rudolf added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc 8.15 candidate labels Apr 22, 2024
@rudolf rudolf marked this pull request as ready for review April 22, 2024 22:53
@rudolf rudolf requested review from a team as code owners April 22, 2024 22:53
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

@rudolf
Contributor Author

rudolf commented Apr 22, 2024

@elasticmachine merge upstream

@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@rudolf rudolf added the release_note:skip Skip the PR/issue when compiling release notes label Apr 23, 2024
Contributor

@dokmic dokmic left a comment

Tested it in serverless -- everything works as expected. LGTM 👍

Contributor

@jloleysens jloleysens left a comment

Thanks for updating this! LGTM. Also tested locally after building.

@rudolf rudolf merged commit 26b8c71 into main Apr 23, 2024
18 checks passed
@rudolf rudolf deleted the heap-snapshot branch April 23, 2024 12:33
@kibanamachine kibanamachine added v8.15.0 backport:skip This commit does not require backporting labels Apr 23, 2024
jbudz added a commit that referenced this pull request Apr 24, 2024
@jbudz
Member

jbudz commented Apr 24, 2024

@marius-dr found an issue with this on Windows. Reverted with f52db83.

 .\kibana.bat
node:internal/errors:541
      throw error;
      ^

TypeError [ERR_UNKNOWN_SIGNAL]: Unknown signal: SIGUSR2
    at initializeHeapSnapshotSignalHandlers (node:internal/process/pre_execution:430:34)
    at prepareExecution (node:internal/process/pre_execution:138:5)
    at prepareMainThreadExecution (node:internal/process/pre_execution:54:10)
    at node:internal/main/run_main_module:11:19 {
  code: 'ERR_UNKNOWN_SIGNAL'
}

@jbudz jbudz added the reverted label Apr 24, 2024
@marius-dr
Member

marius-dr commented Apr 24, 2024

Did some investigating. It looks like this is not supposed to work on Windows (see nodejs/node#27133), and that limitation is poorly documented.

@jloleysens
Contributor

Unfortunate, but it looks like the issue is that Windows does not know about SIGUSR2. It is still possible to capture heap snapshots there; we'll just have to figure out a different approach if we want this feature on by default for all platforms.
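One possible shape for such an approach is to pass the signal option only on platforms that have SIGUSR2. The sketch below is an assumption, not what this PR or its follow-up did: the thread does not show how the option is wired into the distributable. `heap_snapshot_flag` is a hypothetical helper, and the Windows-like `uname -s` patterns are illustrative. `--heapsnapshot-signal` is Node's own CLI flag, which is what `initializeHeapSnapshotSignalHandlers` in the stack trace above processes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not Kibana's implementation): print the Node option
# that enables signal-triggered heap snapshots, but only on platforms where
# SIGUSR2 exists. The OS name can be passed in to make the function testable.
heap_snapshot_flag() {
  local os="${1:-$(uname -s)}"
  case "$os" in
    # Assumed Windows-like environments: emit nothing, so Node does not
    # fail at startup with ERR_UNKNOWN_SIGNAL.
    MINGW*|MSYS*|CYGWIN*|Windows_NT) ;;
    *) printf '%s\n' '--heapsnapshot-signal=SIGUSR2' ;;
  esac
}
```

A launcher script could then append the result to its Node options, for example `NODE_OPTIONS="$NODE_OPTIONS $(heap_snapshot_flag)"`. A `.bat` launcher would simply omit the flag, leaving Windows users to capture snapshots by other means.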

kpatticha pushed a commit to kpatticha/kibana that referenced this pull request Apr 26, 2024
Labels
8.15 candidate backport:skip This commit does not require backporting release_note:skip Skip the PR/issue when compiling release notes reverted Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v8.15.0
Development

Successfully merging this pull request may close these issues.

Add ability to capture heap snapshots to all distributions
9 participants