Fix k8s scheduler compatibility issue #19699

Merged
merged 3 commits into from
Jul 8, 2020

Conversation

ChrsMark
Member

@ChrsMark ChrsMark commented Jul 7, 2020

What does this PR do?

This PR introduces OpSetSuffix to solve a compatibility issue in the Kubernetes Scheduler metricset.
In older versions of k8s, the Prometheus metric scheduler_pod_preemption_victims was a Gauge, but it was recently changed to a Histogram. Until now we have been mapping this Gauge metric to the scheduling.pod.preemption.victims.count field, which makes it impossible for the module to parse Prometheus metrics from newer versions of the Scheduler, where this metric is a Histogram. With this PR, OpSetSuffix checks whether the value of this metric is numeric and, if so, extends the field name with a count suffix; otherwise the metric is a Histogram and we do nothing. As a result, scheduling.pod.preemption.victims.count is numeric in all cases: Histograms create this field by default, and when the metric is a Gauge, OpSetSuffix does the mapping.
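
The numeric check described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this PR: the helper name withSuffix, its signature, and the map used to stand in for a Histogram value are all assumptions made for the example.

```go
package main

import "fmt"

// withSuffix is a hypothetical helper, not the Beats implementation.
// If value is a plain number (the Gauge case) it appends suffix to the
// field name; any other value (for example a histogram object) leaves
// the field name unchanged.
func withSuffix(field, suffix string, value interface{}) string {
	switch value.(type) {
	case int, int64, uint64, float64:
		return field + "." + suffix
	default:
		return field
	}
}

func main() {
	const field = "scheduling.pod.preemption.victims"

	// Older scheduler: the metric is a Gauge, so the value is a number
	// and the field name gains the suffix.
	fmt.Println(withSuffix(field, "count", float64(2)))

	// Newer scheduler: the metric is a Histogram. The map is an assumed
	// stand-in for a histogram value that already carries its own count.
	hist := map[string]interface{}{"count": uint64(5), "sum": 7.0}
	fmt.Println(withSuffix(field, "count", hist))
}
```

In both cases a consumer ends up with a numeric scheduling.pod.preemption.victims.count: the Gauge path gets it from the suffix, and the Histogram path gets it from the count sub-field of the histogram itself.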

Why is it important?

This preserves compatibility with both old and new versions of the Prometheus metrics exported by the Kubernetes Scheduler.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

@ChrsMark ChrsMark requested review from jsoriano and exekias July 7, 2020 12:53
@ChrsMark ChrsMark self-assigned this Jul 7, 2020
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jul 7, 2020
Signed-off-by: chrismark <[email protected]>
@ChrsMark ChrsMark added the Team:Platforms Label for the Integrations - Platforms team label Jul 7, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jul 7, 2020
@elasticmachine
Collaborator

elasticmachine commented Jul 7, 2020

💔 Tests Failed



Build stats

  • Build Cause: [Pull request #19699 updated]

  • Start Time: 2020-07-07T14:54:59.931+0000

  • Duration: 64 min 44 sec

Test stats 🧪

Test Results
Failed 1
Passed 2948
Skipped 686
Total 3635

Test errors


  • Name: Build and Test / Metricbeat x-pack / Metricbeat x-pack / TestFetch – stats

    • Age: 2
    • Duration: 0
    • Error Details: Failed

Steps errors


  • Name: Mage build test
    • Description: mage build test

    • Duration: 37 min 1 sec

    • Start Time: 2020-07-07T15:20:32.922+0000

    • log

Log output

Expand to view the last 100 lines of log output

[2020-07-07T15:58:15.003Z] + OS=linux
[2020-07-07T15:58:15.003Z] + mkdir -p /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/bin
[2020-07-07T15:58:15.003Z] + curl -sSLo - https://releases.hashicorp.com/terraform/0.12.24/terraform_0.12.24_linux_amd64.zip
[2020-07-07T15:58:15.952Z] ++ dirname /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/bin/terraform
[2020-07-07T15:58:15.952Z] + unzip -o /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/bin/terraform.zip -d /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/bin
[2020-07-07T15:58:15.952Z] Archive:  /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/bin/terraform.zip
[2020-07-07T15:58:16.212Z]   inflating: /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/bin/terraform  
[2020-07-07T15:58:16.212Z] + rm /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/bin/terraform.zip
[2020-07-07T15:58:16.212Z] + chmod +x /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/bin/terraform
[2020-07-07T15:58:16.567Z] + make mage
[2020-07-07T15:58:16.567Z] Installing mage v1.9.0.
[2020-07-07T15:58:17.137Z] go: finding github.com/magefile/mage v1.9.0
[2020-07-07T15:58:17.397Z] go: downloading github.com/magefile/mage v1.9.0
[2020-07-07T15:58:17.656Z] go: extracting github.com/magefile/mage v1.9.0
[2020-07-07T15:58:18.603Z] /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/.magefile cleaned
[2020-07-07T15:58:18.946Z] + git config --get user.email
[2020-07-07T15:58:18.946Z] + [ -z  ]
[2020-07-07T15:58:18.946Z] + git config user.email [email protected]
[2020-07-07T15:58:18.946Z] + git config user.name beatsmachine
[2020-07-07T15:58:19.342Z] + .ci/scripts/terraform-cleanup.sh x-pack/metricbeat
[2020-07-07T15:58:19.342Z] + DIRECTORY=x-pack/metricbeat
[2020-07-07T15:58:19.342Z] + FAILED=0
[2020-07-07T15:58:19.342Z] ++ find x-pack/metricbeat -name terraform.tfstate
[2020-07-07T15:58:19.342Z] + exit 0
[2020-07-07T15:58:19.827Z] + curl -sSLo codecov https://codecov.io/bash
[2020-07-07T15:58:20.086Z] + FILE=auditbeat/build/coverage/full.cov
[2020-07-07T15:58:20.086Z] + [ -f auditbeat/build/coverage/full.cov ]
[2020-07-07T15:58:20.086Z] + FILE=filebeat/build/coverage/full.cov
[2020-07-07T15:58:20.086Z] + [ -f filebeat/build/coverage/full.cov ]
[2020-07-07T15:58:20.086Z] + FILE=heartbeat/build/coverage/full.cov
[2020-07-07T15:58:20.086Z] + [ -f heartbeat/build/coverage/full.cov ]
[2020-07-07T15:58:20.086Z] + FILE=libbeat/build/coverage/full.cov
[2020-07-07T15:58:20.086Z] + [ -f libbeat/build/coverage/full.cov ]
[2020-07-07T15:58:20.086Z] + FILE=metricbeat/build/coverage/full.cov
[2020-07-07T15:58:20.086Z] + [ -f metricbeat/build/coverage/full.cov ]
[2020-07-07T15:58:20.086Z] + FILE=packetbeat/build/coverage/full.cov
[2020-07-07T15:58:20.086Z] + [ -f packetbeat/build/coverage/full.cov ]
[2020-07-07T15:58:20.086Z] + FILE=winlogbeat/build/coverage/full.cov
[2020-07-07T15:58:20.086Z] + [ -f winlogbeat/build/coverage/full.cov ]
[2020-07-07T15:58:20.086Z] + FILE=journalbeat/build/coverage/full.cov
[2020-07-07T15:58:20.086Z] + [ -f journalbeat/build/coverage/full.cov ]
[2020-07-07T15:58:20.619Z] Failed in branch Metricbeat x-pack
[2020-07-07T15:58:20.787Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats
[2020-07-07T15:58:21.110Z] + find . -type f -name TEST*.xml -path */build/* -delete
[2020-07-07T15:58:21.123Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Lint
[2020-07-07T15:58:21.226Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-OSS-Integration-tests
[2020-07-07T15:58:21.322Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-Python-integration-tests
[2020-07-07T15:58:21.413Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-OSS-Unit-tests
[2020-07-07T15:58:21.492Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-Mac-OS-X
[2020-07-07T15:58:21.585Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-crosscompile
[2020-07-07T15:58:21.674Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-x-pack-Mac-OS-X
[2020-07-07T15:58:21.758Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-Windows
[2020-07-07T15:58:21.837Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-x-pack-Windows
[2020-07-07T15:58:21.921Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-x-pack
[2020-07-07T15:58:22.278Z] + cat
[2020-07-07T15:58:22.279Z] + /usr/local/bin/runbld ./runbld-script
[2020-07-07T15:58:22.279Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-07-07T15:58:28.891Z] runbld>>> runbld started
[2020-07-07T15:58:28.891Z] runbld>>> 1.6.12/f45d832f2ba0aa2722ab4ec1fda8ad140f027f8b
[2020-07-07T15:58:30.811Z] runbld>>> The following profiles matched the job 'Beats/beats-beats-mbp/PR-19699' in order of occurrence in the config (last value wins).
[2020-07-07T15:58:32.196Z] runbld>>> Debug logging enabled.
[2020-07-07T15:58:32.196Z] runbld>>> Storing result
[2020-07-07T15:58:32.196Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-07-07T15:58:32.196Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200707155831-87FDCD79
[2020-07-07T15:58:32.196Z] runbld>>> Adding system facts.
[2020-07-07T15:58:33.145Z] runbld>>> Adding vcs info for the latest commit:  bf64e8fa7755f12d681ad908726c87bb5faa5b39
[2020-07-07T15:58:33.145Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-07-07T15:58:33.145Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-07-07T15:58:33.407Z] Processing JUnit reports with runbld...
[2020-07-07T15:58:33.407Z] + echo 'Processing JUnit reports with runbld...'
[2020-07-07T15:58:33.671Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-07-07T15:58:33.671Z] runbld>>> DURATION: 23ms
[2020-07-07T15:58:33.671Z] runbld>>> STDOUT: 40 bytes
[2020-07-07T15:58:33.671Z] runbld>>> STDERR: 49 bytes
[2020-07-07T15:58:33.671Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-07-07T15:58:33.671Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats
[2020-07-07T15:58:35.066Z] runbld>>> Storing build metadata: 
[2020-07-07T15:58:35.066Z] runbld>>> Adding test report.
[2020-07-07T15:58:35.066Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats
[2020-07-07T15:58:35.637Z] runbld>>> Found 32 test output files
[2020-07-07T15:58:35.899Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-istio.xml
[2020-07-07T15:58:35.899Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-tomcat.xml
[2020-07-07T15:58:35.899Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-iis.xml
[2020-07-07T15:58:35.899Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-openmetrics.xml
[2020-07-07T15:58:35.899Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-activemq.xml
[2020-07-07T15:58:36.472Z] runbld>>> Test output logs contained: Errors: 0 Failures: 1 Tests: 3635 Skipped: 597
[2020-07-07T15:58:36.733Z] runbld>>> Storing result
[2020-07-07T15:58:36.733Z] runbld>>> FAILURES: 1
[2020-07-07T15:58:36.994Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-07-07T15:58:36.994Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200707155831-87FDCD79
[2020-07-07T15:58:37.255Z] runbld>>> Email notification disabled by environment variable.
[2020-07-07T15:58:37.255Z] runbld>>> Slack notification disabled by environment variable.
[2020-07-07T15:58:42.806Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-19699
[2020-07-07T15:58:42.908Z] [INFO] getVaultSecret: Getting secrets
[2020-07-07T15:58:42.973Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-07-07T15:58:43.737Z] + chmod 755 generate-build-data.sh
[2020-07-07T15:58:43.737Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-19699/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-19699/runs/3 FAILURE 3823545
[2020-07-07T15:58:43.737Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-19699/runs/3/steps/?limit=10000 -o steps-info.json
[2020-07-07T15:58:44.648Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-19699/runs/3/tests/?status=FAILED -o tests-errors.json
[2020-07-07T15:58:44.898Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-19699/runs/3/log/ -o pipeline-log.txt

Member

@jsoriano jsoriano left a comment

Interesting solution to this problem 👍 I have added some small suggestions.

CHANGELOG.next.asciidoc (outdated)
metricbeat/helper/prometheus/metric.go (outdated)
metricbeat/module/kubernetes/scheduler/scheduler.go (outdated)
@ChrsMark ChrsMark changed the title Fix k8s scheduler compatability issue Fix k8s scheduler compatibility issue Jul 7, 2020
Signed-off-by: chrismark <[email protected]>
Member

@jsoriano jsoriano left a comment

Thanks!

@exekias exekias added the needs_backport PR is waiting to be backported to other branches. label Jul 7, 2020
@exekias
Contributor

exekias commented Jul 8, 2020

thank you for fixing this @ChrsMark !

@ChrsMark ChrsMark merged commit aa60a58 into elastic:master Jul 8, 2020
ChrsMark added a commit to ChrsMark/beats that referenced this pull request Jul 8, 2020
@ChrsMark ChrsMark removed the needs_backport PR is waiting to be backported to other branches. label Jul 8, 2020
ChrsMark added a commit that referenced this pull request Jul 8, 2020
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
Labels
bug review Team:Platforms Label for the Integrations - Platforms team v7.9.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

k8s scheduler broken mapping
5 participants