Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not getting all logs from build #861

Open
BradyShober opened this issue Jun 7, 2024 · 14 comments
Open

Not getting all logs from build #861

BradyShober opened this issue Jun 7, 2024 · 14 comments
Labels
backend/elastic Issue specific to the Elastic backend bug Something isn't working

Comments

@BradyShober
Copy link

Jenkins and plugins versions report

Environment
Jenkins: 2.440.3
OS: Linux - 5.10.216-204.855.amzn2.x86_64
Java: 17.0.10 - Eclipse Adoptium (OpenJDK 64-Bit Server VM)
---
amazon-ecs:1.49
analysis-model-api:12.1.0
ant:497.v94e7d9fffa_b_9
antisamy-markup-formatter:162.v0e6ec0fcfcf6
apache-httpcomponents-client-4-api:4.5.14-208.v438351942757
artifact-manager-s3:843.v87eb_5414d596
artifactory:4.0.6
asm-api:9.7-33.v4d23ef79fcc8
atlassian-bitbucket-server-integration:4.0.0
authentication-tokens:1.53.v1c90fd9191a_b_
aws-credentials:231.v08a_59f17d742
aws-global-configuration:130.v35b_7b_96f53c3
aws-java-sdk:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-api-gateway:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-autoscaling:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-cloudformation:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-cloudfront:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-codebuild:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-codedeploy:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-ec2:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-ecr:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-ecs:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-efs:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-elasticbeanstalk:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-elasticloadbalancingv2:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-iam:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-kinesis:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-lambda:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-logs:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-minimal:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-organizations:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-secretsmanager:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-sns:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-sqs:1.12.696-451.v0651a_da_9ca_ec
aws-java-sdk-ssm:1.12.696-451.v0651a_da_9ca_ec
aws-secrets-manager-credentials-provider:1.214.va_0a_d8268d068
aws-secrets-manager-secret-source:1.72.v61781b_35c542
bitbucket:241.v6d24a_57f9359
blueocean:1.27.9
blueocean-autofavorite:1.2.5
blueocean-bitbucket-pipeline:1.27.9
blueocean-commons:1.27.9
blueocean-config:1.27.9
blueocean-core-js:1.27.9
blueocean-dashboard:1.27.9
blueocean-display-url:2.4.2
blueocean-events:1.27.9
blueocean-git-pipeline:1.27.9
blueocean-github-pipeline:1.27.9
blueocean-i18n:1.27.9
blueocean-jira:1.27.9
blueocean-jwt:1.27.9
blueocean-personalization:1.27.9
blueocean-pipeline-api-impl:1.27.9
blueocean-pipeline-editor:1.27.9
blueocean-pipeline-scm-api:1.27.9
blueocean-rest:1.27.9
blueocean-rest-impl:1.27.9
blueocean-web:1.27.9
bootstrap5-api:5.3.3-1
bouncycastle-api:2.30.1.77-225.v26ea_c9455fd9
branch-api:2.1152.v6f101e97dd77
build-name-setter:2.4.2
build-timeout:1.32
caffeine-api:3.1.8-133.v17b_1ff2e0599
checks-api:2.2.0
cloudbees-bitbucket-branch-source:883.v041fa_695e9c2
cloudbees-folder:6.858.v898218f3609d
command-launcher:107.v773860566e2e
commons-lang3-api:3.13.0-62.v7d18e55f51e2
commons-text-api:1.11.0-95.v22a_d30ee5d36
config-file-provider:968.ve1ca_eb_913f8c
configuration-as-code:1775.v810dc950b_514
copyartifact:722.v0662a_9b_e22a_c
credentials:1337.v60b_d7b_c7b_c9f
credentials-binding:657.v2b_19db_7d6e6d
custom-tools-plugin:0.8
data-tables-api:2.0.3-1
display-url-api:2.204.vf6fddd8a_8b_e9
docker-commons:439.va_3cb_0a_6a_fb_29
docker-workflow:572.v950f58993843
dotnet-sdk:1.4.0
dtkit-api:3.0.2
durable-task:550.v0930093c4b_a_6
ec2:1648.vf3d852e00486
ec2-fleet:3.0.2
echarts-api:5.5.0-1
email-ext:2.105
emailext-template:1.5
embeddable-build-status:467.v4a_954796e45d
extended-choice-parameter:382.v5697b_32134e8
extended-read-permission:53.v6499940139e5
favorite:2.208.v91d65b_7792a_c
font-awesome-api:6.5.1-3
forensics-api:2.3.0
git:5.2.1
git-client:4.7.0
github:1.38.0
github-api:1.318-461.v7a_c09c9fa_d63
github-branch-source:1787.v8b_8cd49a_f8f1
gradle:2.11
gson-api:2.10.1-15.v0d99f670e0a_7
handy-uri-templates-2-api:2.1.8-30.v7e777411b_148
htmlpublisher:1.33
http_request:1.18
instance-identity:185.v303dc7c645f9
ionicons-api:70.v2959a_b_74e3cf
jackson2-api:2.17.0-379.v02de8ec9f64c
jakarta-activation-api:2.1.3-1
jakarta-mail-api:2.1.3-1
javadoc:243.vb_b_503b_b_45537
javax-activation-api:1.2.0-6
javax-mail-api:1.6.2-9
jaxb:2.3.9-1
jenkins-design-language:1.27.12
jersey2-api:2.42-147.va_28a_44603b_d5
jira:3.13
jjwt-api:0.11.5-112.ve82dfb_224b_a_d
job-dsl:1.87
jobcacher:481.v15f51ca_4c6b_7
joda-time-api:2.12.7-29.v5a_b_e3a_82269a_
jquery3-api:3.7.1-2
jsch:0.2.16-86.v42e010d9484b_
json-api:20240303-41.v94e11e6de726
json-path-api:2.9.0-58.v62e3e85b_a_655
junit:1265.v65b_14fa_f12f0
mailer:472.vf7c289a_4b_420
matrix-auth:3.2.2
matrix-project:822.824.v14451b_c0fd42
maven-plugin:3.23
mercurial:1260.vdfb_723cdcc81
metrics:4.2.21-451.vd51df8df52ec
mina-sshd-api-common:2.12.1-101.v85b_e08b_780dd
mina-sshd-api-core:2.12.1-101.v85b_e08b_780dd
msbuild:1.31
mstest:1.0.5
node-iterator-api:55.v3b_77d4032326
nodejs:1.6.1
okhttp-api:4.11.0-172.vda_da_1feeb_c6e
opentelemetry:3.1138.v80fc844ed246
pam-auth:1.10
parameterized-trigger:787.v665fcf2a_830b_
pipeline-aws:1.45
pipeline-build-step:540.vb_e8849e1a_b_d8
pipeline-graph-analysis:216.vfd8b_ece330ca_
pipeline-graph-view:241.v3d46398a_328c
pipeline-groovy-lib:704.vc58b_8890a_384
pipeline-input-step:495.ve9c153f6067b_
pipeline-milestone-step:119.vdfdc43fc3b_9a_
pipeline-model-api:2.2198.v41dd8ef6dd56
pipeline-model-definition:2.2198.v41dd8ef6dd56
pipeline-model-extensions:2.2198.v41dd8ef6dd56
pipeline-rest-api:2.34
pipeline-stage-step:312.v8cd10304c27a_
pipeline-stage-tags-metadata:2.2198.v41dd8ef6dd56
pipeline-stage-view:2.34
pipeline-utility-steps:2.16.2
plain-credentials:179.vc5cb_98f6db_38
plugin-util-api:4.1.0
powershell:2.1
prism-api:1.29.0-13
pubsub-light:1.18
resource-disposer:0.23
role-strategy:717.v6a_69a_fe98974
run-condition:1.7
saml:4.464.vea_cb_75d7f5e0
scm-api:689.v237b_6d3a_ef7f
script-security:1335.vf07d9ce377a_e
snakeyaml-api:2.2-111.vc6598e30cc65
sonar:2.17.2
sse-gateway:1.26
ssh-agent:367.vf9076cd4ee21
ssh-credentials:337.v395d2403ccd4
ssh-slaves:2.948.vb_8050d697fec
sshd:3.322.v159e91f6a_550
structs:337.v1b_04ea_4df7c8
timestamper:1.26
token-macro:400.v35420b_922dcb_
trilead-api:2.142.v748523a_76693
variant:60.v7290fc0eb_b_cd
workflow-aggregator:596.v8c21c963d92d
workflow-api:1291.v51fd2a_625da_7
workflow-basic-steps:1049.v257a_e6b_30fb_d
workflow-cps:3894.vd0f0248b_a_fc4
workflow-durable-task-step:1336.v768003e07199
workflow-job:1400.v7fd111b_ec82f
workflow-multibranch:770.v1a_d0708dd1f6
workflow-scm-step:427.v4ca_6512e7df1
workflow-step-api:657.v03b_e8115821b_
workflow-support:896.v175a_a_9c5b_78f
ws-cleanup:0.45
xunit:3.1.4

What Operating System are you using (both controller, and any agents involved in the problem)?

Controller is running in a container based on jenkins/jenkins:2.440.3-lts-alpine
Agents are Windows Server 2022 and Debian 11

Reproduction steps

  1. Using the EC2 fleet plugin (https://plugins.jenkins.io/ec2-fleet/) define a Cloud using an Auto Scaling Group
  2. Do a build that runs on that Cloud
  3. Configure OpenTelemetry to push logs to Elastic and be viewable in Jenkins
  4. Run the same build again

Expected Results

We would expect to see the same logs within Jenkins as we did before

Actual Results

The logs getting pulled in have essentially no useful content. If you use the "View logs in" link in the console output, you see the same logs as it is pulling in, however I have found that if from the build you use the "View pipeline with" link that I am able to see the logs I expect to see.
An example of what is getting pulled in:

13:03:49  [Pipeline] Start of Pipeline
13:03:49  [Pipeline] node
13:09:34  [Pipeline] {
13:09:34  [Pipeline] withEnv
13:09:35  [Pipeline] {
13:09:35  [Pipeline] stage
13:09:35  [Pipeline] { (Prepare)
13:09:35  [Pipeline] cleanWs
13:09:35  [Pipeline] checkout
13:09:35  [Pipeline] withSonarQubeEnv
13:09:36  [Pipeline] {
13:09:36  [Pipeline] tool
13:09:37  [Pipeline] bat
13:09:50  [Pipeline] }
13:09:50  [Pipeline] // withSonarQubeEnv
13:09:50  [Pipeline] }
13:09:50  [Pipeline] // stage
13:09:50  [Pipeline] stage
13:09:50  [Pipeline] { (Restore)
13:09:50  [Pipeline] bat
13:10:03  [Pipeline] }
13:10:03  [Pipeline] // stage
13:10:03  [Pipeline] stage
13:10:03  [Pipeline] { (Build)
13:10:03  Stage "Build" skipped due to earlier failure(s)
13:10:03  [Pipeline] getContext
13:10:03  [Pipeline] }
13:10:03  [Pipeline] // stage
13:10:04  [Pipeline] stage
13:10:04  [Pipeline] { (Test)
13:10:04  Stage "Test" skipped due to earlier failure(s)
13:10:04  [Pipeline] getContext
13:10:04  [Pipeline] }
13:10:04  [Pipeline] // stage
13:10:04  [Pipeline] stage
13:10:04  [Pipeline] { (Publish)
13:10:04  Stage "Publish" skipped due to earlier failure(s)
13:10:04  [Pipeline] getContext
13:10:04  [Pipeline] }
13:10:05  [Pipeline] // stage
13:10:05  [Pipeline] }
13:10:05  [Pipeline] // withEnv
13:10:05  [Pipeline] }
13:10:05  [Pipeline] // node
13:10:05  [Pipeline] End of Pipeline

Anything else?

No response

Are you interested in contributing a fix?

No response

@BradyShober BradyShober added the bug Something isn't working label Jun 7, 2024
@cyrille-leclerc cyrille-leclerc added the backend/elastic Issue specific to the Elastic backend label Jun 13, 2024
@cyrille-leclerc
Copy link
Contributor

Hello @BradyShober,
Thanks for using the Jenkins OpenTelemetry plugin, we are sorry for the pain point you are encountering.
Is it correct that you only view in Elastic the build logs produced by the Jenkins controller (ie the logs produced by your Jenkinsfile groovy code) but that you don't see the build logs produced on the Jenkins build agent (logs of your build scripts...).
If so, please keep in mind that the Jenkins build agents directly send their OpenTelemetry data (including logs) to the OTLP endpoint without going through the Jenkins controller so the OTLP endpoint MUST be reachable from you build agents, EC2 Fleet build agents in your case.
It's likely that the OpenTelemetry endpoint you defined in the Jenkins Controller config is not reachable by your EC2 Fleet agent, either because the OTLP endpoint hostname is not resolved by the DNS of your agents or due to firewalling reasons.

Does it make sense?

@BradyShober
Copy link
Author

BradyShober commented Jun 13, 2024

Hi @cyrille-leclerc,
Allow me to provide some clarifying information along with some screenshots. I have a very simple example pipeline

pipeline {
    agent {
        label 'linux-small'
    }

    stages {
        stage('Hello') {
            steps {
                sh 'java --version'
            }
        }
    }
}

When we choose to not ship logs to Elastic, this is the log output of this pipeline:
image

When we enable log shipping to Elastic, this is the output:
image

If I use the link in the log outputs to Kibana, I see matching output:
image

However, I can find the expected output by using the link on the build page and going to the Logs tab
image

@kuisathaverat
Copy link
Contributor

As Cyrille commented, you only see the logs that come from the Jenkins Controller. That means your build Agents do not have access to the Backend, so they can not send the logs generated by the build Agent. So check that you can reach your log backend from the Build agents, then give them the rules you need in your infrastructure to allow them to reach the Backend.

@BradyShober
Copy link
Author

@kuisathaverat I hadn't explicitly mentioned it in my previous reply, but I did verify connectivity from the agent to the backend. Also if you refer to my last point in my previous reply, with some digging I can locate the logs I expect to see in elastic, something just must not be getting set appropriately to tie them to the build to get them pulled in correctly.

@kuisathaverat
Copy link
Contributor

I see, cheking the latest screenshot, the timestamp of the record in backend is in the past compared with the timestamp of the Jenkins record. There are 2 msec of difference but only in the agent.

10:09:13.732    [2024-06-13T15:09.13.732Z] [pipeline]  sh   
10:09:13.881    [2024-06-13T15:09:13.883Z] + java --version

I do not think it is a problem, I just noticed it. I wonder if it could be related to the range of time we set in the link. Your pipeline is simple and it takes only a second to run the whole pipeline. Could you try to add a sh('sleep 30') to your pipeline and try again? Also, when you go to the link, Could verify the time range selected to confirm match the pipeline execution?

@BradyShober
Copy link
Author

@kuisathaverat I should mention that yes while my sample pipeline is very quick, that is just to illustrate the issue with a simple example and this is behavior that was observed on actual builds that do take more time. I did go ahead and add the sleep and run the test again.
Once again, following the link on the console output page I do not see the output I am expecting, and I see that the timeframe this sets is to look back 40 days
image

If I go looking with the other link on the main build page and go to the Logs tab I do see the output
image

I also checked the attributes of the logs in elastic to see if anything stood out to me, the main differences I see that may be making a difference is that the ones that are not getting pulled in do not have span.id or trace.id attributes.

@kuisathaverat
Copy link
Contributor

kuisathaverat commented Jul 5, 2024

the main differences I see that may be making a difference is that the ones that are not getting pulled in do not have span.id or trace.id attributes.

Does this mean it does not happen in all job builds? I will try to replicate it

@kuisathaverat
Copy link
Contributor

I have tested with the demo, and the following environment works as expected:

Jenkins Core 2.452.2 [LTS]
Opentelemetry plugin 3.1310.vfe6b_821a_4ed2
Elastic Stack 8.10.1/8.14.2

Only when the agent loses connectivity with the backend, those logs are not sent. All the fields are filled in any case I did not find a way that the span.id or trace.id are not processed.

@cyrille-leclerc
Copy link
Contributor

Can you please test the connectivity from the Jenkins build agent to the OTLP endpoint:

  • Upgrade the Jenkins OTel Plugin and the Jenkins OTel API Plugin to the latest release
  • In Jenkins admin GUI, navigate to "Manage Jenkins / Nodes"
  • Click on the build agent node to verify, for example "jnlp_agent" in the screenshot example below
  • Navigate to the "Script Console" screen
  • Execute the test command
io.jenkins.plugins.opentelemetry.api.util.OpenTelemetryUtils.testLogRecordExporter()
  • The result should look like
Result: testLogRecordExporter(): result(success: truedone: true), 
    OtlpGrpcLogRecordExporter{
        exporterName=otlp, 
        type=log, 
        endpoint=http://localhost:4317, 
        endpointPath=/opentelemetry.proto.collector.logs.v1.LogsService/Export, 
        timeoutNanos=30000000000, 
        connectTimeoutNanos=10000000000, 
        compressorEncoding=null, 
        headers=Headers{User-Agent=OBFUSCATED}, 
        memoryMode=IMMUTABLE_DATA}, 
        TestLogRecordData{
            resource=Resource{
                schemaUrl=null, 
                attributes={host.arch="aarch64", 
                host.name="cyrille-le-clerc-macbook.local", 
                jenkins.computer.name="jnlp_agent", 
                jenkins.opentelemetry.plugin.version="3.1383.v32c9f94458e3", 
                jenkins.url="http://localhost:8080/", 
                jenkins.version="2.462.2", 
                os.description="Mac OS X 14.6.1", 
                os.type="darwin", 
                process.command_line="/path/to/java --add-opens=java.base/java.io=ALL-UNNAMED agent.jar -url http://localhost:8080/ -secret *** -name jnlp_agent -workDir ~/jenkins/agent-for-lts-jenkins", 
                process.executable.path="/path/to/java", 
                process.pid=83947, 
                process.runtime.description="Homebrew OpenJDK 64-Bit Server VM 11.0.24+0", 
                process.runtime.name="OpenJDK Runtime Environment", 
                process.runtime.version="11.0.24+0", 
                service.instance.id="be802d0a442bb237465405d618d15f47", 
                service.name="jenkins", 
                service.namespace="jenkins", 
                service.version="2.462.2", 
                telemetry.sdk.language="java", 
                telemetry.sdk.name="opentelemetry", 
                telemetry.sdk.version="1.40.0"}}, 
            instrumentationScopeInfo=InstrumentationScopeInfo{
                name=io.jenkins.opentelemetry.api, 
                version=null, 
                schemaUrl=null, 
                attributes={}}, 
            timestampEpochNanos=1725704661522612000, 
            observedTimestampEpochNanos=0, 
            spanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, 
            spanId=0000000000000000, 
            traceFlags=00, 
            traceState=ArrayBasedTraceState{entries=[]}, 
            remote=false, 
            valid=false}, 
            severity=INFO, 
            severityText=INFO, 
            body=StringBody{
                asString=Test log record}, 
            attributes={}, 
            totalAttributeCount=0} -
  • Explore your logs database to find the matching log entry.
  • Unfortunately this test method doesn't raise an exception if the OTLP endpoint is unavailable

@BradyShober
Copy link
Author

I attempted this test on an agent and I was able to see the log entry in elastic.

@cyrille-leclerc
Copy link
Contributor

I attempted this test on an agent and I was able to see the log entry in elastic.

Thanks, so you saw this test message emitted by the Jenkins Agent but not some of the logs of the pipeline that executes on this agent, correct?

@BradyShober
Copy link
Author

Yes, I was able to see the test message from the agent. The behavior during pipeline executions is that I am able to see logs from the controller but not from the agent in the Jenkins UI, but I am able to find the logs from the agent in elastic.

@kuisathaverat
Copy link
Contributor

Do you see the build agent logs on builds that are more than 5 minutes finished?

@BradyShober
Copy link
Author

No, the logs do not show up after any period of time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend/elastic Issue specific to the Elastic backend bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants