Parameters are not set for jobs restarted after VMs are preempted #310

Open
mlazowik opened this issue Feb 28, 2022 · 3 comments
Labels
bug Something isn't working

Comments

@mlazowik
Jenkins and plugins versions report

Environment
Jenkins: 2.319.3
OS: Linux - 4.18.14
---
ace-editor:1.1
ansicolor:1.0.1
ant:1.13
antisamy-markup-formatter:2.5
apache-httpcomponents-client-4-api:4.5.13-1.0
authentication-tokens:1.4
blueocean:1.25.2
blueocean-autofavorite:1.2.4
blueocean-bitbucket-pipeline:1.25.2
blueocean-commons:1.25.2
blueocean-config:1.25.2
blueocean-core-js:1.25.2
blueocean-dashboard:1.25.2
blueocean-display-url:2.4.1
blueocean-events:1.25.2
blueocean-git-pipeline:1.25.2
blueocean-github-pipeline:1.25.2
blueocean-i18n:1.25.2
blueocean-jira:1.25.2
blueocean-jwt:1.25.2
blueocean-personalization:1.25.2
blueocean-pipeline-api-impl:1.25.2
blueocean-pipeline-editor:1.25.2
blueocean-pipeline-scm-api:1.25.2
blueocean-rest:1.25.2
blueocean-rest-impl:1.25.2
blueocean-web:1.25.2
bootstrap4-api:4.6.0-3
bootstrap5-api:5.1.3-3
bouncycastle-api:2.25
branch-api:2.7.0
build-name-setter:2.2.0
caffeine-api:2.9.2-29.v717aac953ff3
checks-api:1.7.2
cloudbees-bitbucket-branch-source:734.v2f848c5e6ea2
cloudbees-folder:6.16
command-launcher:1.6
conditional-buildstep:1.4.1
credentials:1055.v1346ba467ba1
credentials-binding:1.27
dark-theme:155.v497c78bbdbb3
display-url-api:2.3.5
docker-commons:1.17
docker-workflow:1.26
durable-task:493.v195aefbb0ff2
echarts-api:5.2.2-1
envinject:2.4.0
envinject-api:1.8
extended-read-permission:3.2
external-monitor-job:1.7
favorite:2.3.3
filesystem_scm:2.1
font-awesome-api:5.15.4-4
gerrit-trigger:2.35.2
git:4.10.1
git-client:3.10.0
git-server:1.10
github:1.34.1
github-api:1.301-378.v9807bd746da5
github-branch-source:2.11.3
github-organization-folder:1.6
google-compute-engine:4.3.8
google-oauth-plugin:1.0.6
google-play-android-publisher:4.2
greenballs:1.15.1
handlebars:3.0.8
handy-uri-templates-2-api:2.1.8-1.0
htmlpublisher:1.28
icon-shim:3.0.0
jackson2-api:2.13.0-230.v59243c64b0a5
javadoc:1.6
jaxb:2.3.0.1
jdk-tool:1.5
jenkins-design-language:1.25.2
jira:3.6
jjwt-api:0.11.2-9.c8b45b8bb173
job-dsl:1.77
jquery:1.12.4-1
jquery-detached:1.2.1
jquery-ui:1.0.2
jquery3-api:3.6.0-2
jsch:0.1.55.2
junit:1.53
ldap:2.7
locale:1.4
lockable-resources:2.12
mailer:1.34
mapdb-api:1.0.9.0
matrix-auth:2.6.7
matrix-project:1.19
maven-plugin:3.15.1
mercurial:2.16
metrics:4.0.2.8
momentjs:1.1.1
oauth-credentials:0.5
okhttp-api:4.9.3-105.vb96869f8ac3a
pam-auth:1.6.1
pipeline-build-step:2.15
pipeline-github-lib:1.0
pipeline-graph-analysis:1.12
pipeline-input-step:427.va6441fa17010
pipeline-milestone-step:1.3.2
pipeline-model-api:1.9.3
pipeline-model-declarative-agent:1.1.1
pipeline-model-definition:1.9.3
pipeline-model-extensions:1.9.3
pipeline-rest-api:2.19
pipeline-stage-step:2.5
pipeline-stage-tags-metadata:1.9.3
pipeline-stage-view:2.19
pipeline-utility-steps:2.11.0
plain-credentials:1.7
plugin-usage-plugin:2.1
plugin-util-api:2.7.0
popper-api:1.16.1-2
popper2-api:2.10.2-1
prometheus:2.0.10
publish-over:0.22
pubsub-light:1.16
rebuild:1.32
resource-disposer:0.16
reverse-proxy-auth-plugin:1.7.1
role-strategy:3.2.0
run-condition:1.5
scm-api:2.6.5
script-security:1118.vba21ca2e3286
slack:2.49
slave-status:1.6
snakeyaml-api:1.29.1
sse-gateway:1.24
ssh-agent:1.23
ssh-credentials:1.19
ssh-slaves:1.33.0
sshd:3.1.0
structs:308.v852b473a2b8c
theme-manager:0.6
throttle-concurrents:2.5
timestamper:1.15
token-macro:267.vcdaea6462991
translation:1.16
trilead-api:1.0.13
variant:1.4
windows-slaves:1.8
workflow-aggregator:2.6
workflow-api:1108.v57edf648f5d4
workflow-basic-steps:2.24
workflow-cps:2648.va9433432b33c
workflow-cps-global-lib:552.vd9cc05b8a2e1
workflow-durable-task-step:1107.v5dab75aaccbd
workflow-job:1145.v7f2433caa07f
workflow-multibranch:2.26
workflow-scm-step:2.13
workflow-step-api:615.vb09dac339255
workflow-support:804.vba10a18a1476
ws-cleanup:0.39

What Operating System are you using (both controller, and any agents involved in the problem)?

linux

Reproduction steps

  1. Configure a job that has parameters that are required for it to work
  2. Set it to run on preemptible VMs
  3. Wait for a VM to get preempted

Expected Results

The parameters used by the 1st run are passed to the 2nd run

Actual Results

Parameters are missing
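
As a rough illustration of the gap between the expected and actual results, here is a minimal, self-contained sketch. It is not the plugin's code and does not use the Jenkins API: `Build`, `rescheduleWithoutParameters`, and `rescheduleWithParameters` are hypothetical names, and the job name and parameter are made up for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class RescheduleSketch {
    /** Hypothetical stand-in for a build; NOT a Jenkins class. */
    static final class Build {
        final String job;
        final Map<String, String> parameters;

        Build(String job, Map<String, String> parameters) {
            this.job = job;
            // Defensive copy so a rerun cannot mutate the original run's parameters.
            this.parameters = Collections.unmodifiableMap(new LinkedHashMap<>(parameters));
        }
    }

    /** Models the reported behaviour: the rerun is queued with no parameters. */
    static Build rescheduleWithoutParameters(Build preempted) {
        return new Build(preempted.job, Collections.emptyMap());
    }

    /** Models the expected behaviour: the rerun reuses the first run's parameters. */
    static Build rescheduleWithParameters(Build preempted) {
        return new Build(preempted.job, preempted.parameters);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("BRANCH", "main"); // made-up parameter for the example
        Build first = new Build("nightly", params);

        System.out.println(rescheduleWithoutParameters(first).parameters); // prints {}
        System.out.println(rescheduleWithParameters(first).parameters);    // prints {BRANCH=main}
    }
}
```

In a real fix the parameters would presumably have to be carried over from the preempted run when the new build is queued; how the plugin would do that is not covered by this sketch.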

Anything else?

This was mentioned as one of the missing features of the restarts v1 PR: #33

Sort of related: #67, #214

@jfr06200

jfr06200 commented Aug 7, 2023

+1
This issue really needs to be tackled.

Here is the situation:

  1. Preempted jobs are automatically restarted, but without any parameters.
  2. You cannot disable the restart of preempted jobs (as mentioned here: No option to disable automatic build restart for preemtible VMs #214).

Conclusion: if your jobs are parameterized, the preemptible VMs feature is simply unusable.

@gbhat618
Contributor

gbhat618 commented Jan 5, 2025

Currently the plugin automatically reschedules the build (probably only applies to freestyle, and may not be working for pipelines) if the agent is preempted in the middle of the build. This seems like an incorrect design decision as the plugin doesn't know whether the build/specific steps are idempotent or not.

Other cloud plugins, such as the kubernetes plugin or the ec2 plugin, do not reschedule builds automatically like this.
(Example: https://github.com/jenkinsci/kubernetes-plugin?tab=readme-ov-file#retrying-after-infrastructure-outages)

Each build pipeline needs to handle the agent being removed in the middle of the build by using timeout and retry. Either the entire pipeline or a specific step should be wrapped in retry and timeout, with values chosen to suit the specific pipeline.

An example

retry(count: 2, conditions: [agent(), nonresumable()]) {
    timeout (time: 130, activity:true, unit: 'SECONDS') {
        node('gcloud') {
            sh 'date'
            echo "sleeping for 2 minutes"
            sh 'sleep 120'
            sh 'date'
            echo 'sleep done'
        }
    }
}

The plugin shouldn't treat a preempted VM as a special case, but rather handle it like any other agent outage.

@gbhat618
Contributor

gbhat618 commented Jan 5, 2025

In addition, the existing code doesn't always work either: whether it does depends on network errors and timing.
During some testing, when the VM was preempted, the code hit the exception handler in the method below, so the VM was reported as not preempted and the build got stuck.

public boolean getPreempted() {
    try {
        return preemptedFuture != null && preemptedFuture.isDone() && preemptedFuture.get();
    } catch (InterruptedException | ExecutionException e) {
        log.log(Level.WARNING, "Error when getting preempted status", e);
        return false;
    }
}

Perhaps it is better to remove the special handling of preemptible VMs altogether: simply fail the build and let the pipeline author decide on retry and timeout requirements, for the entire pipeline or for specific steps.
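
The failure mode in `getPreempted()` can be reproduced in isolation. The sketch below is a standalone illustration, not plugin code: the class name, the `Status` enum, and `getStatus` are hypothetical, and the `IOException` merely stands in for whatever network error occurred during testing. It shows that a future which completes exceptionally is reported as "not preempted" by the quoted logic, whereas a variant with an explicit "unknown" outcome would let the caller tell the two cases apart.

```java
import java.io.IOException;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class PreemptionStatusSketch {
    /** Hypothetical result type; the plugin itself only returns a boolean. */
    enum Status { PREEMPTED, NOT_PREEMPTED, PENDING, UNKNOWN }

    /** Same logic as the quoted method (logging omitted): any failure becomes false. */
    static boolean getPreempted(Future<Boolean> preemptedFuture) {
        try {
            return preemptedFuture != null && preemptedFuture.isDone() && preemptedFuture.get();
        } catch (InterruptedException | ExecutionException e) {
            return false;
        }
    }

    /** Hypothetical variant that keeps "check failed" distinct from "not preempted". */
    static Status getStatus(Future<Boolean> preemptedFuture) {
        if (preemptedFuture == null || !preemptedFuture.isDone()) {
            return Status.PENDING;
        }
        try {
            return Boolean.TRUE.equals(preemptedFuture.get()) ? Status.PREEMPTED : Status.NOT_PREEMPTED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return Status.UNKNOWN;
        } catch (ExecutionException | CancellationException e) {
            return Status.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        // Simulate the preemption check failing with a network error.
        CompletableFuture<Boolean> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IOException("simulated network error"));

        System.out.println(getPreempted(failed)); // prints false
        System.out.println(getStatus(failed));    // prints UNKNOWN
    }
}
```

What a caller should do with an unknown outcome (fail the build, for instance) remains the design question raised in the comment above.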

3 participants