Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jenkins library to generate buildInfo.yml #2473

Merged
merged 6 commits into from
Aug 25, 2022

Conversation

zelinh
Copy link
Member

@zelinh zelinh commented Aug 17, 2022

Signed-off-by: Zelin Hao [email protected]

Description

Add the new jenkins library to generate buildInfo.yml used for our self-service dashboards.

Issues Resolved

#2410

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@zelinh zelinh self-assigned this Aug 17, 2022
@zelinh zelinh requested a review from a team as a code owner August 17, 2022 18:29
Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels very verbose and complicated.

From what I understand you're trying to save status of the build. So maybe the library function should be status, and it should stash/unstash a build.yml, then update it? I also believe things like "current stage" can be obtained from ENV and may not need to be passed around, which should cut the logic here by half.

buildInfoYaml(
componentName: COMPONENT_NAME,
inputManifest: "manifests/$INPUT_MANIFEST",
outputFile: "buildInfo.yml",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename this to something not containing camelCase, we will land on case-insensitive file systems. Maybe just build.yml?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking the main purpose of this yaml file is to provide the build information (e.g. each stage status, start time & duration, the job ids of triggering tests) from each workflow. build.yml maybe a little confusing with the build manifest. WDYT @dblock

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this all already provided by exporting a jenkins job history?
https://stackoverflow.com/questions/52866480/export-build-history-of-jenkins-job

unstash "buildInfo_manifest"
buildInfoYaml(
componentName: COMPONENT_NAME,
inputManifest: "buildInfo.yml",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Input = output?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dblock This is actually taking buildInfo.yml as input since we initialize it in the previous stage.

inputManifest.results.duration = currentBuild.duration
inputManifest.results.startTimestamp = currentBuild.startTimeInMillis
}
writeYaml(file: outputFile, data: inputManifest, overwrite: true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This set of ifs feels very hard-coded. Use args.stage to identify the key that needs to be updated and set it to status.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to refractor these sets to make it more straightforward.

@zelinh
Copy link
Member Author

zelinh commented Aug 17, 2022

I also believe things like "current stage" can be obtained from ENV and may not need to be passed around, which should cut the logic here by half

The stage parameter passing into the function will also be used as the name variable in the buildInfo.yml file. That why I wish to name it instead of just using the stage name from the distribution workflow since a lot of them are more like a sentence.

@zelinh
Copy link
Member Author

zelinh commented Aug 17, 2022

Had some offline discussion with @dblock , I will be make following changes:

  • job.yml will be more related to what this final result yaml file contains and will replace my current name BuildInfo.yml.
  • Current I checkout the commits for each components when I initialized the yaml file, this will be wrong if any plugin commits anything into their repo. We should be getting these commit IDs from the assemble manifest (e.g. this sample assemble manifest).
  • Current buildInfoYaml jenkins method contains too much hardcoded structure and could be improved.
  • Future enhancements could be made to these entire workflow similar to https://github.com/opensearch-project/opensearch-build/blob/main/src/jenkins/Messages.groovy

@codecov-commenter
Copy link

codecov-commenter commented Aug 18, 2022

Codecov Report

Merging #2473 (8aac754) into main (856b691) will increase coverage by 0.14%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##               main    #2473      +/-   ##
============================================
+ Coverage     94.63%   94.77%   +0.14%     
============================================
  Files           211      162      -49     
  Lines          4322     4269      -53     
  Branches         29       19      -10     
============================================
- Hits           4090     4046      -44     
+ Misses          226      223       -3     
+ Partials          6        0       -6     
Impacted Files Coverage Δ
...ests/jenkins/jobs/AssembleManifest_tar_Jenkinsfile
...jobs/PromoteArtifactsQualifier_actions_Jenkinsfile
tests/jenkins/jobs/DownloadFromS3_Jenkinsfile
...jenkins/jobs/PromoteArtifactsQualifier_Jenkinsfile
tests/jenkins/jobs/Messages_Jenkinsfile
tests/jenkins/jobs/InputManifest_Jenkinsfile
...tifactsQualifier_OpenSearch_Dashboards_Jenkinsfile
src/jenkins/BundleManifest.groovy
src/jenkins/InputManifest.groovy
tests/jenkins/jobs/BuildYumRepo_Jenkinsfile
... and 49 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Signed-off-by: Zelin Hao <[email protected]>
Signed-off-by: Zelin Hao <[email protected]>
}
echo (componentsList.toString())

if (args.stage == "START") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The start and complete should just be separate functions. Like buildInfoCreate, buildInfoUpdate and buildInfoUpload.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After I discussed with @peterzhuamazon, we cut the the scope of this PR to exclude the status for individual stages. I think having separate groovy function can make it more complicate and hard to track.

Signed-off-by: Zelin Hao <[email protected]>
Comment on lines +39 to +46
dir(component.name) {
checkout([$class: 'GitSCM', branches: [[name: component.ref]],
userRemoteConfigs: [[url: component.repository]]])
def commitID = sh(
script: "git rev-parse HEAD",
returnStdout: true
).trim()
component.ref = commitID
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should put this to the start and the complete stage.
If failed, we still have a commitid to check.
If complete, update the commitid again just in case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, i realized this part is in the post always stage. This would be executed no matter the job fails or succeeds.

Signed-off-by: Zelin Hao <[email protected]>
dblock
dblock previously requested changes Aug 24, 2022
Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The finalize stage produces wrong commit IDs, so I think that warrants a must have.

}
echo (componentsList.toString())

if (args.stage == "INITIALIZE_STAGE") {
Copy link
Member

@dblock dblock Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still find this big state machine quite problematic. I won't block this PR for this, but I think the biggest bug here is that your workflow will silently produce an incorrect YML if you forget to call the function with INITIALIZE_STAGE.

I would extract the initialization into its own function in which you do not unstash an existing build info and remove the catch in unstashing it - that should never happen unless there's a bug in the workflow.

sourceyml.results.startTimestamp = currentBuild.startTimeInMillis
} else if (args.stage == "FINALIZE_STAGE") {
sourceyml.components.each { component ->
if (componentsList.contains(component.name)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value obtained here will be wrong. Let's say your ref is 2.x, it's almost certain that there will be a new commit between the time you build and the time you run this code. We already produce the commit ID in build output manifest (e.g. here), use that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually what i was thinking is this build manifest or bundle manifest would only output if the build or assemble succeed. If the building stages failed before, we won't have this certain manifest. The scope of this PR is to display the head commit of each components no matter the job succeed or fail and the commit ID it contains will serve for informative purpose in the dashboard.

Copy link
Member

@dblock dblock Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use --lock, https://github.com/opensearch-project/opensearch-build/tree/main/src/build_workflow#avoiding-rebuilds.

./build.sh --lock $MANIFEST # generates opensearch-1.2.0.yml.lock

The refs in this manifest are commit IDs instead of any refs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after i going through, this looks like different than what i expect.

  • running ./build.sh --lock manifests/2.2.0/opensearch-2.2.0.yml will not actually build & generate the lock file. it would only generate the lock file and return.
    if args.ref_manifest:
    manifest = manifest.stable()
    if os.path.exists(args.ref_manifest):
    if manifest == InputManifest.from_path(args.ref_manifest):
    logging.info(f"No changes since {args.ref_manifest}")
    else:
    logging.info(f"Updating {args.ref_manifest}")
    manifest.to_file(args.ref_manifest)
    else:
    logging.info(f"Creating {args.ref_manifest}")
    manifest.to_file(args.ref_manifest)
    return 0
  • If we then adding this line separate from the actual process of build & assemble , its logic is pretty much the same as what I was doing in my PR:
    def __stabilize__(self) -> None:
    ref, name = GitRepository.stable_ref(self.repository, self.ref)
    logging.info(f"Updating ref for {self.repository} from {self.ref} to {ref} ({name})")
    self.ref = ref
    and still have the same concern of wrong data if people push when building.
  • I think the best approach would be adding this lock process into the actual building process instead of simply returning after generating the lock file. While I think that’s out of the scope of this PR. I could raise an issue to for requesting this feature. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that’s better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dblock Are you ok with merging this PR and opening a new issue to address the gaps with future PR? Currently this is created to address a POC but we will definitely enhance it further when we productionize it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with merging this PR. @zelinh please open an issue describing your suggestion of adding the lock process into the actual building process, and consuming the stable manifest instead of this code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dblock Yes. Sure I will create an issue for enhancement feature request. We will be working on the improvement. Thanks.

@dblock dblock dismissed their stale review August 25, 2022 00:58

I think it’s unfair for me to ask to rewrite this whole PR.

@zelinh zelinh merged commit c67bdf9 into opensearch-project:main Aug 25, 2022
@zelinh zelinh deleted the buildinfo-yaml branch August 25, 2022 18:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants