arcade-validation fails when multiple asset name+version instances involved #1964

Closed
dotnet-eng-status bot opened this issue Feb 6, 2024 · 17 comments · Fixed by dotnet/arcade-validation#4183
@dotnet-eng-status

Build #20240206.4 failed

❌ : internal / dotnet-arcade-validation-official failed

Summary

Finished - Tue, 06 Feb 2024 21:33:06 GMT
Duration - 72 minutes
Requested for - DotNet Bot
Reason - batchedCI

Details

Validate Publishing

  • ❌ - [Log] - PowerShell exited with code '1'.

@garath
Member

garath commented Feb 6, 2024

From the logs:

Running 'git checkout -b val/arcade-9.0.0-beta.24106.3 System.Object[]'

Looks like a bad .ToString()?

@garath garath self-assigned this Feb 6, 2024
@garath
Member

garath commented Feb 6, 2024

FYI, @missymessa and @riarenas, this might be something that needs investigation by your epic.

It looks like, for some reason, @missymessa's 1ES PT test builds are being included with regular Arcade builds and triggering the validation pipeline. The validation pipeline sees two BAR IDs which the script cannot handle (the hash arrives as an array of two elements instead of a regular integer). In this case, the IDs are 211531 and 211527, which are published from builds 2371453 and 2371408 respectively.
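The failure mode described above, where a lookup expected to yield one value yields two, can be caught before the result is interpolated into a command line. A minimal Python sketch, assuming nothing about the real script (which is PowerShell); `expect_single` is a hypothetical helper, not part of arcade-validation:

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def expect_single(values: Sequence[T], what: str) -> T:
    """Return the only element of `values`, or raise with a readable message."""
    if len(values) != 1:
        raise ValueError(
            f"expected exactly one {what}, got {len(values)}: {list(values)}"
        )
    return values[0]


# The two BAR IDs reported in this issue.
bar_ids = [211531, 211527]
try:
    expect_single(bar_ids, "BAR build ID")
except ValueError as err:
    message = str(err)
    print(message)
```

Failing at this point gives an error that names both IDs, rather than a stringified array surfacing later in a `git checkout` call.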

I don't know why these are getting mixed, but it seems strictly related to the 1ES PT testing that's happening, so I do not think there is any FR action to take. Validation is still happening for other builds.

@garath garath removed their assignment Feb 6, 2024
@missymessa
Member

I'm glad it's failing, if that's any consolation :D

@garath
Member

garath commented Feb 6, 2024

If y'all think this is just temporary while 1ES PT is sorted, I'm fine closing this and future similar issues. Otherwise handle as you see fit.

@missymessa
Member

From testing the 1ES PTs. Closing.

@riarenas
Member

riarenas commented Feb 6, 2024

But the arcade-validation official pipeline is blocked, right? So we can't do promotions? I don't think we can just close this. I think we should find out why we're picking up builds that are not in the channel we expect.

@riarenas riarenas reopened this Feb 6, 2024
@riarenas
Member

riarenas commented Feb 6, 2024

BAR build 211527 is a good build that we should be validating against, as that's the build that opened the dependency update PR, was merged, and then ran the official build.

BAR build 211531 is a bad build that we should not be validating against. This build is not in any channels, so it shouldn't be getting picked up by anything. That would mean there's a bug somewhere in the publishing validation if we're picking up dev builds for the official validation.

@missymessa
Member

I don't think it's blocked. Looks like it picked up some other builds I tried earlier and when the arcade-official-ci pipeline produced newer packages, Arcade Validation was successful.

@riarenas
Member

riarenas commented Feb 6, 2024

This is the latest run of the pipeline: https://dev.azure.com/dnceng/internal/_build/results?buildId=2371548&view=results. It's this build. It failed during the validation job, so it didn't promote the arcade build to the latest channel. I queued a retry, so hopefully it's indeed unblocked, but I'm not seeing any successful existing builds.

@riarenas
Member

riarenas commented Feb 6, 2024

There's nothing special about the incorrect build that is being picked up, so I don't think this is related to the templates effort. The bug seems to be that our scripts are not choosing the correct builds to validate against.

darc get-build --id 211531
Repository:    https://dev.azure.com/dnceng/internal/_git/dotnet-arcade
Branch:        refs/heads/missymessa-template-testing
Commit:        2e00b07782a93ae53358d23200e3113e4d1f50a4
Build Number:  20240206.3
Date Produced: 2/6/2024 11:26 AM
Build Link:    https://dev.azure.com/dnceng/internal/_build/results?buildId=2371453
AzDO Build Id: 2371453
BAR Build Id:  211531
Released:      False
Channels:

This shows that the build is not in the validation channel, and we should only validate against builds from the validation channel, such as

darc get-build --id 211527
Repository:    https://github.com/dotnet/arcade
Branch:        main
Commit:        fe491dfefad0b8e2da73f395b2b8d9cf72a54c9e
Build Number:  20240206.3
Date Produced: 2/6/2024 10:38 AM
Build Link:    https://dev.azure.com/dnceng/internal/_build/results?buildId=2371408
AzDO Build Id: 2371408
BAR Build Id:  211527
Released:      False
Channels:
- .NET Eng - Validation

which is in the validation channel.
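The selection rule described here (only consider builds that belong to the validation channel) can be sketched in Python as follows. The `Build` record and `builds_in_channel` function are hypothetical names for illustration; the data is taken from the two `darc get-build` results above.

```python
from dataclasses import dataclass, field
from typing import List

VALIDATION_CHANNEL = ".NET Eng - Validation"


@dataclass
class Build:
    bar_id: int
    branch: str
    channels: List[str] = field(default_factory=list)


def builds_in_channel(builds: List[Build], channel: str) -> List[Build]:
    """Keep only the builds that have been assigned to `channel`."""
    return [b for b in builds if channel in b.channels]


builds = [
    Build(211531, "refs/heads/missymessa-template-testing"),  # no channels
    Build(211527, "main", [VALIDATION_CHANNEL]),
]
selected = builds_in_channel(builds, VALIDATION_CHANNEL)
print([b.bar_id for b in selected])  # prints [211527]
```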

@riarenas
Member

riarenas commented Feb 6, 2024

Second attempt of this build failed: #1965

I'm attempting a clean build of main as a last resort. If that fails, Arcade promotions are currently blocked on this. https://dev.azure.com/dnceng/internal/_build/results?buildId=2371776&view=results

@riarenas
Member

riarenas commented Feb 6, 2024

ahhh, @garath showed me these older builds that are failing with the same symptom: https://dev.azure.com/dnceng/internal/_build/results?buildId=2368985&view=results and @missymessa is totally right that we've had other successful builds since.

If running a clean build works, or even if we understand what triggers the problem so we can avoid it when we really need to promote, this is less of an FR issue, and just becomes a bug to file and eventually squash.

If we don't have a clear mechanism to unblock the pipeline, and Arcade promotions become blocked, that's when I think FR should dig into this job and the darc queries it's using to determine which builds to pick up.

@garath
Member

garath commented Feb 7, 2024

Manual retry failed, so arcade releases are at risk of being blocked. Taking back for FR.

@garath
Member

garath commented Feb 7, 2024

I think I see what's happening. The arcade-validation code is not precisely defining the BAR coordinates for the asset being validated.

  1. The 1ES PT builds are publishing an artifact with the same name and version as the one being validated in arcade-validation.
  2. None of the calls to darc in arcade-validation specify a channel, only the name and version of the asset.
  3. Thus instead of getting details for a single asset, it's finding two.

The solution is to specify the ".NET Eng - Validation" channel as the source for asset discovery.

I'll put together a PR and run tests.

@garath
Member

garath commented Feb 7, 2024

In case I close the terminal, here's an example of the command and result that the arcade-validation script is seeing:

PS arcade-validation> darc get-asset --name "Microsoft.DotNet.Arcade.Sdk" --version "9.0.0-beta.24106.3"
Looking up assets with name 'Microsoft.DotNet.Arcade.Sdk' and version '9.0.0-beta.24106.3' in the last 30 days
Microsoft.DotNet.Arcade.Sdk @ 9.0.0-beta.24106.3
Repository:    https://dev.azure.com/dnceng/internal/_git/dotnet-arcade
Branch:        refs/heads/missymessa-template-testing
Commit:        2e00b07782a93ae53358d23200e3113e4d1f50a4
Build Number:  20240206.3
Date Produced: 2/6/2024 11:26 AM
Build Link:    https://dev.azure.com/dnceng/internal/_build/results?buildId=2371453
AzDO Build Id: 2371453
BAR Build Id:  211531
Released:      False
Channels:
Locations:
- https://dev.azure.com/dnceng/internal/_apis/build/builds/2371453/artifacts (Container)

Microsoft.DotNet.Arcade.Sdk @ 9.0.0-beta.24106.3
Repository:    https://github.com/dotnet/arcade
Branch:        main
Commit:        fe491dfefad0b8e2da73f395b2b8d9cf72a54c9e
Build Number:  20240206.3
Date Produced: 2/6/2024 10:38 AM
Build Link:    https://dev.azure.com/dnceng/internal/_build/results?buildId=2371408
AzDO Build Id: 2371408
BAR Build Id:  211527
Released:      False
Channels:
- .NET Eng - Validation
Locations:
- https://dev.azure.com/dnceng/internal/_apis/build/builds/2371408/artifacts (Container)
- https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-eng/nuget/v3/index.json (NugetFeed)
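For illustration, text shaped like the output above can be parsed into records and then narrowed to the validation channel. This is a rough Python sketch, not the fix that went into arcade-validation: `parse_assets` is a hypothetical helper, and it assumes the plain-text layout shown here (blank-line-separated blocks, `Key: value` lines, and `- item` lines under an empty `Key:`).

```python
from typing import Dict, List


def parse_assets(text: str) -> List[Dict[str, object]]:
    """Parse blank-line-separated asset blocks into dicts.

    "Key: value" lines become string entries; "- item" lines are appended
    to the list opened by the most recent "Key:" line with an empty value.
    Lines without a colon (such as the "name @ version" header) are skipped.
    """
    assets = []
    for block in text.strip().split("\n\n"):
        asset: Dict[str, object] = {}
        current_list = None
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("- "):
                if current_list is not None:
                    current_list.append(line[2:])
            elif ":" in line:
                key, _, value = line.partition(":")
                value = value.strip()
                if value:
                    asset[key] = value
                    current_list = None
                else:
                    current_list = []
                    asset[key] = current_list
        if asset:
            assets.append(asset)
    return assets


# Abbreviated copy of the two asset blocks from the output above.
sample = """\
Microsoft.DotNet.Arcade.Sdk @ 9.0.0-beta.24106.3
Branch:        refs/heads/missymessa-template-testing
BAR Build Id:  211531
Channels:
Locations:
- https://dev.azure.com/dnceng/internal/_apis/build/builds/2371453/artifacts (Container)

Microsoft.DotNet.Arcade.Sdk @ 9.0.0-beta.24106.3
Branch:        main
BAR Build Id:  211527
Channels:
- .NET Eng - Validation
Locations:
- https://dev.azure.com/dnceng/internal/_apis/build/builds/2371408/artifacts (Container)
"""

assets = parse_assets(sample)
in_validation = [
    a["BAR Build Id"] for a in assets if ".NET Eng - Validation" in a["Channels"]
]
print(in_validation)  # prints ['211527']
```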

@garath garath changed the title from "Build failed: dotnet-arcade-validation-official/main #20240206.4" to "arcade-validation fails when multiple asset name+version instances involved" Feb 7, 2024
@riarenas
Member

riarenas commented Feb 7, 2024

OOOOH, this is only an issue because we are using a different pipeline to produce the same assets, so there's a conflict in versions that would never happen in ordinary circumstances, as the version is partially determined by the build number.
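To make the collision concrete: both builds in this thread report Build Number 20240206.3 and both produced version 9.0.0-beta.24106.3. The Python sketch below reproduces that suffix under the assumption that the date stamp is computed as YY * 1000 + MM * 50 + DD, which is my understanding of Arcade's versioning scheme and is consistent with the numbers here; treat the formula and the `arcade_version_suffix` name as illustrative rather than authoritative.

```python
def arcade_version_suffix(build_number: str) -> str:
    """Map a build number of the form 'yyyymmdd.r' to 'SHORT_DATE.r'."""
    date_part, _, revision = build_number.partition(".")
    year = int(date_part[:4])
    month = int(date_part[4:6])
    day = int(date_part[6:8])
    # Assumed formula: two-digit year * 1000 + month * 50 + day.
    short_date = (year % 100) * 1000 + month * 50 + day
    return f"{short_date}.{revision}"


print(arcade_version_suffix("20240206.3"))  # prints 24106.3
```

Two pipelines that emit the same build number on the same day therefore produce the same package version, which is why the name+version lookup returned two assets.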

The change to filter by channel is a good change, even if this scenario is something that usually shouldn't happen. Good catch.

@missymessa
Member

I'll be deleting the pipeline I created for testing the 1ES PT stuff when I'm done with it, but this would be a good change in case it happens in the future :)
