Prerelease 0.21 cannot install any models #85

Closed
6 tasks done
CodeGat opened this issue May 24, 2024 · 10 comments

Comments

@CodeGat
Member

CodeGat commented May 24, 2024

See https://github.com/ACCESS-NRI/ACCESS-OM2/actions/runs/9203230162/job/25355101630#step:7:91, https://github.com/ACCESS-NRI/ACCESS-OM2/actions/runs/9216320542/job/25356375475

When deploying onto Prerelease 0.21, we get errors relating to the initial checking of the spack install itself:

'/bin/git' '-c' 'advice.detachedHead=false' 'fetch' '--tags'
ProcessError: Command exited with status 1:

@aidanheerdegen had mentioned that this was an issue when we were cloning our initial spack environments with --single-branch, but git status shows that we are currently on the releases/v0.21 branch, not in a detached HEAD state.

Verified that this happens for both existing and new PRs of models, so this issue holds up deployments of models more generally. It was working earlier that day (23/05/2024), see ACCESS-NRI/ACCESS-ESM1.5#5 (comment).

Things to do

  • Verify that the output of spack config blame looks alright
  • Verify that ACCESS-NRI/spack hasn't vanished
  • Verify that .git seems okay in spack
  • 😭
  • Determine if it is fetching the tags of something else - not necessarily the spack install. Maybe it is actually fetching the tags of access-om2, perhaps?
  • Figure out if the new SSH key on tm70_ci is to blame

If all else fails:

  • Delete the prerelease environment, initialise a new prerelease environment, redeploy all open model PRs
@CodeGat
Member Author

CodeGat commented May 24, 2024

When doing a git log I note that only one commit shows up, and it is grafted:

[tm70_ci@gadi-login-05 spack]$ git log
commit f17445a582bb076993f88fef0e0e121a2062a816 (grafted, HEAD -> releases/v0.21, origin/releases/v0.21)
Author: Harshula Jayasuriya <[email protected]>
Date:   Wed Apr 17 16:15:49 2024 +1000

    fms: add large_file and internal_file_nml variants
    
    * Micael will submit this patch upstream

When I run git fetch --tags on spack itself, it runs fine and has a 0 exit code:

[tm70_ci@gadi-login-05 spack]$ git fetch --tags --verbose
POST git-upload-pack (472 bytes)
From https://github.com/access-nri/spack
 = [up to date]        releases/v0.21 -> origin/releases/v0.21

@CodeGat
Member Author

CodeGat commented May 24, 2024

Hmmm...I wonder if the newly-added ssh key to tm70_ci is being used in the git commands under the hood? Even if so, access-bot has access to ACCESS-NRI/spack (it is public, after all) and should be able to fetch tags.

@aidanheerdegen
Member

Is there a permissions issue on the .git folder? Has a user unknowingly altered it, making it impossible for the service user to make subsequent changes?

@CodeGat
Member Author

CodeGat commented May 24, 2024

I'm logged in as the service user and can run the commands fine. I checked the ACLs and they look fine.

@CodeGat
Member Author

CodeGat commented May 24, 2024

I've done a git fetch --unshallow on the prerelease 0.21 to make the commits in the repo ungrafted. Will see if this has any effect on the deployment. EDIT: No dice - https://github.com/ACCESS-NRI/ACCESS-OM2/actions/runs/9203230162/job/25358081104
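
The single grafted commit in the git log output earlier in this thread is what a shallow clone looks like. As a minimal sketch, assuming throwaway repositories under a temporary directory rather than the real prerelease spack clone (every path and commit message below is hypothetical), the state before and after git fetch --unshallow can be checked like this:

```shell
#!/bin/sh
# Sketch only: uses throwaway repos under a temp dir, not the real spack clone.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" commit -q --allow-empty -m "older commit"
git -C "$work/upstream" commit -q --allow-empty -m "newer commit"

# --depth 1 gives a shallow, single-branch clone. The file:// prefix is needed
# because git ignores --depth when cloning from a plain local path.
git clone -q --depth 1 "file://$work/upstream" "$work/shallow"

shallow_before=$(git -C "$work/shallow" rev-parse --is-shallow-repository)
commits_before=$(git -C "$work/shallow" rev-list --count HEAD)

# Fetch the missing history so the tip commit is no longer grafted.
git -C "$work/shallow" fetch -q --unshallow

shallow_after=$(git -C "$work/shallow" rev-parse --is-shallow-repository)
commits_after=$(git -C "$work/shallow" rev-list --count HEAD)

echo "before: shallow=$shallow_before commits=$commits_before"
echo "after:  shallow=$shallow_after commits=$commits_after"
```

This only confirms whether a clone is shallow; as the EDIT above notes, unshallowing did not change the deployment result here.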

@aidanheerdegen
Member

Is there any significance to the git path being /bin/git?

$ /bin/git --version
git version 2.39.3

I just tried it on another repo, and that git seems happy enough with the syntax:

$ /bin/git -c advice.detachedHead=false fetch --tags
remote: Enumerating objects: 21, done.
remote: Counting objects: 100% (18/18), done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 21 (delta 9), reused 17 (delta 9), pack-reused 3
Unpacking objects: 100% (21/21), 5.61 KiB | 20.00 KiB/s, done.

@CodeGat
Member Author

CodeGat commented May 24, 2024

I set GIT_TRACE and GIT_CURL_TRACE on the service user, and can't get any useful output before it exits: https://github.com/ACCESS-NRI/ACCESS-OM2/actions/runs/9203230162/job/25359259753#step:7:352
But on earlier successful steps, I get a fair bit (this is for fetching spack-packages): https://github.com/ACCESS-NRI/ACCESS-OM2/actions/runs/9203230162/job/25359259753#step:7:30

I'm not certain if there is significance regarding the git version.

I'll try and see if there are any environment variables I can set on the service user so I can generate some proper spack logs...annoying that we can't even see what it is fetching...
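
For reference, a hedged sketch of capturing git trace output to a file, which can help when a wrapping tool swallows stderr. It assumes the variable names listed in Git's own documentation (GIT_TRACE, and GIT_TRACE_CURL for HTTP traffic) and runs against throwaway repositories; the trace.log file name and all paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: throwaway repos; trace.log is a hypothetical file name.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" commit -q --allow-empty -m "initial"
git clone -q "$work/upstream" "$work/clone"

# An absolute path makes git append trace lines to that file instead of stderr.
# GIT_TRACE_CURL only produces output for HTTP(S) remotes, so it stays silent
# for this local remote; it is shown for completeness.
fetch_status=0
GIT_TRACE="$work/trace.log" GIT_TRACE_CURL="$work/trace.log" \
    git -C "$work/clone" fetch --tags || fetch_status=$?

echo "fetch exit status: $fetch_status"
echo "trace lines: $(grep -c 'trace:' "$work/trace.log")"
```

Because the variables are inherited by child processes, exporting them in the service user's environment should also cover git commands started by other tools, though whether spack passes them through unchanged is an assumption.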

@CodeGat
Member Author

CodeGat commented May 24, 2024

Found the cause of it.
Spack runs git fetch --tags on its local copy of ACCESS-OM2, and the fetch fails because it would clobber existing tags. This is caused by our 'rolling tag' logic in prereleases, in which the latest commit in the model PR is tagged with the final version of the model SBD (for example, [email protected] means that the HEAD of the PR branch will carry the tag 2024.04.21).
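
The failure mode can be reproduced outside of spack. The following is a minimal sketch, assuming Git 2.20 or newer (where a plain fetch refuses to move an existing tag) and using throwaway repositories; the paths and commit messages are hypothetical and nothing here touches a real spack install.

```shell
#!/bin/sh
# Sketch only: reproduce a rejected tag update with throwaway repos.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/remote"
git -C "$work/remote" commit -q --allow-empty -m "first PR commit"
git -C "$work/remote" tag 2024.04.21
first=$(git -C "$work/remote" rev-parse HEAD)

# The local copy picks up the tag at its original position.
git clone -q "$work/remote" "$work/local"

# The "rolling tag": a new commit lands and the same tag is moved onto it.
git -C "$work/remote" commit -q --allow-empty -m "second PR commit"
git -C "$work/remote" tag --force 2024.04.21 >/dev/null

# Same invocation as in the failing CI log; the moved tag should be rejected.
fetch_status=0
git -C "$work/local" -c advice.detachedHead=false fetch --tags \
    2>"$work/fetch.err" || fetch_status=$?

cat "$work/fetch.err"
local_tag=$(git -C "$work/local" rev-parse refs/tags/2024.04.21)
echo "fetch --tags exit status: $fetch_status"
```

The local tag stays on the first commit after the rejected fetch, which is why the fetch keeps failing on every later deployment until the tags are reconciled.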

To fix this:

  • Permanently: Remove the rolling tag logic from prereleases. Only tag the first commit on the PR, and when the PR is merged, the merge commit.
  • Unblock ACCESS-OM2 Releases: Find the ACCESS-OM2 repo in the spack install and do a git fetch --tags --force to update the tags one last time.
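
As a sketch of the second option, assuming throwaway repositories again (all paths are hypothetical), the moved tags can be listed before the forced fetch overwrites them. Note that in this thread the forced fetch was followed by a full redeploy, so treat this as an illustration of the git behaviour rather than a tested recovery procedure for a spack install.

```shell
#!/bin/sh
# Sketch only: list tags that moved on the remote, then force-update them.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/remote"
git -C "$work/remote" commit -q --allow-empty -m "one"
git -C "$work/remote" tag 2024.04.21
git clone -q "$work/remote" "$work/local"
git -C "$work/remote" commit -q --allow-empty -m "two"
git -C "$work/remote" tag --force 2024.04.21 >/dev/null

# Compare each remote tag with the local tag of the same name.
# --refs hides the peeled "^{}" entries that annotated tags would add.
git -C "$work/local" ls-remote --tags --refs origin >"$work/remote-tags"
moved=""
while read -r sha ref; do
    name=${ref#refs/tags/}
    local_sha=$(git -C "$work/local" rev-parse -q --verify "refs/tags/$name") || continue
    if [ "$sha" != "$local_sha" ]; then
        moved="$moved $name"
    fi
done <"$work/remote-tags"
echo "moved tags:$moved"

# --force lets the fetch overwrite local tags that differ from the remote.
git -C "$work/local" fetch -q --tags --force
tag_after=$(git -C "$work/local" rev-parse refs/tags/2024.04.21)
remote_head=$(git -C "$work/remote" rev-parse HEAD)
echo "local tag now at: $tag_after"
```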

@CodeGat
Member Author

CodeGat commented May 24, 2024

Redeploying, as the second dot point broke the installation.

@CodeGat
Member Author

CodeGat commented May 27, 2024

Fixed! Reinstalled the prerelease environment from scratch and it now works (although the old environments are no longer there).

@CodeGat CodeGat closed this as completed May 27, 2024

2 participants