Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

groovyw module update-all improvements #4009

Merged
merged 18 commits into from
Jun 10, 2020

Conversation

cvennel
Copy link
Contributor

@cvennel cvennel commented May 31, 2020

Contains

Fixes #3661 Parts 1 and 2 (now fixes all three parts)

Notes:

  • I would have rather done a full diff of everything that changed in an update instead of listing the file changes per commit, but grgit doesn't seem to have the diff capability on anything other than single commits until this PR is merged Fixes ajoberstar/grgit#167 ajoberstar/grgit#318

  • I'm not fully sold that skipping an update is the best approach if a module has recently been updated. Running a few extra git-fetches in the event an update needs to be re-run isn't that expensive in the grand scheme, and personally I'd rather be able to ensure I have the latest by re-running the pull. Maybe some sort of "force update" option would be a good compromise here? So skip updating the module unless something like -f was passed?

  • I didn't understand Part 3 of the issue

How to test

Skip updating module if recently updated:

cd modules/<module name>
git pull
cd ../..
groovyw module update-all

Simulate updating with 5 commits:

cd modules/<module name>
git reset --hard HEAD~5
cd ../..
groovyw module update-all

Confirm subdirectories in .gitignore do not block parent directory

in .gitignore change .settings/ to .settings/hide
Confirm that .settings is no longer ignored in groovyw module update-all

Sample output

In the sample screenshot we see:

  • .Settings is ignored due to being in the .gitignore
  • CakeLie already up to date
  • CoreSampleGameplay recently updated within 2 minutes, so skip
  • Health pulls 5 commits to update
  • ModuleTestingEnvironment already up to date

(Image updated with .Settings skipped to honor .gitignore)
update-all-output

@skaldarnar
Copy link
Member

Thanks @cvennel - the sample output looks neat. I'll get round reviewing this later this eveneing (hopefully 🙈 ). Here are my initial 2 cents on the topic:

Part 3 refers to the fact that the script is blindly trying to update the "module" .settings even though the folder is in .gitignored.

I'm not sure what the benefit of skipping should be in any case - I hope we are not optimizing for bandwidth, and pointlessly waiting for two minutes or doing the commands manually seems silly. I thought that git pull (doing fetch and merge) is already kind of smart to check that no updates are available.

@skaldarnar skaldarnar self-requested a review May 31, 2020 14:13
@skaldarnar skaldarnar added Type: Improvement Request for or addition/enhancement of a feature Category: Build/CI Requests, Issues and Changes targeting gradle, groovy, Jenkins, etc. labels May 31, 2020
@cvennel
Copy link
Contributor Author

cvennel commented May 31, 2020

Hi @skaldarnar. Thanks for the clarification on part 3.

Regarding part 2, I agree with you, I'm not seeing the benefit of skipping as described in part 2, but it was in the issue so I wrote it up :) I have it as a separate commit to make it easy to back out if that's what we end up deciding to do.

I'll add some more to handle part 3.

@cvennel
Copy link
Contributor Author

cvennel commented May 31, 2020

Ok, now the .setting module should be ignored since it's in the .gitignore.
We also want to make sure subdirectories in the .gitignore don't block the parent directory. I'll add more details to the description and how to test

Comment on lines 191 to 204
def searchString = itemName
if (itemName.startsWith(".")){
// add literal slash for regex start with '.'
searchString = "\\$itemName"
}
def inGitIgnore = false
new File(".gitignore").eachLine{ line ->
// match if line is exactly same os itemName or has trailing '/' but
// not if has further subdirectories
if ((line ==~ searchString) || (line ==~ "$searchString/")){
inGitIgnore = true
return
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be extracted to its own method for better readability.

Doing so will also make it easier to spot bugs or find the right location for doing improvements. We're missing that .gitignore files can also appear in subfolders and "add up" to each other. Not a huge deal, but ideally we could use some library to do this work for us...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this now checks every .gitignore. I also added some output to tell you which .gitignore is blocking the module. See sample below:
update-all-output-all-gitignores

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this correclty, modules/foo will be ignore now if mentioned in sandbox/.gitignore? I don't think that's how the ignored files are computed 🧐

Some references:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not used to multiple .gitignores. I had a section computing the relative path but realized I made a mistake and was finding the relative path to the root dir instead of the modules dir. Then the build failed after backing out the relative path stuff which surprised me. Wanted to get more info on that before continuing the relative path

config/groovy/common.groovy Outdated Show resolved Hide resolved
@cvennel cvennel marked this pull request as draft June 1, 2020 22:27
@cvennel cvennel marked this pull request as ready for review June 2, 2020 01:13
@cvennel
Copy link
Contributor Author

cvennel commented Jun 2, 2020

@skaldarnar, do you mind telling me what check failed?

@skaldarnar
Copy link
Member

The analytics step failed in our Jenkins pipeline:

image

It failed with

FAILURE: Build failed with an exception.

* What went wrong:
Gradle build daemon disappeared unexpectedly (it may have been killed or may have crashed)

I'm restarting the build.

@cvennel cvennel changed the title groovyw module update-all improvements WIP: groovyw module update-all improvements Jun 2, 2020
@keturn
Copy link
Member

keturn commented Jun 2, 2020

I haven't been following this, just noticing bits as they come across the activity feed, but maybe https://git-scm.com/docs/git-check-ignore is interesting to you?

@cvennel
Copy link
Contributor Author

cvennel commented Jun 2, 2020

I haven't been following this, just noticing bits as they come across the activity feed, but maybe https://git-scm.com/docs/git-check-ignore is interesting to you?

@keturn I hadn't seen that before, thanks for the new info! Unfortunately it looks like this would ignore every module since the main .gitignore has /modules/* in it which can't change otherwise the main repo would get all the modules stuff.

I also don't think the .gitignores below the modules directory should be able to control each other except maybe within .settings (putting ../health in the .gitignore file in the CakeLie module shouldn't allow blocking updates for health)

So I think this is in a weird state of trying to honor the .gitignores, but since the .gitignore would ignore everything we can't trust the .gitignore. Would we be better off creating another settings file to control this like maybe a .moduleignore file in the modules dir?

Using a dedicated module update config isn't as slick, but the only other thing I can think of at the moment is manually parsing each config file looking for an exact match of the relative path to a module in addition to the main .gitignore's /modules/* setting. This feels wrong to me since it would seem redundant to have both /modules/* and /modules/.settings/ in the gitignore when in reality they serve two completely different purposes.

sorry for the long rambling message, kind of unsure which way to take this now.

@Cervator
Copy link
Member

Cervator commented Jun 3, 2020

Hey @cvennel - great work here, really appreciate seeing it and am looking forward to trying it :-)

For the .gitignore thing while it sounds cool to use it in a dynamic fashion, if it is getting too complex would it be an option simply to hardcode ignoring anything not starting with an alphanumeric character? That'll catch all the metadata type directories like .settings and about anything else I could think of. Anything starting with an actual letter (or possibly number) likely should be a module or we're being goofy.

On skipping recently updated modules I could see that maybe being handy in a mega-workspace - if you're trying to update 200 modules and a few fail from being dirty it might be nice to clean them up then just run again (I've hit that occasionally and been reluctant to run the whole thing again). Although who knows if 2 minutes would be enough for that, and trying to make that more advanced would probably again just overcomplicate things.

Edit: Oh! And as a general practice feel free to leave a PR in Draft stage rather than using the old school WIP prefix. Much more visible that way :-)

cvennel added 4 commits June 3, 2020 19:13
Add skip any modules that start with non-alphanumeric character

Set up --continue flag for skiping recently updated modules
@cvennel cvennel marked this pull request as draft June 4, 2020 00:07
@cvennel cvennel changed the title WIP: groovyw module update-all improvements groovyw module update-all improvements Jun 4, 2020
@cvennel
Copy link
Contributor Author

cvennel commented Jun 4, 2020

New updates:

part 1) none

part 2)For the .gitignore section, I took @Cervator's suggestion and hard-coded skipping non-alphanumeric modules since it was just getting too goofy with the .gitignore follow but don't follow problem.

part 3) I made skipping modules an option. By default, we'll let git try and fetch each module and progress as normal. However, if so desired, you can now run groovy update-all --continue which will skip anything that has been updated within 10 minutes, I think this strikes a good balance between various suggestions from @Cervator and @skaldarnar

Normal
groovyw module update-all
update-all


Continue
groovyw module update-all --continue
update-all-continue

@cvennel cvennel closed this Jun 4, 2020
@cvennel cvennel reopened this Jun 4, 2020
@cvennel
Copy link
Contributor Author

cvennel commented Jun 4, 2020

and regarding the WIP prefix, I lost track of the draft button and habits from using Gitea at work kicked in, oops :) thanks for the reminder about draft

@cvennel cvennel marked this pull request as ready for review June 4, 2020 00:26
@cvennel cvennel requested a review from skaldarnar June 4, 2020 00:26
Copy link
Member

@Cervator Cervator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests out for me 👍 Skipped a couple gibberish directories I added with non-alphanum, updated what I reset back a few commits, and reported nicely.

A couple oddities that might just be my environment:

  • On Win10 both on a basic command prompt and inside the IntelliJ terminal the color highlights don't work and I just get some gibberish characters. Not new with this PR, was the same before. I bet if I ask Google there's a way to make it work, but on the other hand I wonder if we can do something to fix it from our side or avoid it if the user's terminal isn't set up to handle the colors? On my Mac on the other hand it looks perfect :-)
  • I tried the --continue flag after first running an update normally, which ended up not actually needing to update anything. On the subsequent run with --continue oddly two of the module (of about a dozen) skipped due to recent updates, while the rest just did the "updating module x" then followed by "No changes found". That's a convenience at best so even if it doesn't work perfectly it doesn't really block merging :-)

Over to you @skaldarnar 👋

@cvennel
Copy link
Contributor Author

cvennel commented Jun 4, 2020

  • I tried the --continue flag after first running an update normally, which ended up not actually needing to update anything. On the subsequent run with --continue oddly two of the module (of about a dozen) skipped due to recent updates, while the rest just did the "updating module x" then followed by "No changes found". That's a convenience at best so even if it doesn't work perfectly it doesn't really block merging :-)

The condition will only skip updates if the .git/FETCH_HEAD has been updated, so a git fetch that results in no changes may not change the file and such would not cause that module to be skipped. I think in testing I had to git pull in the module directory because git fetch didn't update the file but I can't fully remember. I bet we could do something simple like just open the FETCH_HEAD while iterating through to change the system time of the last update on the file. That would cause every module you fetched to skip as well.

@cvennel cvennel marked this pull request as draft June 4, 2020 14:55
@@ -215,8 +250,33 @@ class common {
println color("uncommitted changes. Skipping.", Ansi.YELLOW)
} else {
println color("updating $itemType $itemName", Ansi.GREEN)
File targetDirFetchHead = new File("$targetDir/.git/FETCH_HEAD")
def timeLimit = use(groovy.time.TimeCategory){ 10.minute }
if (skipRecentUpdates && isRecentlyUpdated(targetDirFetchHead, timeLimit)){
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized in a newly cloned module the FETCH_HEAD may not exist. I need to add some sort of check to make sure the file exists before running isRecentlyUpdated()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reading stuff directly out of .git/ makes me anxious.

I won't say it's never a good idea. In practice, those sorts of files usually have pretty stable formats. But if you can do it through your git client — which in this case looks like is Grgit wrapping JGit — do prefer that. Then they can be the ones responsible for handling whether or not files exist and what settings are in them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actually reading the file, just checking the last time the file was modified by git.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still vote for removing this "feature" 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm not married to the concept of that feature personally, I thought it could be a cool bit of utility if it would work out easily enough, but if it is becoming too complex I'm not at all opposed to going without :-)

I just retried it, ran groovyw module update-all and one module updated. Ran groovyw module update-all --continue and all skipped. So it seemed like it worked 👍 Whether it actually is meaningful at this point I'm less sure about.

We had more cases in the past where an update-all in a mega-workspace could hit one module where the remote was gone (say the user had been testing some other users module and it had been deleted on GitHub) which would break the whole update. But I just tried to break a module's Git config and the process caught that gracefully, skipped it, and continued on its own.

Still, that could just be one case, I'm not sure if there might be others. It seems like it works perfectly at this point and it is optional if passing in --continue - maybe it'll still be useful?

@cvennel
Copy link
Contributor Author

cvennel commented Jun 4, 2020

I just added checks to make sure the fetch_head exists and now update the modified time whenever groovyw module update-all is run. This fixes the behavior that @Cervator noted:

On the subsequent run with --continue oddly two of the module (of about a dozen) skipped due to recent updates, while the rest just did the "updating module x" then followed by "No changes found".

@cvennel cvennel marked this pull request as ready for review June 4, 2020 23:32
@@ -117,8 +117,14 @@ switch (cleanerArgs[0]) {
case "update-all":
println "We're updating every $itemType"
println "List of local entries: ${common.retrieveLocalItems()}"
for (item in common.retrieveLocalItems()) {
common.updateItem(item)
if (cleanerArgs.contains("--continue")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is some help message defined somewhere (printUsage) - could you please add a note about this flag there?
Would --skipRecentlyUpdated be a better name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original reason I chose --continue is it lines up with some git commands so I thought it would be easier to remember. For example, If a git merge fails due to conflicts, you resolve the conflicts and call git merge --continue. I also did the two dashes convention for full words (-u is short for --set-upstream).

I'm going change this to -skip-recently-updated to closer match the other flags listed in the printUsage.

config/groovy/util.groovy Outdated Show resolved Hide resolved
config/groovy/common.groovy Outdated Show resolved Hide resolved
config/groovy/common.groovy Show resolved Hide resolved
config/groovy/common.groovy Outdated Show resolved Hide resolved
config/groovy/common.groovy Outdated Show resolved Hide resolved
Comment on lines 253 to 261
File targetDirFetchHead = new File("$targetDir/.git/FETCH_HEAD")
def timeLimit = use(groovy.time.TimeCategory){ 10.minute }
if (skipRecentUpdates && targetDirFetchHead.exists() && isRecentlyUpdated(targetDirFetchHead, timeLimit)){
println color("Skipping update for $itemName: updated within last $timeLimit", Ansi.YELLOW)
return
}
if (targetDirFetchHead.exists()){
targetDirFetchHead.setLastModified(new Date().getTime())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should at least be extracted to a method for better readability.

I think this bit clutters our code, and in my opinion for no apparent reason other than "because it is technically possible". We are not using Github API to worry about rate limiting, so I don't understand why we would ever need this feature.

@Cervator I'd rather have 20-30 lines of code less to maintain (and/or go stale)...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the main bit of hesitancy I've got now that my original use case doesn't result in the update halting early. If that still happened in mega-workspaces (I've hit that regularly in the past, having various GCI modules etc) then I'd still go for this for sure. But now, that I'm not sure it'll ever be needed? Conflicted 😕

To be clear the time it takes to chug through a 200 module megaworkspace with an update-all possibly several times (if something breaks) was the motivation, less so the stress on GitHub. Although my digital treehugger self does lose a tiny bit of sleep at night over that as well :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't fully extract to a new method, but did make it more readable. It was looking to me like extracting wouldn't make it much more readable, let me know what you think

config/groovy/common.groovy Show resolved Hide resolved
@skaldarnar
Copy link
Member

Okay, let's get this merged (see pending comments) with the skipping feature for now.

In addition, bringing @cvennel 's attention to #4035 for more build and logistics magic 😉 Ever wanted to try out Kotlin? 🤓

@cvennel
Copy link
Contributor Author

cvennel commented Jun 8, 2020

I should be able to work on this tonight after work. I had to finish up some stuff for school this weekend . And I like kotlin :) I'll take a look at #4035 too

@cvennel cvennel marked this pull request as draft June 8, 2020 09:38
Note: 'get' + 'recurse' only. This will override an alternativeGithubHome set via gradle.properties.
'-simple-list-format' to print one item per row for the 'list' sub-command, even for large numbers of items
'-condensed-list-format' to group items by starting letter for the 'list' sub-command (default with many items)
'-skip-recently-updated' (Only for update-all) to skip updating modules that have already been updated within 10 minutes
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the new line. The rest is removing unnecessary println statements in favor of a multi-line string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be clear, I meant line 327

@cvennel cvennel marked this pull request as ready for review June 8, 2020 23:51
@cvennel cvennel requested a review from skaldarnar June 8, 2020 23:51
@skaldarnar skaldarnar merged commit bc4f1a8 into MovingBlocks:develop Jun 10, 2020
@Cervator Cervator added this to the v4.0.0 milestone Jun 11, 2020
@Cervator
Copy link
Member

Thank you very much for the work @cvennel :-)

And for the review and support @skaldarnar 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Category: Build/CI Requests, Issues and Changes targeting gradle, groovy, Jenkins, etc. Type: Improvement Request for or addition/enhancement of a feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enhance the update-all groovyw utility script further with better logging and resume capability
4 participants