feat: add github_release strategy #15260

SMillerDev · 2023-04-18T13:44:05Z

Have you followed the guidelines in our Contributing document?
Have you checked to ensure there aren't other open Pull Requests for the same change?
Have you added an explanation of what your changes do and why you'd like us to include them?
Have you written new tests for your changes? Here's an example.
Have you successfully run brew style with your changes locally?
Have you successfully run brew typecheck with your changes locally?
Have you successfully run brew tests with your changes locally?

This adds a middle way between the git strategy and the github_latest strategy allowing us to check releases even if there are multiple release streams.

Library/Homebrew/livecheck/strategy/github_release.rb

Library/Homebrew/livecheck/strategy/github_latest.rb

Library/Homebrew/livecheck/strategy/github_release.rb

samford

Before I dig into reviewing this, one fundamental issue here is the use of the GitHub API. We've specifically avoided using it in livecheck strategies/blocks for years because of the rate limit involved. We had a relatively small number of checks using the GitHub API back in the Homebrew/homebrew-livecheck days but the checks would periodically fail in relation to the rate limit.

If I'm remembering correctly, the unauthenticated rate limit is 60 requests/hour and authenticated is 5,000/hour. Simply using homebrew/core for comparison, we have ~3100 formulae using the Git strategy in relation to a GitHub repository and ~275 formulae using GithubLatest. The high priority of the GithubRelease strategy here will cause it to be used for those ~3100 formulae that have been using Git up to now, so we're looking at nearly ~3,400 GitHub API requests for a homebrew/core run.

Unless you're authenticated and you have enough rate limit remaining to be able to execute all those checks, you'll seemingly end up with a ton of errors instead of meaningful data. The changes in this PR would essentially require livecheck users to set a GitHub API token to use livecheck in a meaningful way (since the unauthenticated limit is painfully low). [It probably goes without saying but we would want to keep track of the rate limit within livecheck and skip checks once the limit is reached (until the reset time), so we wouldn't burden the API with failing requests.]

Another related issue is that we run brew livecheck on core/cask CI as part of tests and we would need to make sure that GitHub API requests from livecheck don't impact the rate limit for our other API requests. The last thing we want is for livecheck to eat the rate limit and cause CI to fail.

That said, I'm currently working on doing comparative runs using the ~3,400 affected formulae to observe the changes (I'll go through casks after). I'm seeing errors when running related checks (unrelated to rate limit), so I'll come back with some data once I've got this working enough to run the checks without erroring.

SMillerDev · 2023-04-18T15:47:18Z

Unless you're authenticated and you have enough rate limit remaining to be able to execute all those checks, you'll seemingly end up with a ton of errors instead of meaningful data. The changes in this PR would essentially require livecheck users to set a GitHub API token to use livecheck in a meaningful way (since the unauthenticated limit is painfully low).

Without it it will be hard to do anything with the results and it's a debug command, I don't see this as a bad thing.

samford

Across the ~3,100 related checks in homebrew/core (excluding the 276 GithubLatest checks), this PR causes ~541 checks (~17%) that work properly with Git to produce an error instead (e.g., abcm2ps, acl2, afio, etc.). [2,356 checks (~76%) return the same version with Git and GithubRelease.]

497 of these failures are because the repository has no releases. We would have to add/modify livecheck blocks to explicitly use strategy :git, due to GithubRelease's high priority [taking precedence over Git].
10 of the errors are because the only available releases are old versions, unstable versions, etc., so we have to check the tags to properly identify versions.
30 of these errors are because the GithubRelease strategy is using the standard regex for tags like 1.2.3/1.2.3 as the default and these repositories use an incompatible tag format (e.g., 20190702, 2023-04-12, r56, something-131, 2020-12-R1, etc.). The Git strategy's permissive default regex (/\D*(.+)/) works in almost all of these 30 cases. [On at least one occassion, I've explained in depth why we use a permissive default regex in Git instead of the standard regex (though it was a homebrew/core PR I've lost track of) and I still think it's the better default with respect to tag strings.]

197 checks (~6%) return a different newest upstream version when using GithubRelease compared to Git. 109 are a lower version than the Git version and 88 are higher. I haven't gone through these to determine the cause of the differences but it's likely some mixture of:

Checking name before tag_name (instead of the opposite)
Also checking the release name instead of only tag_name. [Using only tag_name would replicate the behavior of the Git strategy and allow existing regexes to work as expected.]
The more-specific strategy regex, as mentioned above
A tagged version not having a corresponding release. In some cases, the latest release version is what we want (i.e., the different version is more correct) and in other cases we have to check tag versions to get the correct version (i.e., the different version is incorrect).
A small number of repositories inevitably releasing an update in between runs (i.e., these changes aren't meaningful)

In the current state, the changes in this PR would:

Break hundreds of checks while only making a noticeable difference for a smaller number of checks
- Require us to add/modify hundreds of livecheck blocks to prevent/fix breakages (this would need to be done before merging)
Introduce rate limiting, which stands to be a pain for someone like me, as I do full tap runs on a regular basis (often back-to-back when evaluating livecheck changes like this). It's also not clear how we could reasonably avoid broken checks when we have enough related checks to exceed 5,000 API requests in a single homebrew/core or homebrew/cask tap run. We're currently at ~3,400 requests for homebrew/core (~1,200 for homebrew/cask) and this strategy would see the most growth as new formulae/casks are added, so it's not an idle concern.
Dramatically slow down related checks (GithubRelease can take several seconds for each check, whereas Git can run several checks in the same amount of time)
- I need to add some temporary timing logic to livecheck and re-run the checks to determine exactly how much slower GithubRelease is compared to Git
Increase the amount of fetched data involved with related checks (outside of a handful of repositories with an obscene number of tags), since the response includes a lot of release information that we don't usually need
- Similarly, I need to re-run the checks with my usual temporary logic that tracks the size of the response content to get a hard number (though those numbers don't factor in compression)

The current approach for these checks (Git, unless GithubLatest is required/appropriate) is working fine in the vast majority of cases, often without a livecheck block. Git is faster/lighter, doesn't involve rate limiting, etc., so I'm not convinced that GithubRelease would be a better default. Checking releases instead of tags only appears to make a meaningful difference in a small number of these checks, so I'm against using GithubRelease as a default strategy that's prioritized over Git.

In general, I would prefer not to use the GitHub API when we can avoid it (as we've been doing for years). If we end up moving forward with GithubRelease, I would rather see it manually applied only where it makes a difference (like GithubLatest), rather than automatically applying it to ~4,600 of formulae/casks.

MikeMcQuaid · 2023-04-19T09:16:50Z

Unless you're authenticated and you have enough rate limit remaining to be able to execute all those checks, you'll seemingly end up with a ton of errors instead of meaningful data. The changes in this PR would essentially require livecheck users to set a GitHub API token to use livecheck in a meaningful way (since the unauthenticated limit is painfully low).
Without it it will be hard to do anything with the results and it's a debug command, I don't see this as a bad thing.

I agree with this but we should probably actively fail without a token set if this is the case (or at least if you're querying more than one formula/cask).

MikeMcQuaid · 2023-04-19T09:19:10Z

On at least one occassion, I've explained in depth why we use a permissive default regex in Git instead of the standard regex

@samford I appreciate your lengthy comments in PRs and issues but: this is an example of how they are sometimes misplaced. If you're having to explain the same thing multiple times about how/why the code behaves a certain way then:

ideally the code will make this more obvious without comments
where comments are necessary, they must be short, concise and explain the reasoning here

Livechecks in general have a lot of documentation (which is great!) but sometimes they explain the obvious and avoid things like this that people are actually asking you about multiple times.

Hope that's helpful feedback ❤️

MikeMcQuaid · 2023-04-19T09:20:09Z

as I do full tap runs on a regular basis (often back-to-back when evaluating livecheck changes like this)

This should not be something we're optimising for as it's so far from the typical user experience. Similarly, this should be something we use automated tests for rather than relying on a single maintainer to do it manually.

MikeMcQuaid · 2023-04-19T09:21:18Z

so I'm against using GithubRelease as a default strategy that's prioritized over Git.
I would rather see it manually applied only where it makes a difference (like GithubLatest

This seems like a good suggestion. @SMillerDev any objections to this? Can we move forward with this for now?

SMillerDev · 2023-04-19T11:14:41Z

Yeah, seems fine. I've split out the migration to the API to make this review clearer though, so I'm focusing on that one first

SMillerDev · 2023-05-15T19:48:44Z

Rebased and pushed. Ready for some reviews

samford · 2023-05-16T05:51:17Z

I was thinking about rebasing this earlier today, so thanks for taking care of it. I've gone through the changes and they're generally what I expected. This basically inherits the logic that we already worked through in the recent GithubLatest PR, so I've simply pushed some commits to address areas that were overlooked or still needed to be updated (e.g., documentation comments, type signatures, etc.).

I also added tests for GithubRelease that cover all of the code outside of #find_versions. [One of my upcoming PRs will cover the parts of #find_versions that don't involve a network request, so I'll update that work to include this strategy.]

The only other remaining change that I would like to discuss is the strategy name. This strategy checks the releases API endpoint for a repository and that returns information for multiple recent releases.

With the API URL in mind (i.e., releases) and that the response JSON contains an array of release objects, I strongly think that this should be named GithubReleases instead. GithubRelease makes it sound like it would only return one release (e.g., like GithubLatest) and this may confuse users about its purpose and behavior.

If we can agree on this name change, I can go through this in the morning (EDT) to update all the instances of GithubRelease, github_release, etc.

Past that, I tested this across existing GithubLatest checks and those continue to work the same. I also did some manual testing of GithubReleases with a cask that it could be used for (replacing an existing livecheck block that manually iterates through releases using multiple requests) and it works as expected.

MikeMcQuaid · 2023-05-16T16:34:43Z

I strongly think that this should be named GithubReleases instead

I'm fine either way. If @SMillerDev agrees: go for it.

SMillerDev · 2023-05-16T17:03:07Z

I have no strong feelings about it so: sure, let's do it

samford · 2023-05-16T17:56:17Z

I was just about to push a commit with the changes (I just finished testing). A little more is required than what your commit covers, so do you mind if I overwrite it (i.e., force push) or would you prefer a follow-up commit instead?

Edit: I went ahead and created a follow-up commit with the remaining changes. I also tweaked my previous commit that added tests to modify the require call (to address some errors), hence the force push.

`GithubLatest` was updated to use parts of `GithubRelease` and this works fine within `brew livecheck` (since we `require` all the strategies) but we need to explicitly `require` `GithubRelease` in the `GithubLatest` test file now. Without this, we encounter errors in parts of `GithubLatest` where `GithubRelease` is referenced.

The initial documentation comments contained some remaining text referring to `GithubLatest` and hadn't been updated to incorporate the recent changes to the aforementioned strategy. This also reworks some of the language to better explain the strategy's function, application, etc.

Keeping the logic for generating the API URL in a method makes it testable, aligns with other strategies, and will help to enable some future work.

The `#versions_from_content` method requires a regex and this will be enforced by the type signature, so we don't have to check for the presence of a regex when handling a `strategy` block.

This adds tests to cover all of the strategy outside of the `#find_versions` method, which we don't currently test because it involves a network request.

samford · 2023-05-16T19:23:59Z

The most recent commit updates the main brew livecheck documentation to account for the recent GithubLatest/GithubReleases changes. Unless I'm forgetting something, I think that should be the last of this. Feel free to take a look through the recent changes but, if nothing else requires changes, I'll approve this once CI is green.

Past that, one thing I forgot to mention is that we also have a Homebrew::GitHubReleases class (note the capital H), so we may need to be a little careful to avoid a name collision. I've standardized existing references on the GithubReleases name (no capital H) and I think we should continue using that by default (instead of the GitHubReleases alias), as it may naturally help to avoid a collision and potentially reduce confusion.

Edit: As soon as I approved this, of course I noticed that I missed .rubocop.yml (i.e., ripgrep was omitting it since it's a [hidden] dot file). Updated.

SMillerDev requested review from samford and nandahkrishna April 18, 2023 13:44

SMillerDev force-pushed the feature/livecheck/github_release_strategy branch 2 times, most recently from 00e0492 to c1b7c52 Compare April 18, 2023 14:13

MikeMcQuaid reviewed Apr 18, 2023

View reviewed changes

samford reviewed Apr 18, 2023

View reviewed changes

samford reviewed Apr 19, 2023

View reviewed changes

SMillerDev mentioned this pull request Apr 19, 2023

feat: use API for GitHub latest release strategy #15270

Merged

7 tasks

SMillerDev marked this pull request as draft April 19, 2023 09:11

samford mentioned this pull request Apr 29, 2023

Livecheck: Fixes for Sorbet runtime #15334

Merged

7 tasks

SMillerDev force-pushed the feature/livecheck/github_release_strategy branch 2 times, most recently from a3c7a07 to c92f7af Compare May 2, 2023 13:03

SMillerDev marked this pull request as ready for review May 2, 2023 13:16

samford mentioned this pull request May 5, 2023

GitHub::API: Allow passing options to curl #15364

Closed

7 tasks

SMillerDev force-pushed the feature/livecheck/github_release_strategy branch from c92f7af to edf2f10 Compare May 15, 2023 19:47

MikeMcQuaid approved these changes May 16, 2023

View reviewed changes

SMillerDev and others added 3 commits May 16, 2023 14:18

feat: add github_release strategy

5139b9b

GithubLatest: Tweak documentation comments

1dd7556

samford and others added 7 commits May 16, 2023 14:18

GithubRelease: Omit unnecessary prefix

8a0f63e

GithubRelease: Update type signature

701f7c5

GithubRelease: Add #generate_input_values method

263f486

Keeping the logic for generating the API URL in a method makes it testable, aligns with other strategies, and will help to enable some future work.

GithubRelease: Simplify block handling

a2fb2b0

The `#versions_from_content` method requires a regex and this will be enforced by the type signature, so we don't have to check for the presence of a regex when handling a `strategy` block.

GithubRelease: Add tests

bed4737

This adds tests to cover all of the strategy outside of the `#find_versions` method, which we don't currently test because it involves a network request.

chore: rename GitHubRelease to GitHubReleases

940b63c

samford force-pushed the feature/livecheck/github_release_strategy branch 2 times, most recently from 656547e to 8c91280 Compare May 16, 2023 19:19

samford approved these changes May 16, 2023

View reviewed changes

samford added 2 commits May 16, 2023 16:08

Finish renaming GithubRelease to GithubReleases

104d30d

Update brew livecheck documentation

9b6e8f8

samford force-pushed the feature/livecheck/github_release_strategy branch from 8c91280 to 9b6e8f8 Compare May 16, 2023 20:08

SMillerDev merged commit 49b4a79 into Homebrew:master May 17, 2023

SMillerDev deleted the feature/livecheck/github_release_strategy branch May 17, 2023 09:10

github-actions bot added the outdated PR was locked due to age label Jun 17, 2023

github-actions bot locked as resolved and limited conversation to collaborators Jun 17, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add github_release strategy #15260

feat: add github_release strategy #15260

SMillerDev commented Apr 18, 2023

samford left a comment •

edited

Loading

SMillerDev commented Apr 18, 2023

samford left a comment

MikeMcQuaid commented Apr 19, 2023

MikeMcQuaid commented Apr 19, 2023

MikeMcQuaid commented Apr 19, 2023

MikeMcQuaid commented Apr 19, 2023

SMillerDev commented Apr 19, 2023

SMillerDev commented May 15, 2023

samford commented May 16, 2023

MikeMcQuaid commented May 16, 2023

SMillerDev commented May 16, 2023

samford commented May 16, 2023 •

edited

Loading

samford commented May 16, 2023 •

edited

Loading

feat: add github_release strategy #15260

feat: add github_release strategy #15260

Conversation

SMillerDev commented Apr 18, 2023

samford left a comment • edited Loading

Choose a reason for hiding this comment

SMillerDev commented Apr 18, 2023

samford left a comment

Choose a reason for hiding this comment

MikeMcQuaid commented Apr 19, 2023

MikeMcQuaid commented Apr 19, 2023

MikeMcQuaid commented Apr 19, 2023

MikeMcQuaid commented Apr 19, 2023

SMillerDev commented Apr 19, 2023

SMillerDev commented May 15, 2023

samford commented May 16, 2023

MikeMcQuaid commented May 16, 2023

SMillerDev commented May 16, 2023

samford commented May 16, 2023 • edited Loading

samford commented May 16, 2023 • edited Loading

samford left a comment •

edited

Loading

samford commented May 16, 2023 •

edited

Loading

samford commented May 16, 2023 •

edited

Loading