-
-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: use API for GitHub latest release strategy #15270
feat: use API for GitHub latest release strategy #15270
Conversation
381f2cf
to
49501c2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran out of time today but I'll come back to this tomorrow, as there are some areas that will need to be addressed. Off hand, switching to the API will break some existing GithubLatest
checks but I'll do comparative runs to find out how many (and which/why).
@samford Thanks for doing these! For the bus factor of the project: can you try and get whatever scripts you're using to do this into a |
I saw a couple that failed locally already because they strictly check for a href. Easy to fix those afterwards though once this is done, livechecks being broken only affect a small subset of people. |
For what it's worth, when I do comparative runs there isn't any anything special about the process that precludes anyone from doing the same (i.e., no scripts or arcane knowledge). I've explained the process here and there over the years (Nanda picked it up during GSoC) but it's essentially just diffing multiple
It's typically a manual process (as above) but, sometimes when I'm looking for specific changes, I might pop into It also may go without saying but it's important to make sure that any changes work properly before doing a JSON livecheck run, since the output isn't visible until the end. For example, I did some manual testing before doing the comparative run on this PR branch and I had to incorporate some fixes to make this work as intended. Even with that, there were some other errors that we'll need to address (but it won't require a full re-run in this case). As this PR currently stands, it will break ~104 existing checks out of 650 (across both formulae and casks). Most of these fail because they use a There are a few that fail with an error like I've added lists below for reference (I'll try to update this later to use the full name where there are cask name conflicts): Working -> Different versionListobs-websocket (5.0.1 -> 4.9.1; newest version is 5.0.1; "latest" release is 4.9.1 but description contains a link to 5.0.1 that the existing check matches; need to check all recent releases to identify the highest version) Working ->
|
Thanks @samford!
Turning this into a script or documentation would be very valuable! |
49501c2
to
93078f7
Compare
Since there were no real complaints I enabled automerge |
I'll work on reviewing this today, as there are some changes that I would like to see before this is merged. |
I didn't have as much time today as I wanted, so I'm still working on code changes (hoping to get them out Wednesday but it may be Thursday). In the interim time, one of the things we need to address before merging this or the There was a previous idea in the other PR to simply fail if Warning on larger runs makes sense, as it would allow some basic livecheck usage without a token. We're going to need to keep track of the rate limit response headers and adapt livecheck's behavior to it anyway (i.e., to skip API checks [that are guaranteed to fail] between when the rate limit is exhausted and reset), so assumedly we would display an error when a check fails due to the rate limit. The error message would inform users doing smaller runs without preventing them from using livecheck, so that may be a good balance. [The error message would need to be short, so we may need to add something to the documentation that we can link to (that explains how to resolve the issue, similar to how our manpage explains Extending that idea, we could identify the number of formulae/casks in a run that will use the GitHub API and fail if it's at or above a simple threshold (e.g., 50 without a token, 5,000 with). This wouldn't factor in how much of the rate limit is left (that would require a request to In general, we need to keep in mind that livecheck is also used in other parts of Homebrew ( |
Yeah nah, I don't want to rewrite the way brew deals with rate limits in case someone doesn't have a token set. If we get actual complaints we can always revisit it, but you're currently proposing a shitton of work in unrelated parts of brew for the benefit of avoiding something that will make the brew UX better for users if they do it. I'd rather drop the change request. |
Agreed.
Agreed.
Please don't. @samford I'd love you to show a little more flexibility here. I think what you're asking of @SMillerDev here is a little unreasonable. I'm also in the camp of "let's get to something with no known problems and, if we hit rate limit issues in CI, let's revert or fix forward" rather than blocking this PR on problems that may not actually manifest. |
I looked at
This is fine for Of the previously-mentioned requests, the only one I don't think we can reasonably defer is preventing GitHub API requests once the rate limit is exhausted. This is basic behavior when using any API with a rate limit and the related GitHub documentation specifically states, "If you are rate limited, you should not try your request until after the time specified by the I haven't looked into how we may address that, so I'm not sure whether we can modify the I'll have some time to work on reviewing the changes tomorrow (Thursday). However, before I dig further into reviewing this, it's worth mentioning that the GitHub API isn't the only way of moving I have a local branch where I've modified I believe there are fewer than 10 Thoughts on this alternative approach? |
My thoughts are mostly: GitHub provides an API for exactly what we want why parse headers or html? To avoid a possible rate limit? I just don't see any added value in your alternatives. |
Yep, the rate limit has always been the primary reason we've avoided the GitHub API in livecheck. That and sometimes there are API incidents (making it temporarily inaccessible) while the website is still accessible during those outages. In all the time we've used My opinion has always been that the API would be nice (though not necessary) if we could reasonably use it but the rate limit was always too much of a deterrent to me. There may be some growing pains here but maybe it will mostly be fine if we continue to be careful to only use Anyway, I figured I would mention the alternative, just in case. If it's good with you, I'll push some commits for straightforward changes that won't benefit from discussion (e.g., updating the documentation comments, small changes to align this with other strategies, etc.), to save you from having to incorporate a bunch of suggestions. Anything that we may want to discuss (not a lot, I don't think) I'll create a review comment/suggestion for and I can push commits once we agree. Does that sound good to you? |
In those cases you can use livecheck but not
But it did fail because they updated the website to lazyload downloads.
With the amount of
I think it would be nicer to add that to the |
This only affected casks that were checking release assets and some of them could be updated to simply check the tag instead. After dealing with the issue, we only have a handful where it's necessary to check release assets. But yeah, I agree with this point. It wouldn't have been an issue if we were checking the GitHub API. I said at the time that checking the API is the easiest solution but wanted to look into other options before dealing with the rate limit.
👍
I'm not sure I understand. If you're suggesting merging this as-is and doing follow-up changes in the Merging this will immediately break ~100 existing The change in this PR doesn't provide any immediate benefit but it does create issues, so we should take the time to do this right. |
Then that would be good as a suggestion because I don't see any fundamental changes that it still needs. It works and it works well in my tests. You were previously only talking about trivial changes though, so I'm not sure where the fundamental problems come from and why you haven't mentioned them for over two weeks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed a few commits to take care of simple changes that I don't think would benefit from discussion (comment if you disagree with anything but it all feels very benign). These review suggestions pertain to changes that affect the function of the strategy, one of which is a notable bugfix (i.e., this uses the full regex match as the version instead of the first capture group, so we get some versions like v1.2.3
).
I'll work on the other two items that I mentioned previously when I have a moment. Massaging the RateLimitExceededError
output for non-debug runs should be fairly simple. If preventing requests when we've reached the rate limit ends up being overly complicated or time-consuming, then we can take care of it in a follow-up PR (but we shouldn't put it off indefinitely).
Besides that, I need to work through the livecheck
blocks that this will break and create draft PRs before we merge this. Almost all of them just need very simple regex tweaks, so it shouldn't take much work but I want to take care of it before this is merged (instead of having to put out fires afterward).
868c45e
to
0a84cf7
Compare
`#generate_input_values` is called before `#versions_from_content` (in `#find_versions`), so it makes sense to move it above the latter.
The generated URL should reflect what we're actually checking, which is now the `/releases/latest` API URL.
It's standard for the `match_data` to include the URL (e.g., as in `PageMatch`). This uses the provided URL by default, switching to the generated URL when available.
ce5d565
to
403c005
Compare
Concerns about array vs object are addressed, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not convinced that we should check name
(the release title) by default, even as a fallback. I haven't seen any evidence that this would be a beneficial change (while it can be detrimental) and it doesn't solve any problem that we currently face.
It won't apply in 99+% of cases and, if it does, there's a chance it will only serve to surface an incorrect or incomplete version string. For example, here are a few recent "latest" releases in core/cask where the matched version from the title doesn't correspond to the version we need:
- librealsense 2.53.1: The "Intel® RealSense™ SDK 2.0 (v2.53.1)" title contains a matchable version that appears before the software version, so "2.0" would be returned by default instead of the version we need.
- monero 0.18.2.2: The version in the title is "2.2" but the full version we need is "0.18.2.2".
- dungeon-crawl-stone-soup-console 0.30.0: The version in the title is "0.30" but the full version we need is "0.30.0".
- ios-app-signer 1.14: The version in the title is "1.14.0" but, strangely enough, the version we need is "1.14".
I've seen plenty of issues like these in release titles because they're not as strictly formatted as tags. If we ever come across the unlikely case that we need to check release titles for version information instead of tags, it should be handled in a livecheck
block (like any other outlier).
403c005
to
efccc5d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've recently had a release and this has been through (too many) rounds of review so let's merge it and, if we cannot fix forward in a timely fashion and this blocks CI and/or we get user reports, we can revert.
Thanks @SMillerDev for the PR and @samford for the detailed review (but try to do it in a single pass next time 🙇🏻).
brew style
with your changes locally?brew typecheck
with your changes locally?brew tests
with your changes locally?Part of #15260
This is just the API switch to make sure we don't have to scrape HTML anymore.