-
Notifications
You must be signed in to change notification settings - Fork 618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include next_page
and prev_page
metadata on search
#1763
Conversation
☔ The latest upstream changes (presumably #1756) made this pull request unmergeable. Please resolve the merge conflicts. |
As discussed in the May 30th team meeting (https://discordapp.com/channels/442252698964721669/448525639469891595/583749739166695444), this is the first step in deprecating/removing numbered pagination for the "all crates" endpoint. The remaining steps are: - Open issues/PRs on bots which have set a user agent that lets us identify them, have them start following this link - Change our frontend to show "next/prev page" links on the all crates page - Stop returning the "total" meta item when the next/prev page links will be cursor based - Switch the links over to cursor based pagination when applicable - Error when given pagination parameters that would cause very high offsets Since the problem we're addressing only occurs when the offset is greater than ~18k, the only place that we currently need to worry about this is "all crates". There isn't currently any filter you can apply that will return enough rows for this to be a problem (`?letter=` is the most likely, but honestly that is probably the least useful page/endpoint we have and it might just be worth removing it. Either way AFAIK it's not hit by crawlers and humans aren't visiting page 900 in their browsers, so even if we do end up with enough crates for the offset to possibly be too high, it's unlikely anyone will ever hit it there). I'm not entirely sure how I want to structure the final logic for "give me the next/previous page". I've just stuck it at the bottom of the search function for now. We'll want something more structured in the long term, but I'd like to see what this looks like with the cursor based option before pulling it out into a more general abstraction.
73fdf87
to
cf5c284
Compare
@@ -31,7 +32,7 @@ mod prelude { | |||
crate::util::json_response(t) | |||
} | |||
|
|||
fn query(&self) -> HashMap<String, String> { | |||
fn query(&self) -> IndexMap<String, String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IndexMap
defaults to the RandomState
hasher from std
, so this looks fine from a DOS perspective.
@bors r+ |
📌 Commit cf5c284 has been approved by |
Include `next_page` and `prev_page` metadata on search As discussed in the May 30th team meeting (https://discordapp.com/channels/442252698964721669/448525639469891595/583749739166695444), this is the first step in deprecating/removing numbered pagination for the "all crates" endpoint. The remaining steps are: - Open issues/PRs on bots which have set a user agent that lets us identify them, have them start following this link - Change our frontend to show "next/prev page" links on the all crates page - Stop returning the "total" meta item when the next/prev page links will be cursor based - Switch the links over to cursor based pagination when applicable - Error when given pagination parameters that would cause very high offsets Since the problem we're addressing only occurs when the offset is greater than ~18k, the only place that we currently need to worry about this is "all crates". There isn't currently any filter you can apply that will return enough rows for this to be a problem (`?letter=` is the most likely, but honestly that is probably the least useful page/endpoint we have and it might just be worth removing it. Either way AFAIK it's not hit by crawlers and humans aren't visiting page 900 in their browsers, so even if we do end up with enough crates for the offset to possibly be too high, it's unlikely anyone will ever hit it there). I'm not entirely sure how I want to structure the final logic for "give me the next/previous page". I've just stuck it at the bottom of the search function for now. We'll want something more structured in the long term, but I'd like to see what this looks like with the cursor based option before pulling it out into a more general abstraction.
☀️ Test successful - checks-travis |
This continues the changes made in rust-lang#1763. Since that PR was merged, one of the non-code steps has been taken care of -- All users hitting any endpoint with `?page=20` (which is an arbitrary search pattern that seemed high enough to give any crawlers going through pagination) have been contacted about the change, with a PR opened against any that included a repo. (Intersting aside, there are *zero* records of this for any endpoint other than search, which perhaps implies we can get rid of a few of these endpoints, but that's a separate discussion). This PR does not change any functionality, but moves some code around to better encapsulate things for upcoming changes. Specifically: - Change our frontend to show "next/prev page" links on the all crates page - Stop returning the "total" meta item when the next/prev page links will be cursor based (which I'd actually just like to start omitting in general) The main goal of this change was to stop having any code outside of `Paginated` (which has been renamed to `PaginatedQuery`, as there's a new type called `Paginated`) care about how pagination occurs. This means other code can't care about *how* pagination happens (with the exception of `reverse_dependencies`, which uses raw SQL, and sorta has to... That was a bit of a rabbit hole, see diesel-rs/diesel#2150 for details. Given the low traffic to that endpoint, I think we can safely ignore it). The distribution of responsibilities is as follows: - `PaginatedQuery<T>`: Given the query params, decides how to paginate things, generates appropriate SQL, loads a `Paginated<T>`. - `Paginated<T>`: Handles providing an iterator to the records, getting the total count (to be removed in the near future), providing the next/prev page params - `Request`: Takes the pagination related query params, turns that into an actual URL (note: Due to jankiness in our router, this will only ever be a query string, we have no way of getting the actual path) The next step from here is to change our UI to stop showing page numbers, and then remove the `total` field. This PR will introduce a performance regression that was addressed by rust-lang#1668. That PR was addressing "this will become a problem in a future", not "this is about to take the site down". Given the intent to remove the `total` field entirely, I think it is fine to cause this regression in the short term. If we aren't confident that the changes to remove this field will land in the near future, or want to be conservative about this, I can add some additional complexity/git churn to retain the previous performance characteristics
Refactor pagination to prepare for cursor-based pagination. This continues the changes made in #1763. Since that PR was merged, one of the non-code steps has been taken care of -- All users hitting any endpoint with `?page=20` (which is an arbitrary search pattern that seemed high enough to give any crawlers going through pagination) have been contacted about the change, with a PR opened against any that included a repo. (Intersting aside, there are *zero* records of this for any endpoint other than search, which perhaps implies we can get rid of a few of these endpoints, but that's a separate discussion). This PR does not change any functionality, but moves some code around to better encapsulate things for upcoming changes. Specifically: - Change our frontend to show "next/prev page" links on the all crates page - Stop returning the "total" meta item when the next/prev page links will be cursor based (which I'd actually just like to start omitting in general) The main goal of this change was to stop having any code outside of `Paginated` (which has been renamed to `PaginatedQuery`, as there's a new type called `Paginated`) care about how pagination occurs. This means other code can't care about *how* pagination happens (with the exception of `reverse_dependencies`, which uses raw SQL, and sorta has to... That was a bit of a rabbit hole, see diesel-rs/diesel#2150 for details. Given the low traffic to that endpoint, I think we can safely ignore it). The distribution of responsibilities is as follows: - `PaginatedQuery<T>`: Given the query params, decides how to paginate things, generates appropriate SQL, loads a `Paginated<T>`. - `Paginated<T>`: Handles providing an iterator to the records, getting the total count (to be removed in the near future), providing the next/prev page params - `Request`: Takes the pagination related query params, turns that into an actual URL (note: Due to jankiness in our router, this will only ever be a query string, we have no way of getting the actual path) The next step from here is to change our UI to stop showing page numbers, and then remove the `total` field. This PR will introduce a performance regression that was addressed by #1668. That PR was addressing "this will become a problem in a future", not "this is about to take the site down". Given the intent to remove the `total` field entirely, I think it is fine to cause this regression in the short term. If we aren't confident that the changes to remove this field will land in the near future, or want to be conservative about this, I can add some additional complexity/git churn to retain the previous performance characteristics
As discussed in the May 30th team meeting
(https://discordapp.com/channels/442252698964721669/448525639469891595/583749739166695444),
this is the first step in deprecating/removing numbered pagination for
the "all crates" endpoint. The remaining steps are:
identify them, have them start following this link
page
will be cursor based
offsets
Since the problem we're addressing only occurs when the offset is
greater than ~18k, the only place that we currently need to worry about
this is "all crates". There isn't currently any filter you can apply
that will return enough rows for this to be a problem (
?letter=
isthe most likely, but honestly that is probably the least useful
page/endpoint we have and it might just be worth removing it. Either way
AFAIK it's not hit by crawlers and humans aren't visiting page 900 in
their browsers, so even if we do end up with enough crates for the
offset to possibly be too high, it's unlikely anyone will ever hit it
there).
I'm not entirely sure how I want to structure the final logic for "give
me the next/previous page". I've just stuck it at the bottom of the
search function for now. We'll want something more structured in the
long term, but I'd like to see what this looks like with the cursor
based option before pulling it out into a more general abstraction.