Paginate methods that return iterators #196
Some thoughts regarding the API.

Current state

Currently (1.8), Reader methods that return multiple things return a plain iterator:

    def get_things(
        ...
    ) -> Iterator[Thing]: ...

There is no way to start this iterator from the middle.

The underlying (private) Storage API already supports pagination; it looks something like:

    def get_things_page(
        ...,
        chunk_size: Optional[int] = None,
        last: Optional[T] = None
    ) -> Iterable[Tuple[Thing, T]]: ...

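To make the "resume from `last`" idea concrete, here is a minimal sketch of a keyset (scrolling window) query using the standard library `sqlite3` module. The `things` table, its columns, and the explicit `db` argument are assumptions made for illustration; they are not the actual Storage schema or signature.

```python
import sqlite3
from typing import List, Optional, Tuple


def get_things_page(
    db: sqlite3.Connection,
    chunk_size: Optional[int] = None,
    last: Optional[int] = None,
) -> List[Tuple[str, int]]:
    """Return one page of (thing, last) pairs, ordered by id.

    `last` is the id of the final row of the previous page; the WHERE
    clause resumes right after it instead of skipping rows with OFFSET.
    """
    query = "SELECT name, id FROM things"
    params: list = []
    if last is not None:
        query += " WHERE id > ?"
        params.append(last)
    query += " ORDER BY id"
    if chunk_size is not None:
        query += " LIMIT ?"
        params.append(chunk_size)
    return db.execute(query, params).fetchall()


# Hypothetical data, only to exercise the function.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE things (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany(
    "INSERT INTO things (name) VALUES (?)",
    [("a",), ("b",), ("c",), ("d",), ("e",)],
)

first = get_things_page(db, chunk_size=2)
# Pass the `last` value of the final pair to get the following page.
second = get_things_page(db, chunk_size=2, last=first[-1][1])
```

Because each page is located through the primary key, the cost of fetching a page does not grow with how far into the result set it is, which is the usual argument against limit+offset.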
The current implementation uses these with scrolling window queries. The alternative, limit+offset queries, tends to cause performance issues, so the API should not be based on them. Note that
Other implementations that support MVCC might not do this;

Possible APIs

New Reader methods

The simplest thing that would work is to expose
Augmented iterable

Alternatively, we could make

This model is somewhat similar to Stripe API pagination: we already have "auto-pagination" (the iterator gets the next page automatically); we need to expose normal pagination.

Object id (primary key)

An interesting thing Stripe does is that the type of its

The only downside is that we need to do a bit of additional work to get the extra attributes. It would be nice to have a uniform way of getting the value of
However, we can add a uniform way of getting

Also, we could postpone adding pagination for feed metadata and tags until 2.0, when we can just change the return types.

For search_entries(),

The type of last

In Python-land, our current use of a
Signed bytes / str

A possible solution would be to use ItsDangerous to serialize

Note that by default ItsDangerous only supports JSON-serializable types. Flask's TaggedJSONSerializer allows serializing additional types like tuple and datetime; if we don't want to depend on / vendor that, we could write our own (prototype).

Object id (primary key)

Using the object id as
Argument names

I am not sure

Looking at some other things for inspiration:
I think limit + starting_after is the best combination. This way, we can add ending_before later, and use cursor/marker for the opaque string if it's useful.

Storage private pagination

How does this interact with the (private) pagination of the current SQLite Storage implementation?

The reasons for internal pagination remain valid: avoid locking the database, avoid consuming too much memory (#167); paginated calls should still use the internal pagination.

Some optimizations we can do later:
Initial implementation notes

What do we implement?

Pagination for get_feeds(), get_entries(), and search_entries().
A property returning the object id for the corresponding type; #159 has a discussion on naming that may be relevant.
We can do tags and metadata later, both because of the issues mentioned above, and because they're less expensive to call as-is (they are likely fewer and have less data).

What changes are required to support it?

Storage methods to go from "object id" to
Whatever changes are needed for the optimizations described in Storage private pagination.

How do we test it?

Individual tests for limit and starting_after.
Parametrized tests; some options (not mutually exclusive):
To do:
We're not gonna do the parametrized tests at the moment; test_reader.py isn't organized enough.
Requested in #192.
The underlying storage API is already paginated; we just need to find a nice way of exposing it.
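One way to picture "exposing it" is a public method that takes limit and starting_after and keeps pulling storage pages internally. The sketch below is only an illustration under stated assumptions: `THINGS`, `_get_things_page()`, and `get_things()` are hypothetical stand-ins, the object id is simply the thing itself, and none of this is the actual Reader or Storage code.

```python
from itertools import islice
from typing import Iterator, List, Optional, Tuple

# Hypothetical stand-in for stored data.
THINGS = [f"thing-{i}" for i in range(1, 8)]


def _get_things_page(
    chunk_size: int, last: Optional[int] = None
) -> List[Tuple[str, int]]:
    """Stand-in for the private Storage method: one page of (thing, last) pairs."""
    start = 0 if last is None else last
    stop = min(start + chunk_size, len(THINGS))
    return [(THINGS[i], i + 1) for i in range(start, stop)]


def get_things(
    limit: Optional[int] = None,
    starting_after: Optional[str] = None,
    chunk_size: int = 3,
) -> Iterator[str]:
    """Public method: plain iterator, optionally limited and resumable."""
    # Translate the public object id into the storage-level `last` value.
    last = None if starting_after is None else THINGS.index(starting_after) + 1

    def auto_paginate(last: Optional[int]) -> Iterator[str]:
        # Internal pagination stays in place: fetch small chunks until empty.
        while True:
            page = _get_things_page(chunk_size, last)
            if not page:
                return
            for thing, last in page:
                yield thing

    # islice() with limit=None yields everything, so the old behavior is kept.
    return islice(auto_paginate(last), limit)


# Manual pagination: feed the last object of one page into the next call.
page_one = list(get_things(limit=2))
page_two = list(get_things(limit=2, starting_after=page_one[-1]))
```

Calling `get_things()` with no arguments still behaves like the current auto-paginating iterator, so existing callers would not need to change.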