Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dynamodb] Implement stream() on PaginatedQueryList #1528

Closed
woldie opened this issue Mar 29, 2018 · 2 comments
Closed

[dynamodb] Implement stream() on PaginatedQueryList #1528

woldie opened this issue Mar 29, 2018 · 2 comments
Labels
feature-request A feature should be added or improved.

Comments

@woldie
Copy link

woldie commented Mar 29, 2018

PaginatedQueryList inherits the default Collection#stream implementation. The trouble with the default Collection#stream implementation is that it calls PaginatedQueryList#size, forcing a call to PaginatedList#loadAllResults under the covers. This may get your users into serious RCU-consuming trouble especially when they've also done something foolish like set a limit(1) on their DynamoDBQueryExpression. (I would never do such a silly thing, I always read your JavaDocs and understand your API's fully before calling them. ;) )

I cock an eyebrow when I see that PaginatedQueryList implements java.util.List. You're just inviting bad user behavior given that Collection#size may be called at unexpected times, as in the case of default java.util.stream.Stream implementations. PaginatedQueryList should work more like Iterator or java.sql.ResultSet, IMHO. Too many opportunities for leaky abstractions with List. Maybe on version 2 of the API you could change this someday.

Both of my eyebrows go up in shock and horror when I realize that DynamoDBQueryExpression#limit really only translates into server-side fetch size limit for pages, rather than a limit of total number of rows the overall query will fetch. (Okay, I get that "naming things" is one of the hardest things in Computer Science, so I cannot fault you completely on the name of DynamoDBQueryExpression#limit especially given that you fully document its semantics in the JavaDocs that I would have read and understood. Maybe it would be cool if you renamed DynamoDBQueryExpression#limit to something a little less generic like pageItemLimit or pageFetchLimit though?)

So with all of those grievances aired, I would like to recommend at least a partial safeguarding of PaginatedQueryList for users like me that want to blithely create Java 8 streams from a PaginatedQueryList. I would like you to override stream() on PaginatedQueryList to make its Streams more lazy. To do this, I believe you can create a Stream based off of Iterator and that Stream implementation will not call size() under the hood. Something like this could be added to PaginatedQueryList:

public Stream<T> stream() {
  return StreamSupport.stream(Spliterators.spliteratorUnknownSize(this.iterator(), Spliterator.NONNULL), false);
}

If I am right, this would be doing a kindness to your users so please consider adding it. 1.11.x seems to be built for Java 8, so there's no non-money-raking-related reason for you to not make this change, juuuuuuuuuust sayin' ... :/

@dagnir
Copy link
Contributor

dagnir commented Mar 30, 2018

Hi @woldie, thank you for the feedback! Unfortunately, since we target Java 6 for 1.11.x, we cannot implement any of the new Collections methods like stream() for our existing classes. We can't change the semantics of limit() since that would be a backwards compatibility concern for 1.11.x users. As for size() I don't think the network call is unavoidable given the interface contract either unfortuantely.

We are actively working on V2 of the SDK however, and we've recently rolled out a new auto pagination feature built on the Stream and Iterator interfaces that you might be interested in: https://github.com/aws/aws-sdk-java-v2/blob/8630137/CHANGELOG.md#aws-sdk-for-java-v2

You can checkout our test classes for some example usage:

PaginatedQueryList is part of the Document API which is higher level than the DynamoDB Service client, and we're looking to rework that as part of V2 as well. I've capture a link to this issue in a tracking ticket for V2: aws/aws-sdk-java-v2#449

Going to go ahead and close this since we can't change the existing behavior/interfaces in v1. Please feel free to reopen if you have further questions/concerns.

@dagnir dagnir closed this as completed Mar 30, 2018
@dagnir dagnir added the feature-request A feature should be added or improved. label Mar 30, 2018
@woldie
Copy link
Author

woldie commented Apr 2, 2018

@dagnir Aw, darn that you have to be Java 6 backwards compatible. In the new API it looks like you're ticking all the boxes though: Iterable, Streams, and reactive extensions. Very cool! The class names are a little wordy, ListObjectsV2Iterable?? ;)

Anyhow, thanks for reading and responding, I hope my feedback helps you craft an AWSome version 2 in some way. (Ugh, terrible puns!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved.
Projects
None yet
Development

No branches or pull requests

2 participants