Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add IntGrouper to avoid unnecessary boxing/unboxing in array-based aggregation #4668

Merged
merged 4 commits into from
Aug 10, 2017

Conversation

jihoonson
Copy link
Contributor

@jihoonson jihoonson commented Aug 9, 2017

IIRC, this was discussed in #4576.


This change is Reviewable

@jihoonson jihoonson added this to the 0.11.0 milestone Aug 9, 2017
Copy link
Member

@leventov leventov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also benchmark results would be useful

/**
* {@link Grouper} specialized for the primitive int type
*/
public interface IntGrouper extends Grouper<Integer>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this interface have only one implementation, maybe get rid of it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking to add a new grouper which doing merge aggregation for sorted data. This interface will be worthwhile.

*/
public interface IntGrouper extends Grouper<Integer>
{
default AggregateResult aggregate(int key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that this default method is used only from other default method, suggested to inline it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is used in ArrayAggregateIterator as well.

}

@Override
default IntGrouperHashFunction hashFunction()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only implementation overrides this method, suggested to remove default implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Removed.

@jihoonson
Copy link
Contributor Author

@leventov sure, here it is. I added a new benchmark query because there wasn't a query having a single grouping key except simple series queries which cause a lot of GCs due to too large cardinality of the grouping column. Not dramatically fast, but looks worthwhile.

master

# Run complete. Total time: 00:01:03

Benchmark                                   (defaultStrategy)  (initialBuckets)  (numProcessingThreads)  (numSegments)  (queryGranularity)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt      Score     Error  Units
GroupByBenchmark.querySingleQueryableIndex                 v2                -1                       1              1                 all           1000000  basic.singleZipf  avgt   30  72984.855 ± 258.569  us/op

patch

Benchmark                                   (defaultStrategy)  (initialBuckets)  (numProcessingThreads)  (numSegments)  (queryGranularity)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt      Score     Error  Units
GroupByBenchmark.querySingleQueryableIndex                 v2                -1                       1              1                 all           1000000  basic.singleZipf  avgt   30  70874.918 ± 396.666  us/op

Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@gianm gianm merged commit 65c1d6c into apache:master Aug 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants