-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DataSegment memory optimizations #5094
Conversation
…trics lists as a whole) more aggressively; use ArrayMap instead of default LinkedHashMap for DataSegment.loadSpec, because they have only 3 entries on average; prune DataSegment.loadSpec on brokers
Is it expected that we have two almost identical classes called "DataSegmentTest" in different modules? |
If they are testing different things and can't be merged into single class, then its fine. Or else, please merge them.
Can we just drop them from announcements made from historicals/realtime-processes instead ? I think @nishantmonu51 already made a change at some point to optionally drop metric/dimension names. |
@himanshug could you point to a PR? |
@himanshug thanks. I think it's different because it doesn't affect the presence of the data on coordinator endpoints. And it's a more decoupled approach. |
There are endpoints at coordinator ( https://github.com/druid-io/druid/blob/master/server/src/main/java/io/druid/server/http/MetadataResource.java ) to get source of truth details of a Segment information from DB. In fact I would change even the default behavior to not publish dimension, metrics, loadSpecs information in announcements so that all users get that. does anyone else have an opinion ? @gianm @drcrallen @nishantmonu51 |
it was decided in dev-sync today to go with current state of this PR wherein historicals announce the loadSpec and they are dropped by broker only. |
…rename TestHelper.getJsonMapper() to makeJsonMapper()
@himanshug @gianm could you please take a look? |
@leventov i'll take a look now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM, couple minor comments.
return interner.intern(input); | ||
} | ||
}; | ||
private static final Interner<String> stringInterner = Interners.newWeakInterner(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well CAPITALIZE these if you are renaming them anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
private static final Interner<List<String>> dimensionsInterner = Interners.newWeakInterner(); | ||
private static final Interner<List<String>> metricsInterner = Interners.newWeakInterner(); | ||
private static final Function<String, String> internFun = Interners.asFunction(stringInterner); | ||
private static final Map<String, Object> PRUNED_LOAD_SPEC = ImmutableMap.of( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any APIs on the Broker that return DataSegment objects? If so, this needs a mention in release notes at least.
private static final Interner<String> stringInterner = Interners.newWeakInterner(); | ||
private static final Interner<List<String>> dimensionsInterner = Interners.newWeakInterner(); | ||
private static final Interner<List<String>> metricsInterner = Interners.newWeakInterner(); | ||
private static final Map<String, Object> PRUNED_LOAD_SPEC = ImmutableMap.of( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any APIs on the broker that return DataSegment objects? If so, this needs at least a mention on the release notes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would also suggest adding a note to the docs for any relevant methods, something like "This method does not return loadSpecs for segments; if you need those then you can use the Coordinator API XXX."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, as far as I can tell. It's exposed only in MetadataResource
and DatasourcesResource
, which are present only on Coordinator node.
@gianm do you have more comments on this PR? |
@leventov LGTM; sorry, am a few days behind on github email at the moment. |
DataSegment
contents (loadSpec's keys, dimensions and metrics lists as a whole) more aggressively;ArrayMap
instead of defaultLinkedHashMap
for DataSegment.loadSpec, because they have only 3 entries on average;