dynamic chunk sizing for v4 raw forward index #12945
Conversation
Codecov Report
Attention: Patch coverage is

@@             Coverage Diff              @@
##            master    #12945     +/-   ##
============================================
+ Coverage     61.75%    62.16%    +0.41%
+ Complexity      207       198        -9
============================================
  Files          2436      2504       +68
  Lines        133233    136710     +3477
  Branches      20636     21187      +551
============================================
+ Hits          82274     84991     +2717
- Misses        44911     45416      +505
- Partials       6048      6303      +255
@@ -38,6 +38,7 @@
 */
public class SingleValueVarByteRawIndexCreator implements ForwardIndexCreator {
  private static final int DEFAULT_NUM_DOCS_PER_CHUNK = 1000;
  private static final int TARGET_MIN_CHUNK_SIZE = 4 * 1024;
This lower bound is debatable. 4KB is what we tested with, and it errs on the side of minimal memory usage; but since it's an uncompressed target size, the compressed chunk could fall below the disk read-ahead value on many systems.
I don't think we need a lower bound for the chunk size. We can probably simply do
I had the same thought, but we ran into two issues that blocked segment build: maxLength can be 0, and int overflow for large maxLength. Setting a minimum size seemed like a good way to catch both cases.
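The two failure modes mentioned here can be illustrated with a small standalone sketch (the numbers are hypothetical, chosen only to trigger each case):

```java
public class ChunkSizeEdgeCases {
  public static void main(String[] args) {
    int targetDocsPerChunk = 1000;

    // Case 1: maxLength can be 0 (e.g. a column whose values are all empty),
    // which would produce a 0-byte chunk and break the segment build.
    int maxLength = 0;
    System.out.println(maxLength * targetDocsPerChunk); // 0

    // Case 2: int overflow for large maxLength. 3MB * 1000 docs exceeds
    // Integer.MAX_VALUE and wraps negative unless computed in long arithmetic.
    maxLength = 3 * 1024 * 1024;
    System.out.println(maxLength * targetDocsPerChunk);        // negative due to overflow
    System.out.println((long) maxLength * targetDocsPerChunk); // 3145728000
  }
}
```

A minimum chunk size clamps both cases to a sane value instead of requiring separate guards.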
Makes a lot of sense. I added a config. This config can also apply to the V2/V3 format with
Nice improvement!
Done. Updated the description with the new configs. Lmk if there's any concern with naming, I changed it slightly for consistency:
Posting the below just for reference:
I understand why this should be the case, but could not reproduce it in our prod env. I was eventually able to reproduce it in a microbenchmark, and it turned out the scans I initially tested ran against very low cardinality data. For higher cardinality data the microbenchmark showed up to 40% faster scans with a 1M vs 4K chunk size.
LGTM. Can you also help update the Pinot documentation for these 2 configs? Some examples will help users understand how to use them.
}
_targetMaxChunkSize =
    targetMaxChunkSize == null ? DataSizeUtils.fromBytes(DEFAULT_TARGET_MAX_CHUNK_SIZE) : targetMaxChunkSize;
_targetDocsPerChunk =
Suggest allowing a negative value for this config to turn it off and only honor the max chunk size.
*/
public static int getDynamicTargetChunkSize(int maxLength, int targetDocsPerChunk, int targetMaxChunkSizeBytes) {
  if (targetDocsPerChunk < 0 || (long) maxLength * targetDocsPerChunk > Integer.MAX_VALUE) {
    return targetMaxChunkSizeBytes;
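The snippet above is truncated. A hedged sketch of how such a heuristic could be completed, consistent with the `TARGET_MIN_CHUNK_SIZE = 4 * 1024` lower bound from the diff above (the class name and exact clamping order are assumptions, not the verbatim Pinot implementation):

```java
public final class ForwardIndexUtilsSketch {
  // Lower bound discussed in the review thread (4KB, uncompressed target size).
  private static final int TARGET_MIN_CHUNK_SIZE = 4 * 1024;

  /**
   * Sketch of the dynamic chunk-size heuristic: clamp maxLength * targetDocsPerChunk
   * into [TARGET_MIN_CHUNK_SIZE, targetMaxChunkSizeBytes]. A negative targetDocsPerChunk
   * (the "turn it off" suggestion) or int overflow falls back to the max chunk size.
   */
  public static int getDynamicTargetChunkSize(int maxLength, int targetDocsPerChunk,
      int targetMaxChunkSizeBytes) {
    if (targetDocsPerChunk < 0 || (long) maxLength * targetDocsPerChunk > Integer.MAX_VALUE) {
      return targetMaxChunkSizeBytes;
    }
    return Math.max(Math.min(maxLength * targetDocsPerChunk, targetMaxChunkSizeBytes),
        TARGET_MIN_CHUNK_SIZE);
  }

  public static void main(String[] args) {
    // maxLength == 0 would otherwise yield a 0-byte chunk; the min bound catches it.
    System.out.println(getDynamicTargetChunkSize(0, 1000, 1 << 20));       // 4096
    // Short values: 10 bytes * 1000 docs = 10000 bytes, within bounds.
    System.out.println(getDynamicTargetChunkSize(10, 1000, 1 << 20));      // 10000
    // Large values are capped at the max chunk size.
    System.out.println(getDynamicTargetChunkSize(1 << 20, 1000, 1 << 20)); // 1048576
    // Negative targetDocsPerChunk only honors the max chunk size.
    System.out.println(getDynamicTargetChunkSize(10, -1, 1 << 20));        // 1048576
  }
}
```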
Maybe also put a lower bound to this?
    "targetMaxChunkSize should only be used when deriveNumDocsPerChunk is true or rawIndexWriterVersion is 4");
}
_targetMaxChunkSize =
    targetMaxChunkSize == null ? DataSizeUtils.fromBytes(DEFAULT_TARGET_MAX_CHUNK_SIZE) : targetMaxChunkSize;
(minor) Make a constant for DataSizeUtils.fromBytes(DEFAULT_TARGET_MAX_CHUNK_SIZE) (maybe having both DEFAULT_TARGET_MAX_CHUNK_SIZE and DEFAULT_TARGET_MAX_CHUNK_SIZE_BYTES).
public static final ForwardIndexConfig DEFAULT = new Builder().build();

@Nullable
private final CompressionCodec _compressionCodec;
private final boolean _deriveNumDocsPerChunk;
private final int _rawIndexWriterVersion;
private final String _targetMaxChunkSize;
Consider parsing _targetMaxChunkSizeBytes upfront to avoid illegal size.
// For columns with very small max value, target chunk size should also be capped to reduce memory during read
int dynamicTargetChunkSize =
    ForwardIndexUtils.getDynamicTargetChunkSize(maxLength, targetDocsPerChunk, targetMaxChunkSizeBytes);
Should this method take numDocsPerChunk instead of targetDocsPerChunk here?
Or we can check deriveNumDocsPerChunk: if it's true we also derive dynamicTargetChunkSize, otherwise use targetMaxChunkSizeBytes instead?
I think it is correct. If not configured, targetDocsPerChunk should be 1000 by default.
Made a small cleanup PR #13093 to clarify the logic a little bit.
That is clearer 🙂
Background
The V2/V3 formats size chunks as numDocPerChunk * lengthOfLongestEntry, which was very efficient for near-constant length/short data (chunks on the order of 4KB). The V4 format uses a 1MB hardcoded target chunk size.
Problem
For V4 (when deriveNumDocsPerChunk is not set), the breakeven point assumes the lengthOfLongestEntry of a column in a segment is ~1KB. We have seen this behavior first hand after making V4 default internally. We have many columns for which we do not know if they will contain variable length data or 'short data', and it's desirable to handle both cases with a single format.
Change
This PR introduces dynamic chunk sizing for the V4 format. The target chunk size is calculated from a heuristic over the newly introduced configs, with a fixed lower bound of TARGET_MIN_CHUNK_SIZE = 4K.
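As a worked illustration of the clamping behavior described above (assuming a 1MB target max, the 4K minimum, and the default of 1000 docs per chunk; the exact heuristic in the PR may differ in detail):

```java
public class ChunkSizeExamples {
  public static void main(String[] args) {
    int min = 4 * 1024;   // TARGET_MIN_CHUNK_SIZE
    int max = 1 << 20;    // assumed target max chunk size (1MB)
    int docs = 1000;      // assumed target docs per chunk

    // Clamp maxLength * docs into [min, max] for a few sample column widths.
    int[] maxLengths = {2, 100, 5000};
    for (int len : maxLengths) {
      int size = Math.max(Math.min(len * docs, max), min);
      System.out.println(len + " bytes/entry -> " + size + " byte chunk");
    }
    // 2    -> 4096    (tiny values floor at the 4K minimum)
    // 100  -> 100000  (mid-size values scale with entry length)
    // 5000 -> 1048576 (large values cap at the 1MB maximum)
  }
}
```

Small-value columns thus stay near V2/V3-style small chunks, while long-value columns keep V4's large-chunk behavior.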
In testing I’ve found this reduces direct memory spikes, especially against wide tables/high QPS. The graph below shows the improvement in direct memory spikes for an env where the majority of tables use a 3-7 day TTL with adhoc QPS. Some spikes are still present because not all segments built with the old static chunk size have expired (some 30 day TTL tables exist).
I think dynamic chunk sizing should be the default implementation for V4, so I have not put this behind a config. It bridges the gap between the variable-length data behavior of V4 and the 'short data' behavior of V2/V3.
There are no backward compatibility concerns with this PR.
tags:
performance