Deal with potential cardinality estimate being negative and add logging to hash determine partitions phase #12443
Conversation
Overall LGTM. Minor nits.
// I don't think we can use the estimate in any way if it is negative; seven sounds like a nice prime number.
// It is ok if we end up not filling them all, the ingestion code handles that.
// Seven, on the other hand, will at least create some shards rather than potentially a single huge one.
estimatedNumShards = 7L;
Instead of setting 7 inline here, we should move it to a static final variable up top, something like DEFAULT_NUM_SHARDS.
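A minimal sketch of that suggestion; the class and constant names below are illustrative, not the actual code in this PR:

class HashPartitionSketch
{
  // Hypothetical constant per the review suggestion; a named value makes the intent of "7" explicit.
  private static final long DEFAULT_NUM_SHARDS = 7L;

  static long estimateNumShards(double estimatedCardinality, int maxRowsPerSegment)
  {
    if (estimatedCardinality < 0) {
      // A negative estimate cannot be used, so fall back to the named default.
      return DEFAULT_NUM_SHARDS;
    }
    // Clamp to at least one shard; the clamp is part of this sketch, not necessarily the PR.
    return Math.max(1L, Math.round(estimatedCardinality / maxRowsPerSegment));
  }
}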
Just a question: why did we go with 7 and not, say, 3? Any rationale behind it?
Could also make this configurable with a default of 7.
My thinking is that three is too small, but it is all arbitrary. This is not really a default; it only applies in the remote, never actually observed, case that the estimate is negative. So I would like to leave it as is.
But if we do get into this situation, should the fallback here be configurable, so that if 7 turns out to be a bad default, the job can be rerun with a different value and potentially produce better results?
If we fall into this case again we should collect evidence and fix it for good.
In order to enforce collect-and-fix we could also throw an ISE here so the failing context is preserved. How does that sound instead of the guesstimate of seven shards? Rather than guesstimating, just throw an ISE and halt. That may be too harsh, so I think the warning is better, but stop there and not try to be more clever. Let's think of this as a sort of fishing expedition for data, to see if this was the original problem. There is no evidence, and those of us who have tried have not been able to reproduce the scenario.
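A helper-style sketch of the fail-fast alternative floated here, for comparison only; the helper name is made up and the PR itself keeps the warning-plus-fallback behavior:

static void failOnNegativeEstimate(double estimatedCardinality)
{
  if (estimatedCardinality < 0) {
    // Throwing preserves the failing context for investigation instead of guessing a shard count.
    throw new IllegalStateException(
        "Negative cardinality estimate [" + estimatedCardinality + "]; cannot determine numShards");
  }
}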
Hmm, yeah, failing the ingestion completely may be extreme. I think the best option is to allow the user to set a fallback numShards value for when it can't be computed / estimated properly. Maybe a negative value for the config could indicate that the ingestion should fail?
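A minimal sketch of that combination of configurable fallback and fail-fast; the fallbackNumShards parameter is hypothetical and no such config exists in this PR:

// Sketch only: fallbackNumShards would come from a hypothetical tuning config.
// A positive value is used when the estimate is unusable; a negative value means fail the ingestion.
static long resolveNumShards(double estimatedCardinality, int maxRowsPerSegment, long fallbackNumShards)
{
  if (estimatedCardinality >= 0) {
    return Math.max(1L, Math.round(estimatedCardinality / maxRowsPerSegment));
  }
  if (fallbackNumShards < 0) {
    throw new IllegalStateException("Cardinality estimate was negative and no fallback numShards is configured");
  }
  return fallbackNumShards;
}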
I realize that complicates things, so it's your choice whether to add it or not. It would be good to allow some control here, but it's not a blocker.
Put the magic number behind a label.
// This is for potential debugging in case we suspect bad estimation of cardinalities etc.
LOG.debug("intervalToNumShards: %s", intervalToNumShards.toString());
Apart from putting it in the log, do we need a metric around it too?
How often would this log message be hit? We rarely turn on debug logging until after an issue is seen. If the logging here won't be too much, could we move it to info level?
A similar log message was there before and it was explicitly removed by a previous change.
a metric here could be good, as @somu-imply suggested
I have instrumented the Tasks to make it easier to emit metrics... we can delay this PR until the other PR with the metrics instrumentation merges, and then I can easily add the metric.
// determine numShards based on maxRowsPerSegment and the cardinality
estimatedNumShards = Math.round(estimatedCardinality / maxRowsPerSegment);
}
LOG.debug("estimatedNumShards %d given estimated cardinality %.2f and maxRowsPerSegment %d",
This seems like a useful log; if it's not logged too often, maybe we can make it info level?
Mmm, the goal of this was to do something about the bad estimate, not better logging. Let me think...
How many times depends on how many shards were created per interval. This code will union all the independent shards created by the parallel sub-tasks in order to create the final shards. I am being cautious again because such logging was previously considered excessive.
I made it info based on code review...I agree it is useful to have.
If the logging was deemed too excessive before and explicitly removed, then maybe that's a good reason to keep it at debug. I'm not sure of the details though. It's still not clear to me how often this log would be written.
It is not easy to know how frequently it would be logged without coming up with some data models and retrieving some data to better understand the real distribution of hash buckets in a given time chunk for a given set of dimensions. This is one of those things where experience and/or experimentation teaches you best, IMO. So turning it on again and watching it seems like the right thing to do this time.
LGTM
LGTM
We have seen rare instances in the wild where, during hash partitioning, the determine-cardinality phase produces a single shard with a large segment without regard to the value of maxRowsPerSegment. Attempts to reproduce this have not been successful, so adding some defensive programming and logging in case it happens again would be helpful. This PR deals with the improbable case where the estimate from a Union HLLSketch used in the code comes back negative. It also adds some logging to report the values used to come up with the final determination.

This PR has: