
Deal with potential cardinality estimate being negative and add logging to hash determine partitions phase #12443

Merged: 5 commits into apache:master on May 20, 2022

Conversation

@loquisgon commented on Apr 16, 2022:

We have seen rare instances in the wild where, during hash partitioning, the determine-cardinality phase produces a single shard with a large segment, without regard to the value of maxRowsPerSegment. Attempts to reproduce this have not been successful, so adding some defensive programming and logging would be helpful in case it happens again. This PR deals with the improbable case where the estimate from the Union HLLSketch used in the code is negative. It also adds logging to report the values used to come up with the final determination.

This PR has:

  • [x] been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • [x] added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • [x] been tested in a test Druid cluster.

@somu-imply (Contributor) left a comment:

Overall LGTM. Minor nits.

// I don't think we can use a negative estimate in any way; seven sounds like a nice prime number.
// It is ok if we end up not filling them all; the ingestion code handles that.
// Seven will at least create some shards rather than potentially a single huge one.
estimatedNumShards = 7L;
A contributor commented:

Instead of setting 7 here, we should move it to a static final variable up top, something like DEFAULT_NUM_SHARDS.
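
A minimal Java sketch of the suggested refactor (the constant name follows the suggestion above; the surrounding method and variable names are assumptions, not the merged code):

// Sketch of the suggested refactor: lift the magic number into a named
// constant so the fallback policy is documented in one place.
private static final long DEFAULT_NUM_SHARDS = 7L;

static long resolveNumShards(double estimatedCardinality, long estimatedNumShards)
{
  if (estimatedCardinality < 0) {
    // A negative estimate from the union sketch is unexpected; fall back to
    // a small fixed shard count rather than risk a single huge shard.
    return DEFAULT_NUM_SHARDS;
  }
  return estimatedNumShards;
}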

A contributor commented:

Just a question: why did we go with 7 and not, say, 3? Is there any rationale behind it?

A contributor commented:

Could also make this configurable with a default of 7.

@loquisgon (Author) commented on May 11, 2022:

My thinking is that three is too small, but it is all arbitrary. This is not really a default; it is just for the remote, never actually observed, case where the estimate is negative. So I would like to leave it as is.

A contributor commented:

But if we do get into this situation, should the fallback be configurable, so that if 7 results in a bad estimate / default value, the job can be rerun with a different value and potentially produce better results?

@loquisgon (Author) commented:

If we land here again, we should collect evidence and fix it for good.

@loquisgon (Author) commented:

To enforce collect-and-fix, we could also throw an ISE here so the context is reproducible... how does that sound instead of the guesstimate of seven shards? Rather than guesstimating, just throw an ISE and halt. That may be too harsh, though, so I think the warning is better: stop there and do not try to be more clever. Think of this as a fishing expedition for data, to see whether this was the original problem. There is no evidence, and those of us who have tried have not been able to reproduce the scenario.
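
For reference, the stricter fail-fast alternative floated here would look roughly like the following sketch, using Druid's org.apache.druid.java.util.common.ISE (illustrative only; this is not what the PR merged, and the message text is invented):

// Sketch of the fail-fast alternative discussed above (not merged):
// surface the bad estimate immediately instead of guessing a shard count.
if (estimatedCardinality < 0) {
  throw new ISE(
      "Estimated cardinality is negative [%f]; cannot determine numShards",
      estimatedCardinality
  );
}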

@zachjsh (Contributor) commented on May 12, 2022:

Hmm, yeah, failing the ingestion completely may be extreme. I think the best option is to let the user set a fallback numShards value for when it can't be computed / estimated properly. Maybe a negative value for the config could indicate that the ingestion should fail?

A contributor commented:

I realize that complicates things, so it's your choice whether to add it. It would be good to allow some control here, but it's not a blocker.
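
A rough Java sketch of the configurable fallback @zachjsh proposes above (hypothetical: fallbackNumShards is an invented tuning parameter, and no such config was added by this PR):

// Hypothetical sketch of the proposed configurable fallback (not merged).
// fallbackNumShards would come from the tuning config; a negative value
// would mean "fail the ingestion instead of guessing".
static long resolveNumShards(double estimatedCardinality, long estimatedNumShards, long fallbackNumShards)
{
  if (estimatedCardinality >= 0) {
    return estimatedNumShards;
  }
  if (fallbackNumShards < 0) {
    throw new ISE("Could not estimate cardinality and no fallback numShards is configured");
  }
  return fallbackNumShards;
}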

@loquisgon (Author) commented:

Put the magic number behind a label.


// This is for potential debugging in case we suspect bad estimation of cardinalities, etc.
LOG.debug("intervalToNumShards: %s", intervalToNumShards.toString());

A contributor commented:

Apart from putting it in the log, do we need a metric around it too?

A contributor commented:

How often would this log message be hit? We rarely turn on debug logging until after an issue is seen. If the logging here won't be too much, could we move it to info level?

@loquisgon (Author) commented:

A similar log message was there before and it was explicitly removed by a previous change.

A contributor commented:

A metric here could be good, as @somu-imply suggested.

@loquisgon (Author) commented:

I have instrumented the Tasks to make it easier to emit metrics... we can delay this PR until the other PR with the metrics instrumentation merges, and then I can easily add the metric.

// Determine numShards based on maxRowsPerSegment and the cardinality.
estimatedNumShards = Math.round(estimatedCardinality / maxRowsPerSegment);
}
LOG.debug(
    "estimatedNumShards %d given estimated cardinality %.2f and maxRowsPerSegment %d",
    estimatedNumShards, estimatedCardinality, maxRowsPerSegment
);
A contributor commented:

This seems like a useful log; if it's not logged too often, maybe we can make it info level?

@loquisgon (Author) commented on May 11, 2022:

Hmm, the goal of this was to do something about the bad estimate, not to improve logging. Let me think...

@loquisgon (Author) commented:

How often it is logged depends on how many shards were created per interval. This code will union all the independent shards created by the parallel sub-tasks in order to create the final shard. I am being cautious again because such logging was previously considered excessive.
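
For context, the union the author describes is the kind of merge the Apache DataSketches HLL API performs across per-subtask sketches. A minimal Java illustration (not the actual Druid code; the list of sub-task sketches and the lgMaxK value are assumptions):

import java.util.List;
import org.apache.datasketches.hll.HllSketch;
import org.apache.datasketches.hll.Union;

// Minimal illustration: merge per-subtask HLL sketches into one estimate.
static double unionEstimate(List<HllSketch> subTaskSketches)
{
  final Union union = new Union(12); // lgMaxK = 12 is an arbitrary choice here
  for (HllSketch sketch : subTaskSketches) {
    union.update(sketch);
  }
  return union.getResult().getEstimate();
}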

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made it info based on code review...I agree it is useful to have.

A contributor commented:

If the logging was deemed too excessive before and explicitly removed, then maybe that's a good reason to keep it at debug. I'm not sure of the details, though; it's still not clear to me how often this log would be written.

@loquisgon (Author) commented:

It is not easy to know how frequently it would be logged without coming up with some data models and retrieving some data to better understand the real distributions of hash buckets in a given time chunk for a given set of dimensions. This is one of those things where experience and/or experimentation teaches you better, IMO. So turning it on again and watching it seems like the right thing to do this time.
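
Putting the thread together, a simplified, self-contained Java sketch of the shard-count determination this PR hardens (names approximate the quoted diff hunks; the Math.max floor at the end is an assumption added for illustration, not confirmed from the merged code):

// Simplified sketch of the hardened shard-count determination (approximate).
private static final long DEFAULT_NUM_SHARDS = 7L;

static long estimateNumShards(double estimatedCardinality, int maxRowsPerSegment)
{
  final long estimatedNumShards;
  if (estimatedCardinality < 0) {
    // Defensive guard added by this PR: a union HLL sketch should never
    // produce a negative estimate, but if it does, avoid a single huge shard.
    estimatedNumShards = DEFAULT_NUM_SHARDS;
  } else {
    // Determine numShards based on maxRowsPerSegment and the cardinality.
    estimatedNumShards = Math.round(estimatedCardinality / maxRowsPerSegment);
  }
  // Ensure at least one shard when cardinality is far below maxRowsPerSegment.
  return Math.max(estimatedNumShards, 1);
}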

@zachjsh (Contributor) left a comment:

LGTM

@somu-imply (Contributor) left a comment:

LGTM

@loquisgon merged commit c236227 into apache:master on May 20, 2022.
@abhishekagarwal87 added this to the 24.0.0 milestone on Aug 26, 2022.