Docs: Add multi-dimension partitioning doc; refactor native batch and separate into smaller topics. #11983
Really happy to see this page getting split up. Some suggestions and comments, but all in all, nice one!
@@ -0,0 +1,341 @@
---
id: native-batch-firehose
Looks like the sidebar.json update is missing. (These new pages aren't in the left nav.)
For information general information on native batch indexing and parallel task indexing, see [Native batch ingestion](./native-batch.md).

> Firehose input has been deprecated. For information, see [Firehose](./native-batch-firehose.md).
Do we need to say this here? ("Firehose" doesn't otherwise appear in this page.)
docs/ingestion/native-batch.md
Outdated
When you use multi-dimension partitioning for your data, Druid is able to distribute segment sizes more evenly than with single dimension partitioning.

For segment pruning to be effective and translate into better query performance, you must the first of your `partitionDimensions` at query time. For example, given the following `partitionDimensions`:
I think a word(s) is missing here, but not sure what it should be: "...you must __ the first..."
Thanks for organizing the docs better, @techdocsmith!
Just one minor comment, otherwise LGTM.
"type": "json"
}
},
"tuningConfig" : {
I think this should contain a `partitionsSpec`, as some of these fields like `maxRowsPerSegment` are deprecated, as described in the `tuningConfig` table below.
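For illustration only, here is a minimal sketch of the shape this suggestion points toward. It assumes the example is a parallel task (`index_parallel`) using dynamic partitioning; the row limit is a placeholder value, not one taken from the PR. The point is that `maxRowsPerSegment` moves from the top level of `tuningConfig` into a nested `partitionsSpec`:

```json
{
  "tuningConfig": {
    "type": "index_parallel",
    "partitionsSpec": {
      "type": "dynamic",
      "maxRowsPerSegment": 5000000
    }
  }
}
```

If the example in the doc needs guaranteed rollup, a different `partitionsSpec` type would probably be the better fit; that choice is left to the author.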
Co-authored-by: sthetland <[email protected]>
LGTM 🚀
LGTM
Adds documentation for multi-dimension partitioning. cc: @kfaraz
Refactors the native batch partitioning topic as follows:
parallel-index
index
ioSource
This PR has: