Improving optimizer performance by eliminating unnecessary sort and distribution passes, add more SymmetricHashJoin improvements #5754
Merged: alamb merged 28 commits into apache:main from synnada-ai:performance/remove-enforcesorting on Apr 4, 2023
28 commits
b25bc46 Increase optimizer performance (metesynnada)
b5eb32b Config added. (metesynnada)
575a6d7 Simplifications and comment improvements (ozankabak)
05f768c More simplifications (ozankabak)
90d82df Revamping tests for unbounded-unbounded cases. (metesynnada)
8886457 Review code (metesynnada)
36d450e Move SHJ suitability from PipelineFixer to PipelineChecker, further S… (ozankabak)
5df5e05 Merge branch 'main' into performance/remove-enforcesorting (ozankabak)
119a870 Added logging on tests and ensure timeout (metesynnada)
a83a284 Robust fifo writing in case of slow executions (metesynnada)
8799000 Update fifo.rs (metesynnada)
02bd036 Update fifo.rs (metesynnada)
05497ef Update fifo.rs (metesynnada)
311a891 Update fifo.rs (metesynnada)
b051efb Get rid of locks (metesynnada)
c0ecbe4 Try exact one batch size (metesynnada)
e6ab621 Update fifo.rs (metesynnada)
7cd6b1e Update fifo.rs (metesynnada)
9fa0c71 Update fifo.rs (metesynnada)
0a2d35f Merge branch 'main' into performance/remove-enforcesorting (metesynnada)
c9bece5 Ignore FIFO test (metesynnada)
417e2f1 Merge branch 'main' into performance/remove-enforcesorting (metesynnada)
331851b Update config.rs (metesynnada)
f962643 Config update (metesynnada)
2c15d4e Update config.rs (metesynnada)
363960f Update configs.md (metesynnada)
61c3434 Update config (metesynnada)
7d32c12 Update symmetric_hash_join.rs (metesynnada)
@@ -1293,9 +1293,6 @@ impl SessionState {
     // repartitioning and local sorting steps to meet distribution and ordering requirements.
     // Therefore, it should run before EnforceDistribution and EnforceSorting.
     Arc::new(JoinSelection::new()),
-    // Enforce sort before PipelineFixer
-    Arc::new(EnforceDistribution::new()),
-    Arc::new(EnforceSorting::new()),
     // If the query is processing infinite inputs, the PipelineFixer rule applies the
     // necessary transformations to make the query runnable (if it is not already runnable).
     // If the query can not be made runnable, the rule emits an error with a diagnostic message.

Review comment on this hunk: 👍
I don't understand how a symmetric hash join could generate correct results when the inputs don't have any ordering 🤔 Maybe we can add some additional comments about the circumstances under which one would enable or disable this option.
SHJ will always produce correct results, but without pruning it will use twice as much memory as a regular hash join (assuming inputs are of the same size), since it buffers both sides, for no gain except pipelining.
Some more explanation about this option: it is not always possible to detect with 100% accuracy whether pruning may occur. The system may conclude that pruning is not possible when it actually is. Therefore, one would enable this option if they have a priori knowledge that the data does indeed lend itself to pruning.
Thank you -- this explanation and the updated comments help to clarify
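To illustrate the point made in this thread, here is a minimal sketch of a symmetric hash join over in-memory rows. It is a toy model, not DataFusion's `SymmetricHashJoinExec`: the names `ToySymmetricHashJoin`, `Side`, `push`, `buffered_rows`, and `join_stream` are all hypothetical, and no pruning is implemented. Each arriving row is stored in its own side's hash table and probed against the opposite table, so every matching pair is emitted exactly once regardless of arrival order. Because nothing is ever pruned, both inputs stay fully buffered.

```rust
use std::collections::HashMap;

/// Which input a row arrived from.
#[derive(Clone, Copy)]
enum Side {
    Left,
    Right,
}

/// Toy symmetric hash join on an integer key.
/// Illustrative only: it keeps every row forever (no pruning).
#[derive(Default)]
struct ToySymmetricHashJoin {
    left: HashMap<i64, Vec<String>>,
    right: HashMap<i64, Vec<String>>,
}

impl ToySymmetricHashJoin {
    /// Store the row in its own side's table, then probe the opposite table.
    /// Returns one (key, left_value, right_value) tuple per match.
    fn push(&mut self, side: Side, key: i64, value: &str) -> Vec<(i64, String, String)> {
        let (own, other) = match side {
            Side::Left => (&mut self.left, &self.right),
            Side::Right => (&mut self.right, &self.left),
        };
        own.entry(key).or_default().push(value.to_string());
        other
            .get(&key)
            .into_iter()
            .flatten()
            .map(|m| match side {
                Side::Left => (key, value.to_string(), m.clone()),
                Side::Right => (key, m.clone(), value.to_string()),
            })
            .collect()
    }

    /// Total number of rows held in both hash tables.
    fn buffered_rows(&self) -> usize {
        self.left
            .values()
            .chain(self.right.values())
            .map(Vec::len)
            .sum()
    }
}

/// Feed an interleaved stream of rows through the join.
/// Returns the sorted join output and the number of rows still buffered.
fn join_stream(events: &[(Side, i64, &str)]) -> (Vec<(i64, String, String)>, usize) {
    let mut join = ToySymmetricHashJoin::default();
    let mut out = Vec::new();
    for &(side, key, value) in events {
        out.extend(join.push(side, key, value));
    }
    out.sort();
    (out, join.buffered_rows())
}
```

Feeding the same rows to `join_stream` in two different arrival orders yields the same sorted output, and in both cases every input row remains buffered at the end. That is the memory cost the reply describes when the inputs offer no ordering to prune on.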