-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove the config datafusion.execution.coalesce_target_batch_size and use datafusion.execution.batch_size instead #4757
Conversation
Hi @alamb, @andygrove, could you help review this PR? |
… use datafusion.execution.batch_size instead
0e86a8d
to
c37b6ad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change makes a lot of sense. Thank you @yahoNanJing
Maybe @Dandandan remembers why there were two different batch sizes
.try_into() | ||
.unwrap(), | ||
))); | ||
if config.coalesce_batches() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That reads much more nicely 👍
I plan to merge this later today unless there are any more comments. |
Since there's no opposite comments, I will merge this PR. |
Benchmark runs are scheduled for baseline = ac876db and contender = f7477dc. f7477dc is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #4756.
Rationale for this change
As #4756 mentioned, there's already a config of
datafusion.execution.batch_size
to indicate the row number of a batch for processing. I don't think it's necessary to make a different batch size for the CoalesceBatches rule.What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?