add `try_optimize()` for all rules. #4599

jackwener · 2022-12-13T10:17:10Z

Which issue does this PR close?

Closes #4598.
Followup #4208

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

alamb

Very nice -- thank you @jackwener 🏅

I wonder if, after this change, we should perhaps simply remove OptimizerRule::optimize to simplify the code 🤔

alamb · 2022-12-13T20:07:14Z

datafusion/optimizer/src/decorrelate_where_exists.rs

@@ -107,7 +111,7 @@ impl OptimizerRule for DecorrelateWhereExists {
                }

                // iterate through all exists clauses in predicate, turning each into a join
-                let mut cur_input = (**filter_input).clone();
+                let mut cur_input = filter_input.clone();


alamb · 2022-12-13T20:10:20Z

datafusion/optimizer/src/eliminate_limit.rs

@@ -40,27 +40,45 @@ impl OptimizerRule for EliminateLimit {
    fn optimize(
        &self,
        plan: &LogicalPlan,
-        optimizer_config: &mut OptimizerConfig,
+        _optimizer_config: &mut OptimizerConfig,


I find it strange to use a variable named _something -- I thought the Rust convention was to name variables starting with _ when they were not used 🤔

Resolved.
BTW, in the follow-up PR, I will unify the names so that all of them are optimizer_config where possible.
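As a small illustration of the naming convention discussed above, here is a minimal sketch. `OptimizerConfig` below is a hypothetical stand-in with a made-up `passes` field, not DataFusion's real type, and both functions are invented for the example.

```rust
// Hypothetical stand-in for the real config type (the field is invented).
struct OptimizerConfig {
    passes: u32,
}

// `_config` is never read; the leading underscore silences the
// unused-variable lint for this parameter.
fn ignores_config(plan: &str, _config: &mut OptimizerConfig) -> String {
    plan.to_string()
}

// `config` is read and mutated, so it keeps a plain name.
fn uses_config(plan: &str, config: &mut OptimizerConfig) -> String {
    config.passes += 1;
    format!("{plan} (pass {})", config.passes)
}

fn main() {
    let mut config = OptimizerConfig { passes: 0 };
    println!("{}", ignores_config("Limit", &mut config));
    println!("{}", uses_config("Limit", &mut config));
}
```

A parameter named with a leading underscore still compiles if it is read, which is why such a name can mislead a reader once the value is actually used.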

alamb · 2022-12-13T20:12:53Z

cc @andygrove

jackwener · 2022-12-14T04:13:57Z

I wonder if, after this change, we should perhaps simply remove OptimizerRule::optimize to simplify the code 🤔

I plan to do that in a follow-up PR.

jackwener · 2022-12-14T10:05:18Z

Follow-up PRs:

remove all optimize()
avoid requiring every rule to recurse into its children in the optimizer
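To make the second follow-up concrete, the idea can be sketched roughly as below. This is only an illustration under stated assumptions: `Plan`, `Rule`, `RemoveNoop`, and `optimize_recursively` are hypothetical names invented here, and the actual design in the follow-up PR may differ.

```rust
// Hypothetical, simplified plan tree; not DataFusion's LogicalPlan.
#[derive(Clone, Debug, PartialEq)]
struct Plan {
    name: String,
    inputs: Vec<Plan>,
}

// A rule inspects a single node only. Ok(None) means "no change".
trait Rule {
    fn try_optimize(&self, plan: &Plan) -> Result<Option<Plan>, String>;
}

// Example rule (invented): replace a single-input `Noop` node with its input.
struct RemoveNoop;

impl Rule for RemoveNoop {
    fn try_optimize(&self, plan: &Plan) -> Result<Option<Plan>, String> {
        if plan.name == "Noop" && plan.inputs.len() == 1 {
            Ok(Some(plan.inputs[0].clone()))
        } else {
            Ok(None)
        }
    }
}

// The driver owns the recursion: rewrite the current node first,
// then visit the inputs of the (possibly rewritten) node.
fn optimize_recursively(rule: &dyn Rule, plan: &Plan) -> Result<Plan, String> {
    let node = rule.try_optimize(plan)?.unwrap_or_else(|| plan.clone());
    let inputs = node
        .inputs
        .iter()
        .map(|input| optimize_recursively(rule, input))
        .collect::<Result<Vec<_>, _>>()?;
    Ok(Plan { name: node.name, inputs })
}

fn main() {
    let scan = Plan { name: "Scan".to_string(), inputs: vec![] };
    let noop = Plan { name: "Noop".to_string(), inputs: vec![scan] };
    let root = Plan { name: "Project".to_string(), inputs: vec![noop] };
    let optimized = optimize_recursively(&RemoveNoop, &root).unwrap();
    println!("{optimized:?}");
}
```

In this sketch the rule never calls itself on child plans; the single top-down traversal lives in one place, so each rule only has to describe a local rewrite.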

crepererum · 2022-12-14T10:31:51Z

datafusion/optimizer/src/type_coercion.rs

-        optimize_internal(&DFSchema::empty(), plan)
+        Ok(self
+            .try_optimize(plan, optimizer_config)?
+            .unwrap_or_else(|| plan.clone()))


I've stumbled upon this while working on #4614. I think this is wrong: the type coercion MUST NOT be skipped in case of an error, otherwise we may end up with non-executable plans.

This PR doesn't change any processing logic.
I am not sure what you mean by "skipped".
Currently try_optimize always returns Some, so maybe I can replace .unwrap_or_else(|| plan.clone()) with unwrap().

It will also be deleted in the next PR, so I don't think it matters.

I think @crepererum is saying that if the type coercion rules fail internally (return an Error) it is likely a serious bug in DataFusion and the plan will not work as expected.

By effectively ignoring the error here, the error will appear at some later stage (e.g. can't run the plan), making it harder to debug the source of the issue

🤔

I get it, there is some misunderstanding here.
I felt this comment wasn't related to this PR, so I was a little confused by it.

I don't ignore the error here.

self.try_optimize(plan, optimizer_config)?.unwrap_or_else(|| plan.clone())

The error will be returned by ?

unwrap_or_else is applied to the Option<>, not to the Result.
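The behaviour described above can be checked with a minimal, self-contained sketch. The `Plan` alias and both functions are hypothetical stand-ins that only mimic the shape of the wrapper in the diff; they are not DataFusion's types.

```rust
// Hypothetical stand-in: a plan is just a String, and so is an error.
type Plan = String;

// Mimics the `try_optimize` contract:
// Err = the rule failed, Ok(None) = no change, Ok(Some) = rewritten plan.
fn try_optimize(plan: &Plan) -> Result<Option<Plan>, String> {
    match plan.as_str() {
        "bad" => Err("type coercion failed".to_string()),
        "noop" => Ok(None),
        other => Ok(Some(format!("optimized({other})"))),
    }
}

// Same shape as the wrapper in the diff: `?` returns early on Err,
// and `unwrap_or_else` only supplies a fallback for Ok(None).
fn optimize(plan: &Plan) -> Result<Plan, String> {
    Ok(try_optimize(plan)?.unwrap_or_else(|| plan.clone()))
}

fn main() {
    println!("{:?}", optimize(&"scan".to_string()));
    println!("{:?}", optimize(&"noop".to_string()));
    println!("{:?}", optimize(&"bad".to_string()));
}
```

In this sketch the error reaches the caller of `optimize`. Whether that caller then ignores a failed rule is a separate question.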

OK, let's move the discussion / fix to a follow-up: #4615.

jackwener · 2022-12-14T15:14:00Z

#4618 and #4619 rely on this PR.
Especially #4618: I think it's a good improvement, and it is also something I wanted to do half a year ago 🚀.

andygrove

Thanks for working on this @jackwener. LGTM.

Dandandan · 2022-12-14T17:38:41Z

Thank you @jackwener 😎

ursabot · 2022-12-14T17:42:42Z

Benchmark runs are scheduled for baseline = 0baf5ef and contender = 508ba80. 508ba80 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

github-actions bot added the optimizer Optimizer rules label Dec 13, 2022

jackwener mentioned this pull request Dec 13, 2022

refactor: extract assert_optimized_plan_eq from UT. #4600

Merged

alamb approved these changes Dec 13, 2022

add try_optimize() for all rules.

f897127

jackwener force-pushed the try_optimize branch from 44b2580 to f897127 Compare December 14, 2022 04:08

crepererum reviewed Dec 14, 2022

crepererum mentioned this pull request Dec 14, 2022

Don't ignore failed optimizer rules #4615

Closed

jackwener mentioned this pull request Dec 14, 2022

Optimizer: avoid every rule must recursive children in optimizer #4618

Merged

andygrove approved these changes Dec 14, 2022

Dandandan approved these changes Dec 14, 2022

Dandandan merged commit 508ba80 into apache:master Dec 14, 2022

jackwener deleted the try_optimize branch December 25, 2022 14:32
