Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-38030][SQL] Canonicalization should not remove nullability of AttributeReference dataType #35332

Closed
wants to merge 1 commit into from

Conversation

shardulm94
Copy link
Contributor

@shardulm94 shardulm94 commented Jan 26, 2022

What changes were proposed in this pull request?

Canonicalization of AttributeReference should not remove nullability information of its dataType.

Why are the changes needed?

SPARK-38030 lists an issue where canonicalization of cast resulted in an unresolved expression, thus causing query failure. The issue was that the child AttributeReference's dataType was converted to nullable during canonicalization and hence the Cast's checkInputDataTypes fails. Although the exact repro listed in SPARK-38030 no longer works in master due to an unrelated change (details in the JIRA), some other codepaths which depend on canonicalized representations can trigger the same issue.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added unit test to ensure that canonicalization preserves nullability of AttributeReference and does not result in an unresolved cast

@github-actions github-actions bot added the SQL label Jan 26, 2022
@shardulm94
Copy link
Contributor Author

cc: @cloud-fan @viirya @HyukjinKwon

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

assert(cast.resolved)
// canonicalization should not converted resolved cast to unresolved
assert(cast.canonicalized.resolved)
assert(cast.canonicalized.dataType == structType.asNullable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use triple-equals here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

TryCast(attr, structType))) {
assert(cast.resolved)
// canonicalization should not converted resolved cast to unresolved
assert(cast.canonicalized.resolved)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we be asserting on cast.canonicalized.output as well? this is how the original issue was detected, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original issue was detected because of the following plan UnionExec -> ProjectExec --> Alias --> Cast --> AttributeReference and a call to .output on UnionExec which in turn calls ProjectExec.output.nullable. I can recreate this chain here, but not sure if thats useful. The root issue was that a resolved node was being converted to unresolved which we are testing here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah I think the current test is probably sufficient.

// When this cast involves TimeZone, it's only resolved if the timeZoneId is set;
// Otherwise behave like Expression.resolved.
override lazy val resolved: Boolean =
childrenResolved && checkInputDataTypes().isSuccess && (!needsTimeZone || timeZoneId.isDefined)

override lazy val preCanonicalized: Expression = {
val basic = withNewChildren(Seq(child.preCanonicalized)).asInstanceOf[CastBase]
.withDataType(dataType.asNullable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add a comment to say that this matches the canonicaliztion of AttributeReference

Copy link
Member

@viirya viirya Jan 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that other kind of child's canonicalized format doesn't follow the canonicaliztion of AttributeReference on nullability change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I can't recall why AttributeReference needs to remove nullability when canonicalization. Maybe we should change AttributeReference instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why AttributeReference needs to remove nullability either, however I don't have much domain knowledge here, so I opted for a safer change to only fix Cast. There can be other Cast like expressions which do not handle removing nullability of their dataType params.

I had made a test run a few days ago for preserving nullability in AttributeReference during canonicalization and the existing tests passed. I don't think making this change should lead to failures, but it may lead to some lost optimization? (From what I can see, Canonicalization is used to compare to query plans, and with .asNullable, it is more likely that plans will match)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on the other hand, I think it's safer to not remove nullability when comparing query plans. Maybe two plans are truly different because the nullability is different and the data is also different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I think its probably more deviation in behavior, but safer overall. I will update this PR soon.

@shardulm94 shardulm94 changed the title [SPARK-38030][SQL] Canonicalization of cast should remove nullability of target dataType [SPARK-38030][SQL] Canonicalization should not remove nullability of AttributeReference dataType Jan 28, 2022
@shardulm94
Copy link
Contributor Author

@cloud-fan @viirya @xkrogen Thanks for the reviews! I have updated the PR with the new approach of preserving nullability during canonicalization of AttributeReference.

@shardulm94
Copy link
Contributor Author

Hi folks, can someone take another look at the PR?

@cloud-fan
Copy link
Contributor

LGTM, also cc @sigmod

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 2e703ae Feb 8, 2022
Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm too

dongjoon-hyun pushed a commit that referenced this pull request Feb 9, 2022
…y of AttributeReference dataType

This is a backport of #35332 to branch 3.2

### What changes were proposed in this pull request?
Canonicalization of AttributeReference should not remove nullability information of its dataType.

### Why are the changes needed?
SPARK-38030 lists an issue where canonicalization of cast resulted in an unresolved expression, thus causing query failure. The issue was that the child AttributeReference's dataType was converted to nullable during canonicalization and hence the Cast's `checkInputDataTypes` fails. Although the exact repro listed in SPARK-38030 no longer works in branch-3.2 due to an unrelated change (details in the JIRA), some other codepaths which depend on canonicalized representations can trigger the same issue.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added unit test to ensure that canonicalization preserves nullability of AttributeReference and does not result in an unresolved cast

Closes #35446 from shardulm94/SPARK-38030-3.2.

Authored-by: Shardul Mahadik <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Feb 9, 2022
…y of AttributeReference dataType

This is a backport of #35332 to branch 3.1

### What changes were proposed in this pull request?
Canonicalization of AttributeReference should not remove nullability information of its dataType.

### Why are the changes needed?
SPARK-38030 lists an issue where canonicalization of cast resulted in an unresolved expression, thus causing query failure. The issue was that the child AttributeReference's dataType was converted to nullable during canonicalization and hence the Cast's `checkInputDataTypes` fails.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added unit test to ensure that canonicalization preserves nullability of AttributeReference and does not result in an unresolved cast. Also added a test to ensure that the issue observed in SPARK-38030 (interaction of this bug with AQE) is fixed. This test/repro only works on 3.1 because the code which triggers access on an unresolved object is [lazy](https://github.com/apache/spark/blob/7e5c3b216431b6a5e9a0786bf7cded694228cdee/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala#L132) in 3.2+ and hence does not trigger the issue in 3.2+.

Closes #35444 from shardulm94/SPARK-38030-3.1.

Authored-by: Shardul Mahadik <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
…y of AttributeReference dataType

This is a backport of apache#35332 to branch 3.2

### What changes were proposed in this pull request?
Canonicalization of AttributeReference should not remove nullability information of its dataType.

### Why are the changes needed?
SPARK-38030 lists an issue where canonicalization of cast resulted in an unresolved expression, thus causing query failure. The issue was that the child AttributeReference's dataType was converted to nullable during canonicalization and hence the Cast's `checkInputDataTypes` fails. Although the exact repro listed in SPARK-38030 no longer works in branch-3.2 due to an unrelated change (details in the JIRA), some other codepaths which depend on canonicalized representations can trigger the same issue.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added unit test to ensure that canonicalization preserves nullability of AttributeReference and does not result in an unresolved cast

Closes apache#35446 from shardulm94/SPARK-38030-3.2.

Authored-by: Shardul Mahadik <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit d62735d)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants