
[HUDI-8823] Ban update from updating primary key and partition key #12587

Merged: 4 commits merged into apache:master from the HUDI-8823 branch on Jan 14, 2025

Conversation

Davis-Zhang-Onehouse (Contributor)

Change Logs

UPDATE will now error out when it tries to change the value of a partition column or a primary key column.
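
A minimal Spark SQL sketch of the new behavior, assuming a hypothetical Hudi table where `id` is the record key and `pt` is the partition field (the column names mirror the tests in this PR; the table itself is illustrative):

```scala
// Hypothetical table: `id` is the record key, `pt` the partition field.
spark.sql(
  """CREATE TABLE hudi_demo (id INT, name STRING, pt STRING)
    |USING hudi
    |PARTITIONED BY (pt)
    |TBLPROPERTIES (primaryKey = 'id')""".stripMargin)
spark.sql("INSERT INTO hudi_demo VALUES (1, 'a1', '2021')")

// Still allowed: updating an ordinary column.
spark.sql("UPDATE hudi_demo SET name = 'a2' WHERE id = 1")

// Rejected after this change: assigning to the record key or the partition field.
spark.sql("UPDATE hudi_demo SET id = 2 WHERE id = 1")      // throws AnalysisException
spark.sql("UPDATE hudi_demo SET pt = '2022' WHERE id = 1") // throws AnalysisException
```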

Impact

Guards UPDATE against unsupported use cases.

Risk level (write none, low, medium or high below)

none

Documentation Update

We should probably update the UPDATE query documentation to describe this behavior.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@github-actions github-actions bot added the size:M PR with lines of changes in (100, 300] label Jan 7, 2025
@Davis-Zhang-Onehouse force-pushed the HUDI-8823 branch 4 times, most recently from e2db61c to 3cdd40d, on January 9, 2025 at 02:18
Comment on lines 404 to 410:

```diff
  assert(e1.getMessage.contains("Detected disallowed assignment clause in UPDATE statement for record key field `id`"))

  // Try to update partition column (should fail)
  val e2 = intercept[AnalysisException] {
    spark.sql(s"update $tableName set pt = '2022' where id = 1")
  }
- assert(e2.getMessage.contains("Detected update query with disallowed assignment clause for partition field `pt`"))
+ assert(e2.getMessage.contains("Detected disallowed assignment clause in UPDATE statement for partition field `pt`"))
```
Contributor

Add `. Please remove the assignment clause to avoid the error.` to the message?

Contributor Author (Davis-Zhang-Onehouse)

done

Contributor @yihua left a comment:

LGTM

spark.sql(s"update $tableName set id = 2 where id = 1")
}
assert(e1.getMessage.contains(s"Detected disallowed assignment clause in UPDATE statement for record key field `id`" +
s" for table `spark_catalog.default.$tableName`. Please remove the assignment clause to avoid the error."))
Contributor

Does Spark add `spark_catalog.` as the table name prefix? Could that be removed to avoid confusing users?

Contributor Author (Davis-Zhang-Onehouse)

I think it is better to stick with fully qualified names rather than just the table name, since that leaves zero ambiguity. If we still think the table name alone is better given this, I can fix it.

Contributor Author @Davis-Zhang-Onehouse commented on Jan 14, 2025:

Also, for other logging and SQL-facing user messages we consistently stick to this format. The `spark_catalog.` prefix is the catalog name.
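
For reference, a minimal sketch of the three-part identifier format Spark uses, in which `spark_catalog` is the name of Spark's built-in session catalog (the database and table names here are hypothetical, not from this PR):

```scala
// Identifier format: catalog.database.table. With no custom catalog
// configured, tables resolve through Spark's built-in session catalog,
// which is named `spark_catalog`.
val qualified = Seq("spark_catalog", "default", "my_table").mkString(".")
assert(qualified == "spark_catalog.default.my_table") // format seen in the error message above
```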

Contributor

Sounds good. Could you check the test failures on Spark 3.4?

@hudi-bot

CI report:

Bot commands — @hudi-bot supports the following commands:
  • `@hudi-bot run azure`: re-run the last Azure build

@yihua yihua merged commit 6b4c8f3 into apache:master Jan 14, 2025
43 checks passed