[SPARK-21335] [DOC] doc changes for disallowed un-aliased subquery use case #21647
Conversation
@viirya @cloud-fan, please kindly help to review. Thanks.
It has been quite a while. Anyway, this document change looks fine to me.
We don't need the
LGTM except the title issue pointed out by @viirya |
okay, changed the PR title. Thanks. @cloud-fan @viirya |
ok to test |
docs/sql-programming-guide.md
Outdated
@@ -2017,6 +2017,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
- Literal values used in SQL operations are converted to DECIMAL with the exact precision and scale needed by them.
- The configuration `spark.sql.decimalOperations.allowPrecisionLoss` has been introduced. It defaults to `true`, which means the new behavior described here; if set to `false`, Spark uses previous rules, ie. it doesn't adjust the needed scale to represent the values and it returns NULL if an exact representation of the value is not possible.
- In PySpark, `df.replace` does not allow to omit `value` when `to_replace` is not a dictionary. Previously, `value` could be omitted in the other cases and had `None` by default, which is counterintuitive and error-prone.
- Un-aliased subquery is supported by Spark SQL for a long time. Its semantic was not well defined and had confusing behaviors. Since Spark 2.3, we invalid a weird use case: `SELECT v.i from (SELECT i FROM v)`. Now this query will throw analysis exception because users should not be able to use the qualifier inside a subquery. See [SPARK-20690](https://issues.apache.org/jira/browse/SPARK-20690) and [SPARK-21335](https://issues.apache.org/jira/browse/SPARK-21335) for details.
Not a big deal but please consider:
Un-aliased subquery is supported by Spark SQL for a long time. Its semantic was not well defined and had confusing behaviors. Since Spark 2.3, we invalid a weird use case: `SELECT v.i from (SELECT i FROM v)`
->
Un-aliased subquery's semantic has not been well defined with confusing behaviors. Since Spark 2.3, we invalidate such confusing cases, for example, `SELECT v.i from (SELECT i FROM v)`.
Also consider:
Now this query will throw analysis exception because users should not be able to use the qualifier inside a subquery.
->
The cases throw an analysis exception now because users should not be able to use the qualifier inside a subquery.
for details. -> for more details.
Test build #92370 has finished for PR 21647 at commit
@HyukjinKwon updated, thanks.
Test build #92371 has finished for PR 21647 at commit
thanks, merging to master!
What changes were proposed in this pull request?
Document the behavior change for the un-aliased subquery use case, to address the last question in PR #18559:
#18559 (comment)
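To make the documented behavior concrete, here is a minimal sketch (Scala, spark-shell style, assuming a Spark 2.3+ session named `spark`). The view name `v`, column `i`, and alias `t` are illustrative, mirroring the example in the doc text; the exact exception message is not reproduced here.

```scala
// Set up a temp view matching the doc example (names `v` / `i` are illustrative).
spark.range(3).selectExpr("id AS i").createOrReplaceTempView("v")

// Still supported: an un-aliased subquery, as long as the outer query
// does not reference the inner view's qualifier.
spark.sql("SELECT i FROM (SELECT i FROM v)").show()

// Disallowed since Spark 2.3: the qualifier `v` should not be visible
// outside the subquery, so this now fails analysis.
// spark.sql("SELECT v.i FROM (SELECT i FROM v)")   // throws AnalysisException

// Equivalent query with an explicit subquery alias.
spark.sql("SELECT t.i FROM (SELECT i FROM v) t").show()
```

Aliasing the subquery explicitly (as in the last query) is the straightforward way to keep a qualifier available in the outer query.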
How was this patch tested?
This is a documentation-only change, so it does not affect tests.