
[SPARK-21335] [DOC] doc changes for disallowed un-aliased subquery use case #21647

Closed

Conversation

cnZach
Contributor

@cnZach cnZach commented Jun 27, 2018

What changes were proposed in this pull request?

Document a change for un-aliased subquery use case, to address the last question in PR #18559:
#18559 (comment)


How was this patch tested?

It does not affect tests.


@cnZach
Contributor Author

cnZach commented Jun 27, 2018

@viirya @cloud-fan, please help review. Thanks.

@viirya
Member

viirya commented Jun 27, 2018

This behavior has been around for quite a long time. Anyway, this document change looks fine to me.

@viirya
Member

viirya commented Jun 27, 2018

We don't need the [apache/spark] prefix in the PR title. Can you remove it?

@cloud-fan
Contributor

LGTM except the title issue pointed out by @viirya

@cnZach cnZach changed the title [apache/spark] [SPARK-21335] [DOC] doc changes for disallowed un-aliased subquery use case [SPARK-21335] [DOC] doc changes for disallowed un-aliased subquery use case Jun 27, 2018
@cnZach
Contributor Author

cnZach commented Jun 27, 2018

Okay, changed the PR title. Thanks. @cloud-fan @viirya

@HyukjinKwon
Member

ok to test

@@ -2017,6 +2017,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
- Literal values used in SQL operations are converted to DECIMAL with the exact precision and scale needed by them.
- The configuration `spark.sql.decimalOperations.allowPrecisionLoss` has been introduced. It defaults to `true`, which means the new behavior described here; if set to `false`, Spark uses previous rules, ie. it doesn't adjust the needed scale to represent the values and it returns NULL if an exact representation of the value is not possible.
- In PySpark, `df.replace` does not allow to omit `value` when `to_replace` is not a dictionary. Previously, `value` could be omitted in the other cases and had `None` by default, which is counterintuitive and error-prone.
- Un-aliased subquery is supported by Spark SQL for a long time. Its semantic was not well defined and had confusing behaviors. Since Spark 2.3, we invalid a weird use case: `SELECT v.i from (SELECT i FROM v)`. Now this query will throw analysis exception because users should not be able to use the qualifier inside a subquery. See [SPARK-20690](https://issues.apache.org/jira/browse/SPARK-20690) and [SPARK-21335](https://issues.apache.org/jira/browse/SPARK-21335) for details.
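For context, a minimal sketch of the case the new migration-guide paragraph describes, alongside the aliased rewrite (the alias `t` here is an illustrative assumption; explicitly aliased subqueries remain valid):

```sql
-- Invalid since Spark 2.3: the outer query uses the qualifier `v`,
-- which is not visible outside the subquery; this now throws an
-- analysis exception.
SELECT v.i FROM (SELECT i FROM v);

-- Valid: alias the subquery and qualify the column with that alias.
SELECT t.i FROM (SELECT i FROM v) AS t;
```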
Member

@HyukjinKwon HyukjinKwon Jun 27, 2018


Not a big deal but please consider:

Un-aliased subquery is supported by Spark SQL for a long time. Its semantic was not well defined and had confusing behaviors. Since Spark 2.3, we invalid a weird use case: SELECT v.i from (SELECT i FROM v)

->

Un-aliased subquery's semantic has not been well defined with confusing behaviors. Since Spark 2.3, we invalidate such confusing cases, for example, SELECT v.i from (SELECT i FROM v).

Member

@HyukjinKwon HyukjinKwon Jun 27, 2018


Also consider:

Now this query will throw analysis exception because users should not be able to use the qualifier inside a subquery.

->

The cases throw an analysis exception now because users should not be able to use the qualifier inside a subquery.

Member


for details. -> for more details.

@SparkQA

SparkQA commented Jun 27, 2018

Test build #92370 has finished for PR 21647 at commit c611a11.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cnZach
Contributor Author

cnZach commented Jun 27, 2018

@HyukjinKwon updated, thanks.

@SparkQA

SparkQA commented Jun 27, 2018

Test build #92371 has finished for PR 21647 at commit bebc3a8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master!

@asfgit asfgit closed this in a1a64e3 Jun 27, 2018