[SPARK-7321][SQL] Add Column expression for conditional statements (when/otherwise) #6072

Closed
wants to merge 8 commits into from

Conversation

rxin
Contributor

@rxin rxin commented May 12, 2015

This builds on #5932 and should close #5932 as well.

As an example:

df.select(when(df['age'] == 2, 3).otherwise(4).alias("age")).collect()
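
To make the row-wise meaning of the example concrete, here is a plain-Python model of what the expression computes. It is only a sketch: `eval_when_otherwise` and the dict-based rows are hypothetical stand-ins rather than PySpark API, and the model assumes that every row matching no condition receives the `otherwise` value.

```python
# Toy model of when(cond, value).otherwise(default), evaluated row by row.
# Hypothetical helper for illustration; this is not PySpark code.
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def eval_when_otherwise(
    rows: List[Row],
    condition: Callable[[Row], bool],
    value: Any,
    default: Any,
) -> List[Any]:
    """Return `value` for rows where `condition` holds, else `default`."""
    return [value if condition(row) else default for row in rows]


rows = [{"age": 2}, {"age": 5}]
# Mirrors: when(df['age'] == 2, 3).otherwise(4)
result = eval_when_otherwise(rows, lambda r: r["age"] == 2, 3, 4)
print(result)  # [3, 4]
```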

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 12, 2015

Test build #32464 has started for PR 6072 at commit 0455eda.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32463/
Test FAILed.

@rxin
Contributor Author

rxin commented May 12, 2015

Jenkins, retest this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 12, 2015

Test build #32466 has started for PR 6072 at commit 0455eda.

@cloud-fan
Contributor

How about also supporting the "CASE key WHEN" syntax? For example, "CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END" in SQL, and choose(df("key")).when("a", 1).when("b", 2).else(3) in the DataFrame API.

@rxin
Contributor Author

rxin commented May 12, 2015

The one you suggested is more like a switch statement in C/Java. The one here is a general if/else. We should add that one too, but in a separate PR.
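
The distinction drawn here can be sketched in plain Python: the keyed, switch-like CASE is a special case of the general if/else form, in which every condition is an equality test against one key. `searched_case` and `simple_case` are hypothetical names used only for illustration; neither is a Spark function, and the `choose` API suggested above was only a proposal.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Row = Dict[str, Any]
Branch = Tuple[Callable[[Row], bool], Any]


def searched_case(row: Row, branches: List[Branch], default: Optional[Any] = None) -> Any:
    """General if/else: return the value of the first branch whose predicate holds."""
    for predicate, value in branches:
        if predicate(row):
            return value
    return default


def simple_case(row: Row, key: str, pairs: List[Tuple[Any, Any]], default: Optional[Any] = None) -> Any:
    """Switch-like form: CASE key WHEN m1 THEN v1 ... ELSE default END.

    Desugars into the general form by turning each match value into an
    equality predicate on row[key].
    """
    # `m=match` binds the current match value; a bare closure would see only the last one.
    branches = [(lambda r, m=match: r[key] == m, value) for match, value in pairs]
    return searched_case(row, branches, default)


row = {"key": "b", "age": 2}
print(simple_case(row, "key", [("a", 1), ("b", 2)], default=3))                 # 2
print(searched_case(row, [(lambda r: r["age"] > 10, "old")], default="young"))  # young
```

The general form accepts arbitrary predicates per branch, which is why it can express the keyed form but not the other way around.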

@SparkQA

SparkQA commented May 12, 2015

Test build #32464 has finished for PR 6072 at commit 0455eda.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32464/
Test PASSed.

@SparkQA

SparkQA commented May 12, 2015

Test build #32466 has finished for PR 6072 at commit 0455eda.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32466/
Test PASSed.

*/
def otherwise(value: Any): Column = this.expr match {
  case CaseWhen(branches: Seq[Expression]) =>
    CaseWhen(branches :+ lit(value).expr)
Contributor

What if a user mistakenly calls otherwise twice? We would then build an incorrect CaseWhen expression here. Maybe we should create a helper class, as we did for the NA functions and aggregate functions?

Contributor Author

Good point. A whole class dedicated to this would be too heavyweight, but I will add code to throw an exception if otherwise has already been applied.
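
A minimal sketch of the guard described here, written in plain Python rather than the Scala of the patch. `CaseWhenBuilder` is a hypothetical stand-in for a Column that wraps a CaseWhen expression. The use of `ValueError`, the error messages, and the extra check on `when` are assumptions of this sketch; the exception actually thrown by the patch is not shown in this thread.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]

# Sentinel that distinguishes "otherwise not called yet" from otherwise(None).
_UNSET = object()


class CaseWhenBuilder:
    """Toy builder: a list of (predicate, value) branches plus one optional default."""

    def __init__(self) -> None:
        self._branches: List[Tuple[Callable[[Row], bool], Any]] = []
        self._default: Any = _UNSET

    def when(self, predicate: Callable[[Row], bool], value: Any) -> "CaseWhenBuilder":
        # Assumption of this sketch: no branch may be added once the default is set.
        if self._default is not _UNSET:
            raise ValueError("when() cannot be applied after otherwise()")
        self._branches.append((predicate, value))
        return self

    def otherwise(self, value: Any) -> "CaseWhenBuilder":
        # The guard: a second call would silently corrupt the expression, so fail loudly.
        if self._default is not _UNSET:
            raise ValueError("otherwise() can only be applied once")
        self._default = value
        return self

    def evaluate(self, row: Row) -> Any:
        for predicate, value in self._branches:
            if predicate(row):
                return value
        # No branch matched: fall back to the default, or None if none was given.
        return None if self._default is _UNSET else self._default


expr = CaseWhenBuilder().when(lambda r: r["age"] == 2, 3).otherwise(4)
print(expr.evaluate({"age": 2}))  # 3
try:
    expr.otherwise(5)
except ValueError as err:
    print(err)  # otherwise() can only be applied once
```

Keeping the check inside `otherwise` avoids a dedicated helper class while still rejecting the misuse at the call site.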

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 12, 2015

Test build #32525 has started for PR 6072 at commit 8f49201.

@SparkQA

SparkQA commented May 12, 2015

Test build #32525 timed out for PR 6072 at commit 8f49201 after a configured wait of 150m.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32525/
Test FAILed.

@rxin
Contributor Author

rxin commented May 12, 2015

Jenkins, retest this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 12, 2015

Test build #32542 has started for PR 6072 at commit 8f49201.

@SparkQA

SparkQA commented May 13, 2015

Test build #32542 has finished for PR 6072 at commit 8f49201.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32542/
Test FAILed.

@rxin
Contributor Author

rxin commented May 13, 2015

Jenkins, retest this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 13, 2015

Test build #32557 has started for PR 6072 at commit 8f49201.

@SparkQA

SparkQA commented May 13, 2015

Test build #32557 has finished for PR 6072 at commit 8f49201.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32557/
Test PASSed.

@asfgit asfgit closed this in 97dee31 May 13, 2015
asfgit pushed a commit that referenced this pull request May 13, 2015
…hen/otherwise)

This builds on #5932 and should close #5932 as well.

As an example:
```python
df.select(when(df['age'] == 2, 3).otherwise(4).alias("age")).collect()
```

Author: Reynold Xin <[email protected]>
Author: kaka1992 <[email protected]>

Closes #6072 from rxin/when-expr and squashes the following commits:

8f49201 [Reynold Xin] Throw exception if otherwise is applied twice.
0455eda [Reynold Xin] Reset run-tests.
bfb9d9f [Reynold Xin] Updated documentation and test cases.
762f6a5 [Reynold Xin] Merge pull request #5932 from kaka1992/IFCASE
95724c6 [kaka1992] Update
8218d0a [kaka1992] Update
801009e [kaka1992] Update
76d6346 [kaka1992] [SPARK-7321][SQL] Add Column expression for conditional statements (if, case)

(cherry picked from commit 97dee31)
Signed-off-by: Reynold Xin <[email protected]>
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015