[SPARK-26865][SQL] DataSourceV2Strategy should push normalized filters #23770

Closed
wants to merge 3 commits into from

dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Feb 13, 2019

What changes were proposed in this pull request?

This PR aims to make DataSourceV2Strategy normalize filters like FileSourceStrategy when it pushes them into SupportsPushDownFilters.pushFilters.
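
To illustrate what "normalizing" a filter means here, the following is a minimal, self-contained sketch. It does not use Spark's classes: `Expr`, `Attr`, `IsNotNull`, `And`, and `NormalizeSketch` are hypothetical stand-ins for Catalyst expressions, and the fallback of leaving an unmatched attribute unchanged is an assumption of this sketch, not a statement about Spark's behavior. The idea is that an attribute in a filter may carry the name as spelled in the query (for example `CiNt`), while the data source knows the column as `cint`; normalization rewrites the filter to use the source's spelling before it is pushed down.

```scala
// Simplified stand-ins for Catalyst expressions; NOT Spark's real classes.
sealed trait Expr
final case class Attr(name: String, exprId: Long) extends Expr
final case class IsNotNull(child: Expr) extends Expr
final case class And(left: Expr, right: Expr) extends Expr

object NormalizeSketch {
  // Rewrite every attribute so that its name matches the output attribute
  // with the same exprId (the identity that survives a change of spelling).
  // Assumption of this sketch: an attribute with no match is left unchanged.
  def normalize(filter: Expr, output: Seq[Attr]): Expr = filter match {
    case a: Attr =>
      output.find(_.exprId == a.exprId).map(o => a.copy(name = o.name)).getOrElse(a)
    case IsNotNull(c) => IsNotNull(normalize(c, output))
    case And(l, r)    => And(normalize(l, output), normalize(r, output))
  }

  def normalizeFilters(filters: Seq[Expr], output: Seq[Attr]): Seq[Expr] =
    filters.map(normalize(_, output))

  def main(args: Array[String]): Unit = {
    val cint = Attr("cint", exprId = 1L)
    // The query spelled the column "CiNt"; the source schema says "cint".
    val pushed = normalizeFilters(Seq(IsNotNull(cint.copy(name = "CiNt"))), Seq(cint))
    println(pushed)
  }
}
```

Matching on an identifier rather than on the name is what makes the rewrite independent of case sensitivity settings in this model.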

How was this patch tested?

Passes Jenkins with the newly added test case.

@dongjoon-hyun
Member Author

Could you review this, @cloud-fan, @gatorsmile, @hvanhovell, @gengliangwang?

@cloud-fan
Contributor

good catch! LGTM.

@dongjoon-hyun
Member Author

Thank you for the review, @cloud-fan!

@SparkQA

SparkQA commented Feb 13, 2019

Test build #102278 has finished for PR 23770 at commit 2b5a086.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 13, 2019

Test build #102279 has finished for PR 23770 at commit 3169bac.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

retest this please

@@ -219,6 +219,11 @@ class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
      IsNotNull(attrInt))), None)
  }

  test("SPARK-26865 DataSourceV2Strategy should push normalized filters") {
    val attrInt = 'cint.int
    DataSourceStrategy.normalizeFilters(Seq(IsNotNull(attrInt.withName("CiNt"))), Seq(attrInt))
Member

Should we check the result of normalizeFilters? It seems that the test case will always pass.

Member

+1

Member Author

Oops. Right. Thanks!
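
The review point above can be shown with a small standalone sketch. It does not use Spark or ScalaTest: `Attr`, `brokenNormalize`, and `normalize` are hypothetical names introduced only for illustration. A test body that calls a function and discards the result succeeds even when the function is wrong; comparing the result with the expected value is what lets the test fail.

```scala
object AssertingTestSketch {
  // Hypothetical stand-in for an attribute: a name plus a stable id.
  final case class Attr(name: String, exprId: Long)

  // Deliberately broken "normalizer" that returns its input unchanged.
  def brokenNormalize(filters: Seq[Attr], output: Seq[Attr]): Seq[Attr] = filters

  // Working version: take the name from the output attribute with the same id.
  def normalize(filters: Seq[Attr], output: Seq[Attr]): Seq[Attr] =
    filters.map(f => output.find(_.exprId == f.exprId).fold(f)(o => f.copy(name = o.name)))

  def main(args: Array[String]): Unit = {
    val cint = Attr("cint", 1L)
    val input = Seq(cint.copy(name = "CiNt"))

    // Shape of the test as first submitted: the result is discarded, so this
    // statement succeeds even with the broken implementation.
    brokenNormalize(input, Seq(cint))

    // Comparing against the expected value is what separates the two.
    assert(normalize(input, Seq(cint)) == Seq(cint))
    assert(brokenNormalize(input, Seq(cint)) != Seq(cint))
  }
}
```

In a ScalaTest suite the same comparison would typically be written with an assertion helper such as `assertResult`, wrapped around the call whose result is being checked.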

@gengliangwang
Member

Also, it would be great to have regression test cases like the one in the description of https://issues.apache.org/jira/browse/SPARK-26865

@SparkQA

SparkQA commented Feb 13, 2019

Test build #102282 has finished for PR 23770 at commit 3169bac.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

dongjoon-hyun commented Feb 13, 2019

@gengliangwang, regarding your second comment: I had that test case in the first commit, but this issue is not about ORC at all, so I changed it. It looked misleading.

2b5a086#diff-5a2e7f03d14856c8769fd3ddea8742bdR2970

Member

@gengliangwang gengliangwang left a comment

LGTM

@SparkQA

SparkQA commented Feb 13, 2019

Test build #102298 has finished for PR 23770 at commit 74d16de.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

Retest this please.

@SparkQA

SparkQA commented Feb 13, 2019

Test build #102305 has finished for PR 23770 at commit 74d16de.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member Author

Thank you, @cloud-fan, @viirya, @gengliangwang, @HyukjinKwon.
Merged to master.

jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
## What changes were proposed in this pull request?

This PR aims to make `DataSourceV2Strategy` normalize filters like [FileSourceStrategy](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala#L150-L158) when it pushes them into `SupportsPushDownFilters.pushFilters`.

## How was this patch tested?

Pass the Jenkins with the newly added test case.

Closes apache#23770 from dongjoon-hyun/SPARK-26865.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun dongjoon-hyun deleted the SPARK-26865 branch February 18, 2019 16:27
mccheah pushed a commit to palantir/spark that referenced this pull request May 15, 2019
rahij pushed a commit to palantir/spark that referenced this pull request Feb 27, 2020
robert3005 pushed a commit to palantir/spark that referenced this pull request Mar 9, 2020

6 participants