
[SPARK-36533][SS][FOLLOWUP] Support Trigger.AvailableNow in PySpark #34592

Closed

Conversation

@HeartSaVioR (Contributor) commented Nov 14, 2021

What changes were proposed in this pull request?

This PR proposes to add Trigger.AvailableNow to PySpark, on top of #33763.
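
For context, the shape of the change is small: DataStreamWriter.trigger (presumably in python/pyspark/sql/streaming.py) gains an availableNow keyword argument that is validated against the other trigger options and mapped to the JVM-side Trigger.AvailableNow(). The sketch below is illustrative only, not the exact patch; internal attribute names such as _spark, _sc, and _jwrite are assumptions based on how the existing trigger options are wired.

def trigger(self, processingTime=None, once=None, continuous=None, availableNow=None):
    # Exactly one trigger option may be set (illustrative validation).
    params = [processingTime, once, continuous, availableNow]
    if params.count(None) == 4:
        raise ValueError("No trigger provided")
    if params.count(None) < 3:
        raise ValueError("Multiple triggers not allowed.")

    jvm_trigger = self._spark._sc._jvm.org.apache.spark.sql.streaming.Trigger
    if processingTime is not None:
        jtrigger = jvm_trigger.ProcessingTime(processingTime)
    elif once is not None:
        if once is not True:
            raise ValueError("Value for once must be True. Got: %s" % once)
        jtrigger = jvm_trigger.Once()
    elif continuous is not None:
        jtrigger = jvm_trigger.Continuous(continuous)
    else:
        if availableNow is not True:
            raise ValueError("Value for availableNow must be True. Got: %s" % availableNow)
        # New in this PR: process all data available at query start
        # (possibly across multiple batches), then stop the query.
        jtrigger = jvm_trigger.AvailableNow()

    self._jwrite = self._jwrite.trigger(jtrigger)
    return self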

Why are the changes needed?

We missed adding Trigger.AvailableNow to PySpark in #33763.

Does this PR introduce any user-facing change?

Yes, Trigger.AvailableNow will be available in PySpark as well.

How was this patch tested?

Added a simple validation to the PySpark docs. Manually tested as below:

>>> spark.readStream.format("text").load("/WorkArea/ScalaProjects/spark-apache/dist/inputs").writeStream.format("console").trigger(once=True).start()
<pyspark.sql.streaming.StreamingQuery object at 0x118dff6d0>
-------------------------------------------
Batch: 0
-------------------------------------------
+-----+
|value|
+-----+
|    a|
|    b|
|    c|
|    d|
|    e|
+-----+


>>> spark.readStream.format("text").load("/WorkArea/ScalaProjects/spark-apache/dist/inputs").writeStream.format("console").trigger(availableNow=True).start()
<pyspark.sql.streaming.StreamingQuery object at 0x118dffe50>
>>> -------------------------------------------
Batch: 0
-------------------------------------------
+-----+
|value|
+-----+
|    a|
|    b|
|    c|
|    d|
|    e|
+-----+


>>> spark.readStream.format("text").option("maxfilespertrigger", "2").load("/WorkArea/ScalaProjects/spark-apache/dist/inputs").writeStream.format("console").trigger(availableNow=True).start()
<pyspark.sql.streaming.StreamingQuery object at 0x118dff820>
>>> -------------------------------------------
Batch: 0
-------------------------------------------
+-----+
|value|
+-----+
|    a|
|    b|
+-----+

-------------------------------------------
Batch: 1
-------------------------------------------
+-----+
|value|
+-----+
|    c|
|    d|
+-----+

-------------------------------------------
Batch: 2
-------------------------------------------
+-----+
|value|
+-----+
|    e|
+-----+

>>>

@HeartSaVioR (Contributor, Author) commented:

cc. @srowen @brkyvz @viirya @xuanyuanking @bozhang2820
Also cc. @HyukjinKwon to double-check that I didn't miss any rules specific to the PySpark side.

SparkQA commented Nov 14, 2021

Test build #145212 has finished for PR 34592 at commit 1f795f4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR (Contributor, Author) commented:

Thanks for the quick review! Merging to master.

SparkQA commented Nov 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49681/

SparkQA commented Nov 15, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49681/

@HeartSaVioR (Contributor, Author) commented:

I forgot about the SS guide doc; there are code examples on triggers in PySpark there. I'll need to update the doc as well. Crafting another follow-up PR soon...
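
For reference, the updated Python trigger examples in the guide would presumably show the available-now option alongside the existing modes. A sketch (assuming df is a streaming DataFrame; the guide's exact wording and surrounding text may differ):

# ProcessingTime trigger with a two-second micro-batch interval
df.writeStream \
    .format("console") \
    .trigger(processingTime="2 seconds") \
    .start()

# One-time trigger: process the available data in a single batch, then stop
df.writeStream \
    .format("console") \
    .trigger(once=True) \
    .start()

# Available-now trigger: process all available data, possibly in multiple
# batches (respecting source options such as maxFilesPerTrigger), then stop
df.writeStream \
    .format("console") \
    .trigger(availableNow=True) \
    .start()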

@HeartSaVioR (Contributor, Author) commented:

Follow-up: #34597
