[SPARK-16114] [SQL] structured streaming network word count examples #13816

jjthomas · 2016-06-21T21:07:13Z

What changes were proposed in this pull request?

Network word count example for structured streaming

How was this patch tested?

Run locally

tdas · 2016-06-21T22:04:07Z

...es/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredNetworkWordCount.java

+      .appName("JavaStructuredNetworkWordCount")
+      .getOrCreate();
+
+    Dataset<String> df = spark.readStream().format("socket").option("host", args[0])


Put each call in this chain on its own line so it is easier to read:

spark
  .readStream()
  .format("socket")
  .option(...)
  ....
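For comparison, a minimal sketch of the same one-call-per-line layout in Python rather than Java. It assumes `spark` is an existing SparkSession; `read_socket_lines` is a hypothetical helper name used only for illustration and is not part of this patch.

```python
def read_socket_lines(spark, host, port):
    """Return a streaming DataFrame of text lines read from host:port.

    `spark` is assumed to be an existing SparkSession.
    """
    # Wrapping the chain in parentheses lets each call sit on its own line.
    lines = (
        spark
        .readStream
        .format("socket")
        .option("host", host)
        .option("port", port)
        .load()
    )
    return lines
```

With one call per line, adding or removing an option touches a single line of the diff.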

tdas · 2016-06-21T22:09:40Z

ok to test.

tdas · 2016-06-21T22:13:43Z

...ples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredNetworkWordCount.scala

+
+    import spark.implicits._
+
+    val df = spark.readStream


Rename df --> lines

SparkQA · 2016-06-21T22:14:27Z

Test build #60970 has finished for PR 13816 at commit 38b5497.

This patch fails Python style tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- public class JavaStructuredNetworkWordCount

jjthomas · 2016-06-21T23:40:13Z

Responded to comments

SparkQA · 2016-06-21T23:45:10Z

Test build #60981 has finished for PR 13816 at commit 18c83b1.

This patch fails Python style tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- public final class JavaStructuredNetworkWordCount

tdas · 2016-06-22T01:07:23Z

...es/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredNetworkWordCount.java

+      .getOrCreate();
+
+    // input lines (may be multiple words on each line)
+    Dataset<String> lines = spark


You don't need to convert to Dataset[String] using as, since you are not using the typed groupByKey. Keeping it as Dataset[Row] is fine, as you have done in the Scala and Python versions.
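A minimal sketch of the untyped path this comment refers to, written in Python (where DataFrames are always rows) rather than the Java under review. It assumes `lines` is a DataFrame with a single string column named `value`; `count_words` is a hypothetical helper name, not code from this patch.

```python
def count_words(lines):
    """Return a DataFrame of (word, count) rows from a DataFrame of lines.

    Assumes `lines` has a single string column named `value`.
    """
    # Split each line on spaces and emit one row per word.
    words = lines.selectExpr("explode(split(value, ' ')) AS word")
    # Untyped grouping by column name; no conversion to a typed Dataset.
    return words.groupBy("word").count()
```

Because the grouping is by column name, nothing here depends on the element type of the input.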

SparkQA · 2016-06-22T17:29:41Z

Test build #61051 has finished for PR 13816 at commit 46ac930.

This patch fails Python style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-22T18:04:03Z

Test build #61054 has finished for PR 13816 at commit 80fee20.

This patch fails to build.
This patch merges cleanly.
This patch adds no public classes.

tdas · 2016-06-22T21:46:11Z

test this again

SparkQA · 2016-06-22T22:13:47Z

Test build #3127 has finished for PR 13816 at commit 80fee20.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-22T22:19:12Z

Test build #3128 has finished for PR 13816 at commit 80fee20.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-22T22:41:01Z

Test build #61071 has finished for PR 13816 at commit f7aec9d.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-27T18:07:45Z

Test build #61310 has finished for PR 13816 at commit c3b16a2.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

tdas · 2016-06-28T07:55:34Z

examples/src/main/scala/org/apache/spark/examples/sql/streaming/NetworkEventTimeWindow.scala

+ *    `$ bin/run-example org.apache.spark.examples.sql.streaming.EventTimeWindowExample
+ *    localhost 9999 <checkpoint dir>`
+ */
+object NetworkEventTimeWindow {


Just rename to EventTimeWindow.

tdas · 2016-06-28T07:56:08Z

...es/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredNetworkWordCount.java

+ * To run this on your local machine, you need to first run a Netcat server
+ *    `$ nc -lk 9999`
+ * and then run the example
+ *    `$ bin/run-example org.apache.spark.examples.sql.streaming.JavaStructuredNetworkWordCount


I think you can just do $ bin/run-example sql.streaming.JavaStructuredNetworkWordCount. Verify that, and if it works, please change it.

SparkQA · 2016-06-28T17:22:07Z

Test build #61389 has finished for PR 13816 at commit fb491c6.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-28T21:50:52Z

Test build #61413 has finished for PR 13816 at commit 6ab4453.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

tdas · 2016-06-28T23:05:19Z

examples/src/main/python/sql/streaming/structured_network_wordcount.py

+
+if __name__ == "__main__":
+    if len(sys.argv) != 3:
+        print("Usage: network_wordcount.py <hostname> <port>", file=sys.stderr)


The usage message has the wrong script name; the file is structured_network_wordcount.py.
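A minimal sketch of an argument check whose usage line matches the file name. `SCRIPT_NAME` and `parse_args` are illustrative names assumed here and are not taken from the patch.

```python
import sys

# Assumed to match the file name of the example script.
SCRIPT_NAME = "structured_network_wordcount.py"


def parse_args(argv):
    """Return (host, port) from argv, or print usage and exit."""
    if len(argv) != 3:
        print("Usage: %s <hostname> <port>" % SCRIPT_NAME, file=sys.stderr)
        sys.exit(-1)
    # argv[0] is the script path; the port arrives as a string.
    return argv[1], int(argv[2])
```

Keeping the name in one constant means the usage text only has to be updated in one place if the file is renamed.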

tdas · 2016-06-28T23:11:43Z

LGTM. Merging this to master and 2.0. Thanks @jjthomas

## What changes were proposed in this pull request?

Network word count example for structured streaming

## How was this patch tested?

Run locally

Author: James Thomas <[email protected]>
Author: James Thomas <[email protected]>

Closes #13816 from jjthomas/master.

(cherry picked from commit 3554713)
Signed-off-by: Tathagata Das <[email protected]>

SparkQA · 2016-06-29T01:29:47Z

Test build #61421 has finished for PR 13816 at commit a8c3fec.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

structured streaming network word count examples

38b5497

tdas reviewed Jun 21, 2016

responded to TD's comments

18c83b1

tdas reviewed Jun 22, 2016

responded to more comments

46ac930

fixed python lint

80fee20

small fixes

f7aec9d

New example

c3b16a2

tdas reviewed Jun 28, 2016

addressed comments

fb491c6

jjthomas added 3 commits June 28, 2016 14:00

Merge remote-tracking branch 'upstream/master' into pr0

ca3fd5c

fixes

eb5744b

remove unneeded file

6ab4453

tdas reviewed Jun 28, 2016

small fix

a8c3fec

asfgit closed this in 3554713 Jun 28, 2016
