[SPARK-16727][SparkR] Fix expected test output of describe and summary functions #14357

junyangq · 2016-07-26T01:08:34Z

What changes were proposed in this pull request?

Fix expected test output of describe and summary functions. String columns are not summarized in the output.

How was this patch tested?

SparkR unit test.

shivaram · 2016-07-26T01:31:39Z

@junyangq Any idea how the tests were passing on Jenkins before this fix?

shivaram · 2016-07-26T01:38:37Z

I think this is related to 142df48

cc @dongjoon-hyun

SparkQA · 2016-07-26T01:41:49Z

Test build #62852 has finished for PR 14357 at commit f7d02ce.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2016-07-26T02:16:06Z

Hi, @shivaram.
Sorry, but what is the problem?

dongjoon-hyun · 2016-07-26T02:19:56Z

String columns work for min and max summarization.

junyangq · 2016-07-26T02:21:35Z

@shivaram I'm also curious... The issue arose when I ran the test locally.

dongjoon-hyun · 2016-07-26T02:23:37Z

> collect(describe(read.json('examples/src/main/resources/people.json')))
  summary                age    name
1   count                  2       3
2    mean               24.5    <NA>
3  stddev 7.7781745930520225    <NA>
4     min                 19    Andy
5     max                 30 Michael

dongjoon-hyun · 2016-07-26T02:24:18Z

In fact, count, min, and max are officially supported for string columns in Scala, Python, and R.

dongjoon-hyun · 2016-07-26T02:25:58Z

Hi, @junyangq .
Could you give me some pointers to help me understand the current context? :)

junyangq · 2016-07-26T02:26:18Z

@dongjoon-hyun That makes sense. That's why I'm confused about the output on my local machine...

> collect(describe(read.json('examples/src/main/resources/people.json')))
  summary                age
1   count                  2
2    mean               24.5
3  stddev 7.7781745930520225
4     min                 19
5     max                 30

dongjoon-hyun · 2016-07-26T02:31:49Z

I see. Which version is it?

dongjoon-hyun · 2016-07-26T02:33:51Z

I see. That seems to be the 2.0 branch.

dongjoon-hyun · 2016-07-26T02:34:43Z

Hi, @shivaram .
It's due to the branch difference.
The PR was merged into master only 17 days ago.

Thanks - merging in master.
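
A branch difference like this can be checked directly by asking git whether a given commit is reachable from the local checkout. The following is a minimal sketch, not part of Spark's tooling: `contains_commit` is a hypothetical helper name, and using 142df48 (the commit mentioned earlier in this thread) as the commit to look for is an assumption about which change affects the describe output.

```shell
#!/bin/sh
# Hypothetical helper: prints "yes" if commit $1 is an ancestor of $2
# (default: HEAD), otherwise "no". An unknown commit id also prints "no".
contains_commit() {
  # `git merge-base --is-ancestor A B` exits 0 when A is reachable from B.
  if git merge-base --is-ancestor "$1" "${2:-HEAD}" 2>/dev/null; then
    echo "yes"
  else
    echo "no"
  fi
}

# Example, run inside a Spark checkout (the commit id is taken from this thread):
#   contains_commit 142df48 HEAD
```

If the helper prints "no" on the branch under test, the local build does not include that change, which would be consistent with a missing string column in the describe output.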

junyangq · 2016-07-26T02:39:46Z

I think I ran the test on the master branch though... Let me double check :)

dongjoon-hyun · 2016-07-26T02:49:11Z

Thank you, @junyangq .

junyangq · 2016-07-26T03:03:21Z

I merged the most recent master branch, rebuilt and installed the package, but the test failed at the same place. @dongjoon-hyun

dongjoon-hyun · 2016-07-26T03:24:14Z

Oh, let me try this time. Thank you for double-checking, @junyangq .

dongjoon-hyun · 2016-07-26T03:32:54Z

@junyangq . Currently, you are on the most recent master build, right?
And SparkR does not show the name column in the output of the describe command.
I'm wondering what spark-shell returns on your system. Could you run the following command in spark-shell?

scala> spark.read.json("examples/src/main/resources/people.json").describe().show()

junyangq · 2016-07-26T03:42:55Z

Hmm... It doesn't show the name column either.

dongjoon-hyun · 2016-07-26T03:43:05Z

On the most recent master build, I ran the following and all tests passed.

$ git clean -fdx
$ ./build/sbt -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive-thriftserver -Phive -Psparkr package streaming-kafka-0-8-assembly/assembly streaming-flume-assembly/assembly streaming-kinesis-asl-assembly/assembly
$ R/install-dev.sh 
$ R/run-tests.sh 
...
DONE ===========================================================================
Tests passed.

dongjoon-hyun · 2016-07-26T03:43:26Z

I think you did something wrong. :)
Could you close this PR?

dongjoon-hyun · 2016-07-26T03:46:09Z

Actually, you can see the Jenkins log, too. There is no problem with the current R test suite.
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62858/consoleFull

FYI, I'm using JDK 1.8.0_102 and Jenkins is using JDK 1.7.x.

junyangq · 2016-07-26T03:46:23Z

Yeah, sure, but I'm just wondering whether the clean and the additional arguments are something we should normally do?

dongjoon-hyun · 2016-07-26T03:47:30Z

Never. I did that just to make sure. I don't do it frequently.
It is for you. :)

dongjoon-hyun · 2016-07-26T03:49:06Z

If you use -fdx, it removes IDE (like IntelliJ) settings, too.
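
To avoid losing IDE settings, the clean can be previewed first or run with exclusions. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the `.idea` and `*.iml` patterns assume an IntelliJ project layout, so adjust them for other IDEs.

```shell
#!/bin/sh
# Hypothetical helper: list what `git clean -dx` would delete.
# -n is a dry run, so nothing is actually removed.
preview_clean() {
  git clean -ndx
}

# Hypothetical helper: remove untracked and ignored files, but keep
# paths matching the -e patterns (assumed IntelliJ settings here).
clean_keep_ide() {
  git clean -fdx -e .idea -e '*.iml'
}
```

Running `preview_clean` before any forced clean shows the "Would remove ..." list, which makes it easy to spot settings directories that need an `-e` exclusion.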

junyangq · 2016-07-26T04:00:35Z

I see. Thank you for pointing that out :) I'll close the PR.

dongjoon-hyun · 2016-07-26T04:02:47Z

It's my pleasure. See you later around Apache Spark. :)

fix expected output in test block of describe and summary functions

f7d02ce

junyangq closed this Jul 26, 2016

shivaram mentioned this pull request Jul 28, 2016

[Spark-16579][SparkR] add install.spark function #14258

Closed
