[SPARK-23162][PySpark][ML] Add r2adj into Python API in LinearRegressionSummary #20842

Closed
wants to merge 3 commits into from


kevinyu98
Contributor

What changes were proposed in this pull request?

Adding r2adj in LinearRegressionSummary for Python API.
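
For context, adjusted R^2 penalizes R^2 for the number of predictors in the model. The following is a minimal sketch of the textbook formula, assuming a model fitted with an intercept. The function name `adjusted_r2` and its parameters are hypothetical and chosen for illustration; this is not Spark's implementation.

```python
def adjusted_r2(r2, num_instances, num_features):
    """Textbook adjusted R^2 for a model with an intercept (illustrative only)."""
    dof = num_instances - num_features - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more instances than features + 1")
    return 1.0 - (1.0 - r2) * (num_instances - 1) / dof


# Hypothetical inputs: R^2 of 0.9 from 11 rows and 2 features.
print(adjusted_r2(0.9, 11, 2))
```

For the same R^2, adding features lowers the adjusted value, which is why it is useful for comparing models with different numbers of predictors.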

How was this patch tested?

Added unit tests to exercise the API calls for the summary classes in tests.py.

@BryanCutler
Member

ok to test

@SparkQA
SparkQA commented Mar 21, 2018

Test build #88492 has finished for PR 20842 at commit 12d18b5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@BryanCutler BryanCutler left a comment

Thanks @kevinyu98! Looks mostly good, just some small doc formatting issues

@since("2.4.0")
def r2adj(self):
    """
    Returns Adjusted R^2^, the adjusted coefficient of determination.
Member

@BryanCutler BryanCutler Mar 22, 2018

R^2^ is scaladoc formatting. How about just R^2? Could you also fix it in def r2?

(R^2 would mean XOR in Python, so R**2 seems best to me unless someone knows a better format to use with Sphinx.)
Edit: let's just stick with R^2. I think it's fine here, and no one would think it's XOR.
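
To illustrate the XOR concern raised above, here is a small sketch of how Python itself reads the two spellings. The variable names are illustrative only.

```python
r = 3
xor_result = r ^ 2     # bitwise XOR: 0b11 ^ 0b10 == 0b01
power_result = r ** 2  # exponentiation
print(xor_result, power_result)  # prints "1 9"
```

Inside a docstring neither spelling is evaluated, so the choice only affects how the text reads to a human.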

Contributor Author

sure, I will change.

Returns Adjusted R^2^, the adjusted coefficient of determination.

.. seealso:: `Wikipedia coefficient of determination \
<https://en.wikipedia.org/wiki/Coefficient_of_determination>`
Member

Use the same link as the scaladoc: https://en.wikipedia.org/wiki/Coefficient_of_determination#Adjusted_R2

It also doesn't render properly; I believe you need a trailing _ (def r2 needs the same fix, if you don't mind).

It should be:

.. seealso:: `Wikipedia coefficient of determination \
<http://en.wikipedia.org/wiki/Coefficient_of_determination#Adjusted_R2>`_

If you are able to, building the docs to check these is a good idea
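
As a rough illustration of the trailing-underscore point, the sketch below uses a simplified regular expression to tell a complete reStructuredText hyperlink (`text <url>`_) from the same text without the underscore. The helper `has_rst_link` is hypothetical, the pattern does not cover every reST link form, and the backslash line continuation from the docstring is omitted for brevity. Building the docs, as suggested above, remains the reliable check.

```python
import re

# `text <url>`_ is a complete reST external hyperlink; without the trailing
# underscore the backquoted text is not turned into a link.
RST_LINK = re.compile(r"`[^`<]+<[^`>]+>`_")


def has_rst_link(doc):
    """Return True if doc contains a `text <url>`_ style hyperlink."""
    return RST_LINK.search(doc) is not None


broken = (".. seealso:: `Wikipedia coefficient of determination "
          "<http://en.wikipedia.org/wiki/Coefficient_of_determination#Adjusted_R2>`")
fixed = broken + "_"

print(has_rst_link(broken), has_rst_link(fixed))  # prints "False True"
```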

Contributor Author

ok, done.

@SparkQA
SparkQA commented Mar 24, 2018

Test build #88556 has finished for PR 20842 at commit ab0b04d.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tengpeng
Contributor

Looks good! Thanks!

@kevinyu98
Contributor Author

kevinyu98 commented Mar 26, 2018

@tengpeng Thanks a lot. I'm not sure whether the Python style test failure is related to my latest code change in regression.py.
I ran the Python style tests locally with ./dev/lint-python, but got lots of warnings from the Python docs folder:
Warning, treated as error:
/Users/qianyangyu/IdeaProjects/spark/python/docs/pyspark.rst:18: (WARNING/2) autodoc: failed to import module u'pyspark'; the following exception was raised:
Traceback (most recent call last):
  File "/usr/local/Cellar/sphinx-doc/1.6.3_1/libexec/lib/python2.7/site-packages/sphinx/ext/autodoc.py", line 657, in import_object
    __import__(self.modname)
ImportError: No module named pyspark
make: *** [html] Error 1
...
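
The ImportError in the log above says that the interpreter running Sphinx could not find `pyspark`. One possible cause (an assumption; the thread does not confirm it) is that the package is not on that interpreter's module search path. A minimal sketch for checking this, assuming Python 3 (the log shows Python 2.7, where `importlib.util.find_spec` is not available) and using the hypothetical helper name `is_importable`:

```python
import importlib.util


def is_importable(module_name):
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# If this prints False, Sphinx autodoc run by the same interpreter would be
# expected to hit a similar ImportError when importing pyspark.
print(is_importable("pyspark"))
```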

"""
Returns Adjusted R^2, the adjusted coefficient of determination.

.. seealso:: `Wikipedia coefficient of determination \
Member

It looks like the error occurred because the link name is the same as the one above. Maybe try "Wikipedia coefficient of determination, Adjusted R^2".
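
The diagnosis above matches how reStructuredText handles named hyperlinks: two `name <url>`_ links that share a name but point to different URLs conflict, and docutils reports a duplicate target name. The sketch below is a simplified, hypothetical checker (`conflicting_link_names` is not part of Sphinx or Spark) that flags such reuse in a block of text.

```python
import re
from collections import defaultdict

# Captures the name and URL of each named `name <url>`_ hyperlink.
# Anonymous links, which end in a double underscore, are skipped.
NAMED_LINK = re.compile(r"`([^`<]+?)\s*<([^`>]+)>`_(?!_)")


def conflicting_link_names(text):
    """Return link names that are reused with more than one distinct URL."""
    urls_by_name = defaultdict(set)
    for name, url in NAMED_LINK.findall(text):
        urls_by_name[name.strip().lower()].add(url.strip())
    return sorted(name for name, urls in urls_by_name.items() if len(urls) > 1)


docs = """
`Wikipedia coefficient of determination <http://en.wikipedia.org/wiki/Coefficient_of_determination>`_
`Wikipedia coefficient of determination <http://en.wikipedia.org/wiki/Coefficient_of_determination#Adjusted_R2>`_
"""
print(conflicting_link_names(docs))
```

Giving the second link a distinct name, as suggested above, makes the conflict disappear.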

Contributor Author

ok, thanks.

@SparkQA
SparkQA commented Mar 26, 2018

Test build #88597 has finished for PR 20842 at commit 596f58a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@BryanCutler BryanCutler left a comment

LGTM

@BryanCutler
Member

merged to master, thanks @kevinyu98 !

@asfgit asfgit closed this in 3e778f5 Mar 26, 2018
mshtelma pushed a commit to mshtelma/spark that referenced this pull request Apr 5, 2018
…ionSummary

Author: Kevin Yu <[email protected]>

Closes apache#20842 from kevinyu98/spark-23162.