-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-6080] [PySpark] correct LogisticRegressionWithLBFGS regType parameter for pyspark #4831
Conversation
Can one of the admins verify this patch? |
add to whitelist |
ok to test |
Test build #28150 has started for PR 4831 at commit
|
@@ -207,7 +207,7 @@ def train(cls, data, iterations=100, initialWeights=None, regParam=0.01, regType | |||
""" | |||
def train(rdd, i): | |||
return callMLlibFunc("trainLogisticRegressionModelWithLBFGS", rdd, int(iterations), i, | |||
float(regParam), str(regType), bool(intercept), int(corrections), | |||
float(regParam), regType, bool(intercept), int(corrections), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we use str(regType) if regType else None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think directly use regType is enough, because py4j can translate "l1","l2",None in Python to "l1","l2",null in Java/Scala smoothly.
I found the peer functions LogisticRegressionWithSGD and SVMWithSGD also directly use regType and it can work well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference is the error message. If we use str(...)
, the Java function call would be successful and inside the Java function, we say the input regType is not recognized.
If we don't use str(..)
, if users input something like 1
, 2
. Py4j will throw an error saying there is no Java function matching the input signature, which usually confuses users.
Anyway, this is a minor issue for regType
.
Test build #28150 has finished for PR 4831 at commit
|
Test PASSed. |
LGTM. Merged into master and branch-1.3. Thanks! |
…rameter for pyspark Currently LogisticRegressionWithLBFGS in python/pyspark/mllib/classification.py will invoke callMLlibFunc with a wrong "regType" parameter. It was assigned to "str(regType)" which translate None(Python) to "None"(Java/Scala). The right way should be translate None(Python) to null(Java/Scala) just as what we did at LogisticRegressionWithSGD. Author: Yanbo Liang <[email protected]> Closes #4831 from yanboliang/pyspark_classification and squashes the following commits: 12db65a [Yanbo Liang] correct LogisticRegressionWithLBFGS regType parameter for pyspark (cherry picked from commit af2effd) Signed-off-by: Xiangrui Meng <[email protected]>
Currently LogisticRegressionWithLBFGS in python/pyspark/mllib/classification.py will invoke callMLlibFunc with a wrong "regType" parameter.
It was assigned to "str(regType)" which translate None(Python) to "None"(Java/Scala). The right way should be translate None(Python) to null(Java/Scala) just as what we did at LogisticRegressionWithSGD.