-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-6845] [MLlib] [PySpark] Add isTranposed flag to DenseMatrix #5455
Conversation
ping @mengxr Would be great if you could have a look. Also, there is one existing test that fails due to serialization (which I have commented out) |
Test build #30030 has finished for PR 5455 at commit
|
Test build #30038 has finished for PR 5455 at commit
|
The test failure seems unrelated. |
Test build #30168 has finished for PR 5455 at commit
|
@MechCoder I think you need to update the matrix SerDe in PythonMLlibAPI. Could you double check? |
@mengxr I pushed a non-working update in the last commit just to get feedback.
|
Test build #30262 has finished for PR 5455 at commit
|
@mengxr It would be really helpful if you could guide me on my two questions? |
Test build #30457 has finished for PR 5455 at commit
|
@MechCoder I'm not an expert here, but I'll ask around. I'm wondering if the failure in PythonMLLibAPISuite indicates that the issue is with the tuple (or tuple code), rather than with how isTransposed is stored. I'm pretty sure storing the Boolean as an Int is fine. |
@@ -985,6 +985,7 @@ private[spark] object SerDe extends Serializable { | |||
val m: DenseMatrix = obj.asInstanceOf[DenseMatrix] | |||
val bytes = new Array[Byte](8 * m.values.size) | |||
val order = ByteOrder.nativeOrder() | |||
val isTransposed = if (m.isTransposed) 1 else 0 | |||
ByteBuffer.wrap(bytes).order(order).asDoubleBuffer().put(m.values) | |||
|
|||
out.write(Opcodes.BININT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this line, put out.write(Opcodes.MARK)
, which marks the start of the tuple.
@mengxr fixed ! |
Test build #30665 has finished for PR 5455 at commit
|
} | ||
val bytes = getBytes(args(2)) | ||
val n = bytes.length / 8 | ||
val values = new Array[Double](n) | ||
val order = ByteOrder.nativeOrder() | ||
val isTransposed = if (args(3).asInstanceOf[Int] == 1) true else false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this line after ByteBuffer.wrap...
and simplify it to val isTransposed = args(3).asInstanceOf[Int] == 1
.
LGTM except one minor inline comment. |
done |
Test build #30682 has finished for PR 5455 at commit
|
Merged into master. Thanks! |
@mengxr Thanks! I think we need to support serialization and deserialization for SparseMatrices also? |
Since sparse matrices now support a isTransposed flag for row major data, DenseMatrices should do the same. Author: MechCoder <[email protected]> Closes apache#5455 from MechCoder/spark-6845 and squashes the following commits: 525c370 [MechCoder] minor 004a37f [MechCoder] Cast boolean to int 151f3b6 [MechCoder] [WIP] Add isTransposed to pickle DenseMatrix cc0b90a [MechCoder] [SPARK-6845] Add isTranposed flag to DenseMatrix
Since sparse matrices now support a isTransposed flag for row major data, DenseMatrices should do the same.