Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-1535] ALS: Avoid the garbage-creating ctor of DoubleMatrix #442

Closed
wants to merge 2 commits into from

Conversation

tmyklebu
Copy link
Contributor

new DoubleMatrix(double[]) creates a garbage double[] of the same length as its argument and immediately throws it away. This pull request avoids that constructor in the ALS code.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14226/

@tmyklebu
Copy link
Contributor Author

This appears to be a PySpark error unrelated to my change.

@srowen
Copy link
Member

srowen commented Apr 18, 2014

+1 BTW I submitted a fix for this in jblas, although it is not yet released.

* Wrap a double array in a DoubleMatrix without creating garbage.
*/
private def wrapDoubleArray(v: Array[Double]): DoubleMatrix = {
new DoubleMatrix(v.length, 1, v:_*)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put a space after :

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(PS might also note in the comment that it's safe to go back to the old DoubleMatrix constructor at jblas 1.2.4 or later.)

@mengxr
Copy link
Contributor

mengxr commented Apr 18, 2014

@tmyklebu Thanks for fixing this! Could you create a JIRA?

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14238/

@tmyklebu
Copy link
Contributor Author

[SPARK-1535] describes the issue and the form of the fix.

@mengxr
Copy link
Contributor

mengxr commented Apr 18, 2014

@tmyklebu Could you add the JIRA number to the title of this PR? It makes life easier for people who merge the code.

@tmyklebu tmyklebu changed the title ALS: Avoid the garbage-creating ctor of DoubleMatrix [SPARK-1535] ALS: Avoid the garbage-creating ctor of DoubleMatrix Apr 18, 2014
@mengxr
Copy link
Contributor

mengxr commented Apr 19, 2014

LGTM. Thanks for fixing it! @mateiz Could you help merge this into both master and branch-1.0?

@tmyklebu
Copy link
Contributor Author

@srowen: Does Hotspot actually generate code for the allocation and the dead store with the bad ctor? I haven't picked through it yet.

@srowen
Copy link
Member

srowen commented Apr 19, 2014

I doubt Hotspot can remove this allocation, but am not sure. It would have to do a couple things -- inline the call to 2 other constructors, realize there's no threading issue in the constructor, realize the dead store. I think it's worth avoiding manually here.

@mateiz
Copy link
Contributor

mateiz commented Apr 19, 2014

Merged into master and branch 1.0, thanks!

asfgit pushed a commit that referenced this pull request Apr 19, 2014
`new DoubleMatrix(double[])` creates a garbage `double[]` of the same length as its argument and immediately throws it away.  This pull request avoids that constructor in the ALS code.

Author: Tor Myklebust <[email protected]>

Closes #442 from tmyklebu/foo2 and squashes the following commits:

2784fc5 [Tor Myklebust] Mention that this is probably fixed as of jblas 1.2.4; repunctuate.
a09904f [Tor Myklebust] Helper function for wrapping Array[Double]'s with DoubleMatrix's.

(cherry picked from commit 25fc318)
Signed-off-by: Matei Zaharia <[email protected]>
@asfgit asfgit closed this in 25fc318 Apr 19, 2014
pwendell added a commit to pwendell/spark that referenced this pull request May 12, 2014
Workers should use working directory as spark home if it's not specified

If users don't set SPARK_HOME in their environment file when launching an application, the standalone cluster should default to the spark home of the worker.
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
`new DoubleMatrix(double[])` creates a garbage `double[]` of the same length as its argument and immediately throws it away.  This pull request avoids that constructor in the ALS code.

Author: Tor Myklebust <[email protected]>

Closes apache#442 from tmyklebu/foo2 and squashes the following commits:

2784fc5 [Tor Myklebust] Mention that this is probably fixed as of jblas 1.2.4; repunctuate.
a09904f [Tor Myklebust] Helper function for wrapping Array[Double]'s with DoubleMatrix's.
andrewor14 pushed a commit to andrewor14/spark that referenced this pull request Jan 8, 2015
Workers should use working directory as spark home if it's not specified

If users don't set SPARK_HOME in their environment file when launching an application, the standalone cluster should default to the spark home of the worker.
(cherry picked from commit 59f475c)

Signed-off-by: Patrick Wendell <[email protected]>
markhamstra pushed a commit to markhamstra/spark that referenced this pull request Nov 7, 2017
Required for ./dev/check-license to pass
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
Fix error of running trovestack in openlab job
arjunshroff pushed a commit to arjunshroff/spark that referenced this pull request Nov 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants