[SYSTEMDS-3636] Improved ultra-sparse tsmm left w/ sparse output #1955

christinadionysio
Contributor

This patch adds support for left-transposed ultra-sparse tsmm (t(X) %*% X).
Similar to the implementation of the right-transposed ultra-sparse tsmm,
binary search is used to populate the upper triangular part of a sparse output matrix.
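
For readers unfamiliar with the approach, the idea described above can be sketched as follows. This is a minimal illustration, not the code in this patch: the class `UltraSparseTsmmSketch`, the helper `SparseRow`, and the method `tsmmLeftUpper` are hypothetical names, and the input is assumed to be in CSR form with column indexes sorted within each row. Each input row contributes the products of its non-zero pairs, and a binary search locates the slot in the sparse output row.

```java
import java.util.Arrays;

// Hypothetical sketch, not the SystemDS implementation.
class UltraSparseTsmmSketch {
    /** Minimal growable sparse row: sorted column indexes with parallel values. */
    static final class SparseRow {
        int[] idx = new int[2];
        double[] val = new double[2];
        int size = 0;

        /** Adds v at column c; binary search finds the existing slot or the insert position. */
        void add(int c, double v) {
            int pos = Arrays.binarySearch(idx, 0, size, c);
            if (pos >= 0) { // column already present: accumulate
                val[pos] += v;
                return;
            }
            int ins = -pos - 1; // insertion point that keeps idx sorted
            if (size == idx.length) { // grow both arrays when full
                idx = Arrays.copyOf(idx, 2 * size);
                val = Arrays.copyOf(val, 2 * size);
            }
            System.arraycopy(idx, ins, idx, ins + 1, size - ins);
            System.arraycopy(val, ins, val, ins + 1, size - ins);
            idx[ins] = c;
            val[ins] = v;
            size++;
        }

        /** Returns the value at column c, or 0 if the cell is not stored. */
        double get(int c) {
            int pos = Arrays.binarySearch(idx, 0, size, c);
            return pos >= 0 ? val[pos] : 0.0;
        }
    }

    /**
     * Upper triangle of t(X) %*% X for X given in CSR form.
     * Assumes colIdx is sorted within each row, so b >= a implies colIdx[b] >= colIdx[a].
     * Output rows that receive no value stay null.
     */
    static SparseRow[] tsmmLeftUpper(int[] rowPtr, int[] colIdx, double[] values, int n) {
        SparseRow[] out = new SparseRow[n];
        for (int r = 0; r < rowPtr.length - 1; r++) {
            for (int a = rowPtr[r]; a < rowPtr[r + 1]; a++) {
                int i = colIdx[a];
                if (out[i] == null)
                    out[i] = new SparseRow();
                // pair the a-th non-zero with itself and all later non-zeros of the same row
                for (int b = a; b < rowPtr[r + 1]; b++)
                    out[i].add(colIdx[b], values[a] * values[b]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // X = [[1, 0, 2], [0, 3, 0], [4, 5, 0]]
        int[] rowPtr = {0, 2, 3, 5};
        int[] colIdx = {0, 2, 1, 0, 1};
        double[] values = {1, 2, 3, 4, 5};
        SparseRow[] c = tsmmLeftUpper(rowPtr, colIdx, values, 3);
        for (int i = 0; i < c.length; i++)
            for (int k = 0; c[i] != null && k < c[i].size; k++)
                System.out.println("C[" + i + "," + c[i].idx[k] + "] = " + c[i].val[k]);
    }
}
```

Only cells that actually receive a product are stored, so the output never needs a dense n x n allocation. For the 3 x 3 example in `main`, the stored upper triangle is C[0,0] = 17, C[0,1] = 20, C[0,2] = 2, C[1,1] = 34, and C[2,2] = 4.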

@Baunsgaard
Contributor

how much faster is it?

@christinadionysio
Contributor Author

how much faster is it?

@Baunsgaard So far I have only tested the t(X) %*% X operation on ultra-sparse matrices locally. On the germany_osm dataset, the runtime is ~12s with the optimisation. (Running it locally without the optimisation is infeasible: a dense output matrix is allocated, which fails with a Java heap space error.)
When I run the operation on an ultra-sparse matrix with 30000 rows and columns, I get a speedup of 17x or 30x, depending on which of the two methods is used.
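
To put the dense allocation in perspective, here is a small back-of-the-envelope sketch. The class `DenseOutputFootprint` and the method `denseBytes` are hypothetical names used only for illustration, and the estimate assumes 8 bytes per double cell with object headers ignored. It uses the 30000-column test matrix mentioned above; the dimensions of germany_osm are not stated in this thread, so they are not used here.

```java
// Hypothetical helper for illustration only; not part of SystemDS.
class DenseOutputFootprint {
    /** Bytes needed for an n x n dense matrix of doubles (8 bytes per cell, headers ignored). */
    static long denseBytes(long n) {
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(Math.multiplyExact(n, n), 8L);
    }

    public static void main(String[] args) {
        long n = 30_000L; // column count of the synthetic test matrix
        System.out.println("dense output bytes for n=" + n + ": " + denseBytes(n));
    }
}
```

For n = 30000 this comes to 7,200,000,000 bytes (about 7.2 GB) for the output alone, and the footprint grows quadratically with the number of columns, whereas a sparse output stores only the non-zero cells.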

Contributor

@Baunsgaard Baunsgaard left a comment

A few comments for improvement; I hope you can decipher them.

@christinadionysio
Contributor Author

After the changes, the runtime of the matrixMultTransposeSelfUltraSparse method improved by 2.5x.

@j143 j143 added this to the systemds-3.2.0 milestone Dec 17, 2023
@Baunsgaard
Contributor

LGTM

@Baunsgaard
Contributor

Thanks for the PR, it is now merged!

Projects
Status: Done
3 participants