[SPARK-3418] Sparse Matrix support (CCS) and additional native BLAS operations added #2294

Closed
wants to merge 14 commits into from

brkyvz
Contributor

@brkyvz brkyvz commented Sep 5, 2014

This PR adds local SparseMatrix support in Compressed Column Storage (CCS) format, along with Level-2 and Level-3 BLAS operations such as dgemv and dgemm, respectively.

BLAS doesn't support sparse matrix operations, so implementations of SparseMatrix-DenseMatrix and SparseMatrix-DenseVector multiplication have been added. I will post performance comparisons in the comments shortly.
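
For readers unfamiliar with CCS, the following is a minimal sketch of the storage layout and of a sparse matrix times dense vector product. It is not the code in this PR: the names `CcsSketch`, `CcsMatrix`, `colPtrs`, `rowIndices`, and `multiply` are assumptions chosen for illustration.

```scala
object CcsSketch {
  // Hypothetical minimal CCS matrix; illustrative only, not the PR's SparseMatrix.
  final class CcsMatrix(
      val numRows: Int,
      val numCols: Int,
      val colPtrs: Array[Int],    // length numCols + 1; column j occupies [colPtrs(j), colPtrs(j + 1))
      val rowIndices: Array[Int], // row index of each stored value
      val values: Array[Double]) {

    require(colPtrs.length == numCols + 1, "colPtrs must have numCols + 1 entries")
    require(rowIndices.length == values.length, "one row index per stored value")

    /** Computes y = A * x by scattering each column's stored values into y. */
    def multiply(x: Array[Double]): Array[Double] = {
      require(x.length == numCols, "dimension mismatch")
      val y = new Array[Double](numRows)
      var j = 0
      while (j < numCols) {
        var k = colPtrs(j)
        val end = colPtrs(j + 1)
        while (k < end) {
          y(rowIndices(k)) += values(k) * x(j)
          k += 1
        }
        j += 1
      }
      y
    }
  }

  // 3 x 2 matrix:
  //   1.0 0.0
  //   0.0 3.0
  //   2.0 0.0
  val a = new CcsMatrix(3, 2, Array(0, 2, 3), Array(0, 2, 1), Array(1.0, 2.0, 3.0))
  val y: Array[Double] = a.multiply(Array(1.0, 2.0)) // Array(1.0, 6.0, 2.0)
}
```

Only the stored values are visited, so the cost is proportional to the number of nonzeros rather than to numRows * numCols.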

@SparkQA

SparkQA commented Sep 5, 2014

QA tests have started for PR 2294 at commit 4362ff1.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 5, 2014

QA tests have finished for PR 2294 at commit 4362ff1.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SparseMatrix(val numRows: Int,
    • case class AddJar(path: String) extends LeafNode with Command

@SparkQA

SparkQA commented Sep 5, 2014

QA tests have started for PR 2294 at commit 8dcb763.

  • This patch merges cleanly.

@brkyvz
Contributor Author

brkyvz commented Sep 6, 2014

The following tests were run on Mac OS X 10.9.3 with a 2.8 GHz Intel Core i7 and 8 GB of 1600 MHz DDR3 memory.

Hyper-threading was disabled and Turbo Boost was turned off. "Effective numRows" and "Effective numCols" denote the number of rows or columns actually used in the computation. Because transposed multiplications are also benchmarked, a matrix A with dimensions m x n has m rows, but when its transpose is used its effective number of rows is n.
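
As a small illustration of the "effective" dimensions described above, here is a sketch under the assumption that a transposed operand simply swaps its row and column counts. The names `EffectiveDims` and `effective`, and the sizes used, are hypothetical and do not come from the benchmark code.

```scala
object EffectiveDims {
  /** Returns (effective numRows, effective numCols) of op(A), where op is identity or transpose. */
  def effective(numRows: Int, numCols: Int, transposed: Boolean): (Int, Int) =
    if (transposed) (numCols, numRows) else (numRows, numCols)

  // A is m x n with m = 100 and n = 25000 (sizes chosen only for illustration).
  val plain: (Int, Int) = effective(100, 25000, transposed = false)  // (100, 25000)
  val flipped: (Int, Int) = effective(100, 25000, transposed = true) // (25000, 100)
}
```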

Benchmark plots (images not reproduced here): vary_cols_rows100, vary_cols_rows500, vary_cols_t_2500rows, vary_colsb_rows100, vary_columns, vary_rowsa_cols25000, vary_sparsity

@SparkQA

SparkQA commented Sep 6, 2014

QA tests have finished for PR 2294 at commit 8dcb763.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SparseMatrix(val numRows: Int,

@SparkQA

SparkQA commented Sep 6, 2014

QA tests have started for PR 2294 at commit 41b2da3.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 6, 2014

QA tests have finished for PR 2294 at commit 41b2da3.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SparseMatrix(val numRows: Int,

@SparkQA

SparkQA commented Sep 6, 2014

QA tests have started for PR 2294 at commit 56d7c85.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 6, 2014

QA tests have finished for PR 2294 at commit 56d7c85.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SparseMatrix(

@SparkQA

SparkQA commented Sep 6, 2014

QA tests have started for PR 2294 at commit 848406c.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 6, 2014

QA tests have finished for PR 2294 at commit 848406c.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SparseMatrix(

@mengxr
Contributor

mengxr commented Sep 8, 2014

test this please

@SparkQA

SparkQA commented Sep 8, 2014

QA tests have started for PR 2294 at commit 848406c.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 8, 2014

QA tests have finished for PR 2294 at commit 848406c.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SparseMatrix(

* @param C the resulting matrix C. Size of m x n.
*/
def gemm(
transA: String,
Contributor

Change type to Boolean?
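
To make the suggestion concrete, here is a minimal sketch of what a Boolean transpose flag could look like on a naive dense multiply. It is not the PR's `gemm`: the object name `GemmSketch`, the parameter list, and the column-major layout are assumptions for illustration only.

```scala
object GemmSketch {
  /**
   * Naive C := alpha * op(A) * B + beta * C on column-major arrays.
   * op(A) is the transpose of A when transA is true, otherwise A itself.
   * Illustrative only; not the gemm added in this PR.
   *
   * @param m number of rows of op(A) and C
   * @param n number of columns of B and C
   * @param k number of columns of op(A) and rows of B
   */
  def gemm(transA: Boolean, m: Int, n: Int, k: Int, alpha: Double,
           a: Array[Double], b: Array[Double], beta: Double, c: Array[Double]): Unit = {
    require(a.length == m * k && b.length == k * n && c.length == m * n, "dimension mismatch")
    // A is stored as k x m when transposed, and as m x k otherwise.
    def opA(i: Int, l: Int): Double = if (transA) a(l + i * k) else a(i + l * m)
    for (j <- 0 until n; i <- 0 until m) {
      var sum = 0.0
      var l = 0
      while (l < k) {
        sum += opA(i, l) * b(l + j * k)
        l += 1
      }
      c(i + j * m) = alpha * sum + beta * c(i + j * m)
    }
  }

  // A (column-major) = [[1, 3], [2, 4]], B = identity.
  val a = Array(1.0, 2.0, 3.0, 4.0)
  val identity = Array(1.0, 0.0, 0.0, 1.0)

  val plain = new Array[Double](4)
  gemm(transA = false, 2, 2, 2, 1.0, a, identity, 0.0, plain)     // plain holds A

  val transposed = new Array[Double](4)
  gemm(transA = true, 2, 2, 2, 1.0, a, identity, 0.0, transposed) // transposed holds A^T
}
```

With a Boolean, an invalid flag value cannot be passed at all, whereas a String such as "N" or "T" has to be validated at run time.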

@SparkQA

SparkQA commented Sep 18, 2014

QA tests have finished for PR 2294 at commit d162684.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • sealed trait Matrix extends Serializable
    • class SparseMatrix(
    • sealed trait Vector extends Serializable
    • Seq( // Ignore new functions added to traitMatrix. They are sealed traits now.

@SparkQA

SparkQA commented Sep 18, 2014

QA tests have started for PR 2294 at commit 272feb9.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 18, 2014

QA tests have finished for PR 2294 at commit 272feb9.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • sealed trait Matrix extends Serializable
    • class SparseMatrix(
    • sealed trait Vector extends Serializable

@brkyvz
Contributor Author

brkyvz commented Sep 18, 2014

@ScrapCodes, I have some MiMa incompatibility issues here. I've added a few methods to the trait Matrix in MLlib, which causes a MissingMethodProblem. I sealed the trait and added the exclude statements to MimaExcludes, but the tests still fail, even though dev/mima passes on my local machine.
Do you have any idea what might be going on? Commits d162684 and 272feb9 were meant to address this issue.

Can you please take a look?

@ScrapCodes
Member

Can you rebase onto the tip of master and add the excludes to the version 1.2 section instead of the 1.1 section?
https://github.com/apache/spark/blob/master/project/MimaExcludes.scala#L37

Prashant Sharma

@SparkQA

SparkQA commented Sep 18, 2014

QA tests have started for PR 2294 at commit 88814ed.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 19, 2014

QA tests have finished for PR 2294 at commit 88814ed.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • sealed trait Matrix extends Serializable
    • class SparseMatrix(
    • sealed trait Vector extends Serializable

@brkyvz
Contributor Author

brkyvz commented Sep 19, 2014

@ScrapCodes THANKS A LOT! That fixed it! I didn't realize I hadn't updated my local repo in such a long time.

@mengxr
Contributor

mengxr commented Sep 19, 2014

LGTM. I'm merging this into master. Thanks! (We might need to make slight changes to some methods before the 1.2 release, but let's not block the multi-model training PR for now.)

@asfgit asfgit closed this in e76ef5c Sep 19, 2014
@brkyvz brkyvz deleted the SPARK-3418 branch January 30, 2015 20:47