michaelochurch/terareg
Matrix Operations

There are three operations we need to perform on matrices:

  1. pseudoinverse (pinv, ^-1)
  2. transposition (tr, ^T)
  3. multiplication (mul, X)

To handle larger sets of data, these operations will need to be performed on subsets of the data, stored in scratch space, and then folded into the final result matrix. The processing of these subsets can be scheduled with weights on cache locality and freshness, to minimize cache churn across cores.
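The fold step above can be sketched with plain nested lists standing in for the real packed matrix representation (an assumption; `Matrix`, `mul`, `addM`, `blocks`, and `gramBlocked` are illustrative names, not the repo's API). X is split into row blocks, each block's Gram matrix Xi^T Xi is computed in scratch space, and the partial results are summed into X^T X:

```haskell
import Data.List (foldl1', transpose)

type Matrix = [[Double]]

-- naive dense multiply: rows of a against columns of b
mul :: Matrix -> Matrix -> Matrix
mul a b = [[sum (zipWith (*) row col) | col <- transpose b] | row <- a]

-- elementwise sum of two equally-sized matrices
addM :: Matrix -> Matrix -> Matrix
addM = zipWith (zipWith (+))

-- split a matrix into row blocks of at most n rows
blocks :: Int -> Matrix -> [Matrix]
blocks _ [] = []
blocks n xs = take n xs : blocks n (drop n xs)

-- X^T X computed as the sum over row blocks of Xi^T Xi
gramBlocked :: Int -> Matrix -> Matrix
gramBlocked n x = foldl1' addM [transpose xi `mul` xi | xi <- blocks n x]
```

This works because X^T X decomposes additively over row blocks, so each block's contribution can live in scratch space and be folded in as soon as it is ready, in any order the scheduler prefers.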

Pipeline and Fusion

The basic linear regression is:

ols x y = (pinv ((tr x) `mul` x)) `mul` ((tr x) `mul` y)
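For a small in-memory sanity check, the same formula can be written against the hmatrix package (an assumption for illustration; the repo defines its own `pinv`, `tr`, and `mul`). `Numeric.LinearAlgebra` provides `pinv`, `tr`, dense multiply (`<>`), and matrix-vector apply (`#>`):

```haskell
import Numeric.LinearAlgebra

-- beta = (X^T X)^-1 X^T y, with the pseudoinverse in place of inv
ols :: Matrix Double -> Vector Double -> Vector Double
ols x y = pinv (tr x <> x) #> (tr x #> y)
```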

Breaking this down into operations:

i0 = x^T
i1 = i0 X x
pi = i1^-1
i3 = i0 X y
i4 = pi X i3
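The five steps map directly onto a let-bound version in which i0 = x^T is computed once and shared by both branches. A sketch, again assuming hmatrix primitives rather than the repo's own:

```haskell
import Numeric.LinearAlgebra

olsStaged :: Matrix Double -> Vector Double -> Vector Double
olsStaged x y =
  let i0 = tr x           -- i0 = x^T
      i1 = i0 <> x        -- i1 = i0 X x
      p  = pinv i1        -- pi = i1^-1
      i3 = i0 #> y        -- i3 = i0 X y
  in p #> i3              -- i4 = pi X i3
```

Sharing i0 is the point of the fusion: the transposition is materialized once and feeds both the i1 and i3 branches.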

The data dependency graph:

i0 <- x
i1 <- x, i0
pi <- i1
i3 <- i0, y
i4 <- pi, i3

And the data pipeline:

   /----------\
   |          |
B1 x -> i0 -> i1 -> pi ---\
         |             B3 |--> i4
B2       |     y -> i3 ---/
         |          |
         `----------/ 
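The barrier structure in the diagram can be sketched with `par`/`pseq` from the parallel package (an assumption; the real scheduler would be block- and cache-aware, and `par` only sparks evaluation to weak head normal form, so a production version would force the branches more deeply):

```haskell
import Control.Parallel (par, pseq)
import Numeric.LinearAlgebra

olsPar :: Matrix Double -> Vector Double -> Vector Double
olsPar x y =
  let i0 = tr x
      p  = pinv (i0 <> x)          -- branch B1: i1 then pi
      i3 = i0 #> y                 -- branch B2: independent of B1 after i0
  in p `par` (i3 `pseq` (p #> i3)) -- B3: join for the final multiply
```

The two branches are independent once i0 is available, which is exactly what the data dependency graph shows: pi depends only on i1, and i3 only on i0 and y.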

References

  1. AVX Matrix transposition
  2. Survey of parallel matrix multiplication
  3. Communication-Optimal Parallel Algorithm for Strassen’s Matrix Multiplication

About

Terabyte linear regression in Haskell.
