-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend BLAS interface #3173
Extend BLAS interface #3173
Conversation
…ub.com/GEOS-DEV/GEOS into feature/dkachuma/extend-blas-interface
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #3173 +/- ##
===========================================
+ Coverage 55.77% 55.89% +0.11%
===========================================
Files 1035 1037 +2
Lines 87914 88148 +234
===========================================
+ Hits 49035 49271 +236
+ Misses 38879 38877 -2 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just some minor comments... Also: should we put a warning somewhere that this is not supported on GPUs (apparently)?
For the row-major version, I couldn't figure out how to call BLAS/LAPACK. So I was forced to allocate space anyway for the transposition of the solution matrix
It seems that lapacke follows this same approach and there's nothing much we can do, see:
https://github.com/Reference-LAPACK/lapack/blob/8b468db25c0c5a25d8e0020c7e2134e14cfd33d0/LAPACKE/src/lapacke_dgetrf_work.c
https://github.com/Reference-LAPACK/lapack/blob/8b468db25c0c5a25d8e0020c7e2134e14cfd33d0/LAPACKE/utils/lapacke_dge_trans.c
// This might require an extra allocation | ||
real64 * solutionData = X.dataIfContiguous(); | ||
array2d< real64, MatrixLayout::COL_MAJOR_PERM > X0; | ||
if constexpr ( USD == MatrixLayout::ROW_MAJOR ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to create a helper function for this operations (and also save some lines later when we transpose the data back)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea! Added some helpers.
I actually looked at lapacke to get some inspiration and was gravely disappointed to see that they just copy. I also considered in-place transposes but they are so complicated and quite unreadable. |
Yes, avoiding the copy would be great... Hmm, if this is to be called inside a FEM kernel or, generally speaking, called multiple times, I guess we could allocate the extra work space only once outside the |
Wasn't sure why kind of threaded code this is called from. Didn't want to introduce something that might complicate that. But I'm sure it can be done. Anyhow the current uses of this are only 1d so we shouldn't be hitting this issue yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I am afraid that doing that copy is the only solution.
Extends the BLAS/LAPACK interface to include linear solves where the solution and rhs are matrices. The addition is a function to solve
$$AX=B$$ $A$ is an $n{\times}n$ matrix and $X$ and $B$ are $n{\times}m$ matrices.
where
This also adds an "in-place" version of$A$ and the rhs $B$ only. In using this, the caller is allowing LAPACK to destroy both the $A$ and $B$ . On exit $A$ is replaced by an LU factorisation of $A$ and $B$ is replaced by the solution (in true LAPACK style). This has the advantage of not requiring extra allocations of memory within the call.
solveLinearSystem
. This takes the matrixFor the row-major version, I couldn't figure out how to call BLAS/LAPACK. So I was forced to allocate space anyway for the transposition of the solution matrix$X$ . Two exceptions to this are 1) $m=1$ the vector case in which case row-major vs col-major is meaningless and 2) $m=n$ in which case $X$ can be transposed in place.