Optimize numpy operations in MKDADensity Estimator and (M)KDAKernel #685
Conversation
Thanks for opening this pull request! We have detected this is the first time you have contributed to NiMARE. Please check out our contributing guidelines.
Of course, if you want to opt out this time there is no problem at all with adding your name later. You will always be welcome to add it in the future whenever you feel it should be listed.
From this issue: numpy/numpy#11136. It seems to be significantly faster. All in all, for a dataset with ~100 studies, it went from ~0.53 s to ~0.13 s, possibly shaving ~0.4 s off of ~1.5 s for the
Replaced with the scikit-image version: https://github.com/scikit-image/scikit-image/blob/64103e6c90917fcfdef8343fd7dd4df32c910446/skimage/util/unique.py#L47. Down to 0.962869 s from ~1.6 s in the last comparison.
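For context, a minimal sketch of the row-deduplication trick discussed here, modeled on the skimage helper linked above (the function name and docstring are illustrative, not NiMARE's actual implementation):

```python
import numpy as np

def unique_rows(ar):
    """Find unique rows by viewing each row as one opaque item.

    Collapsing each row into a single np.void element lets np.unique
    run in 1-D, which is much faster than its lexicographic handling
    of 2-D input (see numpy/numpy#11136).
    """
    ar = np.ascontiguousarray(ar)
    row_view = ar.view(np.dtype((np.void, ar.dtype.itemsize * ar.shape[1])))
    _, unique_idx = np.unique(row_view, return_index=True)
    # Sort the first-occurrence indices to preserve order of appearance.
    return ar[np.sort(unique_idx)]

coords = np.array([[1, 2, 3], [4, 5, 6], [1, 2, 3]])
deduped = unique_rows(coords)  # duplicate row dropped
```

This avoids the full lexicographic sort of rows that `np.unique(ar, axis=0)` performs, at the cost of requiring a contiguous array.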
Profiling results using the "neurosynth_laird_studies.json" dataset (1117 coordinates in 17 studies).
It's looking like a 2.5x+ speed up, just optimizing
Update, after a few more tweaks: ~2.94x speed up
Codecov Report
@@ Coverage Diff @@
## main #685 +/- ##
==========================================
+ Coverage 85.28% 85.31% +0.02%
==========================================
Files 41 41
Lines 4507 4521 +14
==========================================
+ Hits 3844 3857 +13
- Misses 663 664 +1
Continue to review full report at Codecov.
Ugh,
# return ma_values.T.dot(self.weight_vec_).ravel()
weighted_ma_vals = ma_values * self.weight_vec_
return weighted_ma_vals.sum(0)
return ma_values.T.dot(self.weight_vec_).ravel()
This seems like a potential edge case that is not worth the slowdown, so I went with the faster version.
Anyone with a Mac want to verify whether this is still an issue?
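For reference, the two formulations in the diff above compute the same weighted sum over experiments; the shapes here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical shapes: 5 experiments x 100 voxels, one weight per experiment.
ma_values = rng.random((5, 100))
weight_vec = rng.random((5, 1))

# Elementwise multiply then sum over experiments (the commented-out workaround).
workaround = (ma_values * weight_vec).sum(0)
# Single matrix product (the restored fast path).
fast = ma_values.T.dot(weight_vec).ravel()
```

Both produce a length-100 vector of weighted sums; the dot-product form lets BLAS do the reduction in one call instead of materializing the intermediate weighted array.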
I can test it out tomorrow.
tests.test_meta_mkda.test_MKDADensity_montecarlo_null
(after changing n_cores to 2 and n_iters to 10) went from 6.52 s to 5.68 s after making the change on my MacBook Pro (macOS 12.3.1). Maybe the architecture Tal was referring to is the M1 chip? Another possibility is that the problem was unique to multiprocessing, and when I switched to joblib it resolved the issue without anyone noticing.
In case anyone wants to dig further into it, the commit where the workaround was added is 68c515a.
nimare/meta/kernel.py
Outdated
imgs = []
# Loop over exp ids since sparse._coo.core.COO is not iterable
for i_exp, id_ in enumerate(transformed_maps[1]):
    if isinstance(transformed_maps[0][i_exp], sparse._coo.core.COO):
@JulioAPeraza is this check necessary? It's slow. I don't see why it wouldn't be a sparse matrix, right?
Apparently this check is necessary, because without it the tests fail. However, it shouldn't be needed, because the output should be sparse here.
It's not in ALEKernel, however.
This is inconsistent with the API, because if you say return_type='sparse' you will get back a dense matrix with an ALEKernel.
Moved to #692.
Sorry that I missed this comment. I created a new folder for GitHub email, and I didn't get notified of this one. I added a comment to the new issue.
I think you must have used a different version of black, because black should already have been run on all files, and the linter is currently failing.
By the way, I think the other major speed up we could potentially get is related to
Currently,
with
This means we spend a lot of time going from sparse --> dense. We save a tiny bit of memory doing this because we loop over experiments and only keep the masked voxels, but it's slow. Starting from
@tsalo I believe this is ready to go as soon as you approve.
I think I understand the changes to the code, and they look good to me.
I have some notes on the comments, though: mostly just adding TODOs to comments we might want to tackle in a future PR.
I totally forgot: can you also add
Yes, and actually I moved it to
Agreed! I'm sure there'll be a few places where it could come in handy.
The changes look good to me. I might write a test or two for the new function (or just copy them from skimage) in a future PR, but I don't want to block this one any longer. Once CI passes, feel free to merge.
I'm just going to change the title slightly, to match the pattern in other PRs.
EDIT: I also realized the changes were made to the kernel and the MKDA (rather than KDA) Estimator.
np.unique -> unique_rows

Profile summary: around a ~2.5-3x speed up for correct_fwe_montecarlo on a dataset with ~1000 coordinates.