
Optimize numpy operations in MKDADensity Estimator and (M)KDAKernel #685

Merged: 36 commits merged into main on Jun 2, 2022

Conversation

@adelavega (Member) commented May 28, 2022

  • Speeds up KDA meta-analyses, especially in the context of Monte Carlo correction
  • Main gains came from optimizing numpy operations, especially switching out np.unique for unique_rows
  • Also chunked kernel operations (i.e. convolution & masking) by experiment ID, which is more efficient

Profile summary: around a 2.5-3x speedup for correct_fwe_montecarlo on a dataset with ~1000 coordinates.
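As a rough illustration of the chunking idea (hypothetical variable names and shapes; the real implementation lives in compute_kda_ma), grouping coordinates by experiment ID lets the kernel be applied to each experiment's coordinates in one vectorized pass instead of row by row:

```python
import numpy as np

# Hypothetical coordinate table: column 0 is the experiment ID,
# columns 1-3 are ijk voxel coordinates.
coords = np.array(
    [
        [0, 10, 12, 14],
        [0, 11, 13, 15],
        [1, 20, 22, 24],
    ]
)
exp_ids = coords[:, 0]

for id_ in np.unique(exp_ids):
    chunk = coords[exp_ids == id_, 1:]  # all coordinates for one experiment
    # ...convolve with the kernel and apply the mask to `chunk` at once...
    print(id_, chunk.shape)
```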

@welcome (bot) commented May 28, 2022

Thanks for opening this pull request! We have detected this is the first time you have contributed to NiMARE. Please check out our contributing guidelines.
We invite you to list yourself as a NiMARE contributor, so if your name is not already mentioned, please modify the .zenodo.json file with your data right above Angie's entry. Example:

{
  "name": "Contributor, New",
  "affiliation": "Department of Psychology, Some University",
  "orcid": "<your id>"
},
{
  "name": "Laird, Angela R.",
  "affiliation": "Florida International University",
  "orcid": "0000-0003-3379-8744"
},

Of course, if you want to opt out this time, there is no problem at all with adding your name later. You will always be welcome to add it in the future whenever you feel it should be listed.

@adelavega (Member, Author) commented May 28, 2022

From this issue: numpy/numpy#11136
I found an implementation that uses pandas and hashing to get unique rows without sorting for 2D arrays.

It seems to be significantly faster. All in all, for a dataset with ~100 studies, it went from ~0.53 s to ~0.13 s, possibly shaving ~0.4 s off the ~1.5 s that the _correct_fwe_montecarlo_permutation function takes overall.
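A minimal sketch of that pandas-based recipe (assuming a 2D array; drop_duplicates deduplicates via hashing rather than the sort that np.unique(axis=0) performs):

```python
import numpy as np
import pandas as pd

def unique_rows_pandas(arr):
    """Unique rows of a 2D array, keeping first occurrences in order."""
    return pd.DataFrame(arr).drop_duplicates().to_numpy()

coords = np.array([[1, 2], [3, 4], [1, 2]])
print(unique_rows_pandas(coords))  # [[1 2], [3 4]]
```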

@adelavega (Member, Author) commented May 28, 2022

Replaced with the scikit-image version: https://github.com/scikit-image/scikit-image/blob/64103e6c90917fcfdef8343fd7dd4df32c910446/skimage/util/unique.py#L47

Down to ~0.96 s from ~1.6 s on the last comparison.
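For reference, the scikit-image approach (paraphrased here, not copied verbatim) views each C-contiguous row as a single opaque `void` scalar, so np.unique can deduplicate whole rows at once:

```python
import numpy as np

def unique_rows(ar):
    """Unique rows of a 2D array via a void view (skimage-style sketch)."""
    ar = np.ascontiguousarray(ar)
    # Reinterpret each row as one void element spanning the row's bytes.
    row_view = ar.view(np.dtype((np.void, ar.dtype.itemsize * ar.shape[1])))
    _, unique_row_indices = np.unique(row_view, return_index=True)
    return ar[unique_row_indices]
```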

@adelavega (Member, Author) commented May 31, 2022

Profiling results using the "neurosynth_laird_studies.json" dataset (1117 coordinates in 17 studies).

correct_fwe_montecarlo

| n_iter | main | speed |
| ------ | ---- | ----- |
| 100    | 88 s | 35.2 s |

It's looking like a 2.5x+ speedup, just from optimizing compute_kda_ma.

Update, after a few more tweaks:

| n_iter | main  | speed |
| ------ | ----- | ----- |
| 500    | 441 s | 150 s |

~2.94x speedup

@codecov (bot) commented May 31, 2022

Codecov Report

Merging #685 (20504fc) into main (7555c90) will increase coverage by 0.02%.
The diff coverage is 97.14%.

@@            Coverage Diff             @@
##             main     #685      +/-   ##
==========================================
+ Coverage   85.28%   85.31%   +0.02%     
==========================================
  Files          41       41              
  Lines        4507     4521      +14     
==========================================
+ Hits         3844     3857      +13     
- Misses        663      664       +1     
Impacted Files Coverage Δ
nimare/utils.py 94.52% <87.50%> (-0.17%) ⬇️
nimare/meta/cbma/mkda.py 97.08% <100.00%> (-0.01%) ⬇️
nimare/meta/kernel.py 80.12% <100.00%> (ø)
nimare/meta/utils.py 54.29% <100.00%> (+1.49%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7555c90...20504fc.

@adelavega (Member, Author) commented:
Ugh, black made changes in tons of files. Shouldn't black already have been run on main?

```diff
-# return ma_values.T.dot(self.weight_vec_).ravel()
-weighted_ma_vals = ma_values * self.weight_vec_
-return weighted_ma_vals.sum(0)
+return ma_values.T.dot(self.weight_vec_).ravel()
```
@adelavega (Member, Author) commented May 31, 2022:
This seems like a potential edge case that is not worth the slowdown, so I went with the faster version.

Anyone with a Mac want to verify whether this is still an issue?
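For what it's worth, the two formulations are numerically equivalent (a toy check with made-up shapes; weight_vec_ is assumed to be an (n_exp, 1) column vector, as the broadcasting in the snippet implies), so the question is purely about speed and platform behavior:

```python
import numpy as np

rng = np.random.default_rng(0)
ma_values = rng.random((5, 100))  # hypothetical: 5 experiments x 100 voxels
weight_vec = rng.random((5, 1))   # per-experiment weights, column vector

dot_version = ma_values.T.dot(weight_vec).ravel()
sum_version = (ma_values * weight_vec).sum(0)

print(np.allclose(dot_version, sum_version))  # True
```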

(Member) commented:
I can test it out tomorrow.

@tsalo (Member) commented Jun 2, 2022:

tests.test_meta_mkda.test_MKDADensity_montecarlo_null (after changing n_cores to 2 and n_iters to 10) went from 6.52s to 5.68s after making the change on my MacBook Pro (OSX 12.3.1). Maybe the architecture Tal was referring to is the M1 chip? Another possibility is that the problem was unique to multiprocessing, and when I switched to joblib it resolved the issue without anyone noticing.

In case anyone wants to dig further into it, the commit where the workaround was added is 68c515a.

```python
imgs = []
# Loop over exp ids since sparse._coo.core.COO is not iterable
for i_exp, id_ in enumerate(transformed_maps[1]):
    if isinstance(transformed_maps[0][i_exp], sparse._coo.core.COO):
```
@adelavega (Member, Author) commented:
@JulioAPeraza is this check necessary? It's slow, and I don't see why it wouldn't be a sparse matrix, right?

@adelavega (Member, Author) commented Jun 1, 2022:
Apparently this check is necessary, because without it tests fail. However, it shouldn't be needed, since the output should already be sparse here.

It's not in ALEKernel, however.

This is inconsistent with the API, because if you pass return_type='sparse' to an ALEKernel you will get back a dense matrix.

(Member) commented:
Moved to #692.

(Collaborator) commented:
Sorry I missed this comment. I created a new folder for GitHub email and didn't get notified of this one. I added a comment to the new issue.

@tsalo (Member) commented May 31, 2022

I think you must have used a different version of black, because all files should have already had black run on them and the linter is currently failing.

@adelavega (Member, Author) commented:

By the way, I think the other major speedup we could potentially get is related to sparse memory usage.

Currently, compute_kda_ma returns a sparse matrix, which correct_fwe_montecarlo then converts to a dense array, because it calls:

```python
self.kernel_transformer.transform(
    iter_df, masker=self.masker, return_type="array"
)
```

with `return_type="array"`.

This means we spend a lot of time going from sparse to dense. We save a tiny bit of memory doing this, because we loop over experiments and only keep the masked voxels, but it's slow.

Starting from dense and staying that way should speed things up, but it's not clear to me by how much. It also seems like memory is a bigger problem than speed, so it may not be worth it.
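As a rough numpy-only illustration of that tradeoff (hypothetical shapes; the PR itself uses the pydata sparse COO class rather than these hand-rolled arrays):

```python
import numpy as np

shape = (50, 50, 50)
dense = np.zeros(shape)
dense[10, 10, 10] = 1.0

# A COO-style sparse representation stores only nonzero coords + values:
nz_coords = np.argwhere(dense)  # (1, 3) array of indices
nz_values = dense[dense != 0]   # (1,) array of values

print(dense.nbytes)                         # 1000000 bytes for the full volume
print(nz_coords.nbytes + nz_values.nbytes)  # tens of bytes

# Going back to dense (what return_type="array" forces on every
# permutation) re-allocates and fills the full volume each time:
recovered = np.zeros(shape)
recovered[tuple(nz_coords.T)] = nz_values
```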

@JulioAPeraza

@adelavega adelavega changed the title WIP: Speed up KDA MA Optimize KDA Meta-Analysis Jun 1, 2022
@tsalo tsalo added this to the 0.0.12 milestone Jun 2, 2022
@adelavega (Member, Author) commented:

@tsalo I believe this is ready to go as soon as you approve

@tsalo (Member) left a review comment:

I think I understand the changes to the code, and they look good to me.

I have some notes on the comments, though; mostly just adding TODOs to comments we might want to tackle in a future PR.

@tsalo (Member) commented Jun 2, 2022

I totally forgot: can you also add unique_rows to docs/api.rst? Sorry about that!

@tsalo tsalo added refactoring Requesting changes to the code which do not impact behavior cbma Issues/PRs pertaining to coordinate-based meta-analysis labels Jun 2, 2022
@adelavega (Member, Author) commented:

Yes, and actually I moved it to nimare.utils, which makes more sense. We could presumably reuse it throughout the package, not just in cbma. Agree?

@tsalo (Member) commented Jun 2, 2022

Agreed! I'm sure there'll be a few places where it could come in handy.


@tsalo (Member) left a review comment:

The changes look good to me. I might write a test or two for the new function (or just copy them from skimage) in a future PR, but I don't want to block this one any longer. Once CI passes, feel free to merge.

I'm just going to change the title slightly, to match the pattern in other PRs.

EDIT: I also realized the changes were made to the kernel and the MKDA (rather than KDA) Estimator.

@tsalo tsalo changed the title Optimize KDA Meta-Analysis Optimize numpy operations in MKDADensity Estimator and (M)KDAKernel Jun 2, 2022
@adelavega adelavega merged commit 2b9229f into main Jun 2, 2022
@adelavega adelavega deleted the speed branch June 2, 2022 20:29