Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify organization of base classes #675

Merged
merged 18 commits into from
May 4, 2022

Conversation

tsalo
Copy link
Member

@tsalo tsalo commented Apr 26, 2022

Closes None, but does touch on #671.

Changes proposed in this pull request:

  • Make _check_ncores a function in nimare.utils (see RF Suggestion: Instance methods vs helper functions #671).
  • Remove the Transformer base class.
    • It only had the transform abstract method.
  • Move the Decoder base class to nimare.decode.base.
  • Create a new IBMAEstimator base class, and move all IBMA-related code from MetaEstimator into that new class.
  • Move all CBMA-related code from MetaEstimator into CBMAEstimator.
  • Remove MetaEstimator.
  • Rename the _validate_input method to _collect_inputs.
  • Convert generic kwargs in CBMAEstimator and IBMAEstimator into actual keyword arguments, and raise a warning for any extraneous kwargs.

@tsalo tsalo added the refactoring Requesting changes to the code which do not impact behavior label Apr 26, 2022
@codecov
Copy link

codecov bot commented Apr 26, 2022

Codecov Report

Merging #675 (4d025cf) into main (6cd751e) will increase coverage by 0.02%.
The diff coverage is 93.96%.

@@            Coverage Diff             @@
##             main     #675      +/-   ##
==========================================
+ Coverage   85.28%   85.31%   +0.02%     
==========================================
  Files          40       41       +1     
  Lines        4513     4528      +15     
==========================================
+ Hits         3849     3863      +14     
- Misses        664      665       +1     
Impacted Files Coverage Δ
nimare/utils.py 94.98% <75.00%> (-0.49%) ⬇️
nimare/base.py 84.82% <83.33%> (-3.68%) ⬇️
nimare/decode/base.py 91.11% <91.11%> (ø)
nimare/meta/cbma/base.py 95.75% <92.10%> (-0.77%) ⬇️
nimare/meta/ibma.py 99.48% <98.59%> (-0.52%) ⬇️
nimare/correct.py 93.06% <100.00%> (ø)
nimare/decode/continuous.py 94.73% <100.00%> (ø)
nimare/decode/discrete.py 96.61% <100.00%> (ø)
nimare/diagnostics.py 93.61% <100.00%> (ø)
nimare/meta/cbma/ale.py 94.17% <100.00%> (+0.08%) ⬆️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6cd751e...4d025cf. Read the comment docs.

@tsalo tsalo requested review from jdkent and adelavega April 26, 2022 23:27
@adelavega
Copy link
Member

I will take a look as soon as I get good chance!

@tsalo
Copy link
Member Author

tsalo commented Apr 28, 2022

My only concern about separating CBMA and IBMA estimators so much is SDM, but we can always circle back to that when we start working on implementing SDM.

Copy link
Member

@adelavega adelavega left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

  • +1 on moving _check_ncores to utils
  • I'm on the fence about removing the Transformer class since it does serve the purpose of enforcing the transform method is implemented by all children. It does add one more layer of abstraction but it sort of makes sense to me to keep this one. I could go either way honestly, maybe we can err on the side of removing it since it doesn't add much.
  • +1 on moving Decoder
  • +1 on separating CBMA and IBMA. If there is a concern that there is common logic that could crop up, we could always add add the MetaEstimator class back in, and use that to control common logic. But if it as its stands there's almost none, then I think its more parsimonious to remove.
  • Should _preprocess_input be an abstractmethod in Estimator? It seems like all children implement it, and probably should. I suppose there could be an instance of an estimator that doesn't, but since they all do lets enforce it.

An observation:

What simplifies understanding Estimators to me is that they all have a fit method which calls: _validate_input, then _preprocess_input, and then _fit-- in this order. These methods must always be implemented.

To further simplify reading the code, lets make sure all children implement them in this order (i.e. in CBMAEstimator _fit comes before _preprocess_input, so let's reverse that), and that these three functions are the top three after __init__.

This at least makes the major code flow easier to read.

The other issue for me is that CBMAEstimator implements so many other methods that are called within _fit and other "primary" methods. It's possible we could pull some of these out of the class entirely if they can be defined as pure functions (i.e. don't need access to that many class internals), or otherwise simplified. Let's hold off for another PR until we profile though since it has implications for that.

Specifically, CBMAEstimator._compute_summarystat_est and Estimator._preprocess_input.
@tsalo
Copy link
Member Author

tsalo commented Apr 29, 2022

Should _preprocess_input be an abstractmethod in Estimator? It seems like all children implement it, and probably should. I suppose there could be an instance of an estimator that doesn't, but since they all do lets enforce it.

I think that's reasonable. Estimator don't necessarily need to have _preprocess_input, but all do in practice. CBMAEstimator._compute_summarystat_est also should be an abstract method, so I've done that as well.

To further simplify reading the code, lets make sure all children implement them in this order (i.e. in CBMAEstimator _fit comes before _preprocess_input, so let's reverse that), and that these three functions are the top three after init.

I believe I've done that in this PR.

The other issue for me is that CBMAEstimator implements so many other methods that are called within _fit and other "primary" methods. It's possible we could pull some of these out of the class entirely if they can be defined as pure functions (i.e. don't need access to that many class internals), or otherwise simplified. Let's hold off for another PR until we profile though since it has implications for that.

+1 on this. All of the obvious ones have now been moved out, so I think it's going to be a matter of determining how many class internals (and how big) we should consider "too many".

@adelavega
Copy link
Member

Great. LGTM then!

@tsalo tsalo merged commit b186fb4 into neurostuff:main May 4, 2022
@tsalo tsalo deleted the streamline-base-classes branch May 4, 2022 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring Requesting changes to the code which do not impact behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants