-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable MBAR to do bootstrap error estimation #322
Conversation
Codecov Report
@@ Coverage Diff @@
## master #322 +/- ##
=======================================
Coverage 98.75% 98.75%
=======================================
Files 27 27
Lines 1762 1767 +5
Branches 388 389 +1
=======================================
+ Hits 1740 1745 +5
Misses 2 2
Partials 20 20
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looking good, by under semver we can't just change the way that ABFE computes the error. We need to deprecate the default (analytical) and then switch over to n_bootstraps=50
in, say, release 2.3.
The kwargs handling of n_bootstraps needs improving (and possibly a test — sorry).
src/alchemlyb/workflows/abfe.py
Outdated
"By default, n_bootstraps=50 is used to estimate the " | ||
"MBAR error. Supply n_bootstraps=0 to use analytic error." | ||
) | ||
estimator_kwargs = {"n_bootstraps": 50, **kwargs} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail when I actually supply n_bootstraps
as a kwarg because then the dict contains the same key twice ... a test would have caught it.
Instead, I would do
# use 0 as default to remain backwards-compatible
# change in release 2.3
kwargs.setdefault('n_bootstraps', 0)
...
...
if estimator == "MBAR":
estimator_kwargs = kwargs
self.logger.info("Run MBAR estimator")
self.logger.info("Estimating MBAR error with n_bootstraps={n_bootstraps} (0 is analytical estimate)", estimator_kwargs)
elif estimator == "BAR":
estimator_kwargs = kwargs.copy()
estimator_kwargs.pop('n_bootstraps', None)
...
self.estimator[estimator] = BAR(**kwargs).fit(u_nk)
i.e., modify the copy of kwargs as needed but keep everything in kwargs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, what do you mean this will fail?
>>> kwargs = {"n_bootstraps": 0}
>>> estimator_kwargs = {"n_bootstraps": 50, **kwargs}
>>> print(estimator_kwargs)
{'n_bootstraps': 0}
Gives the expected behaviour where kwargs overrides the default in estimator_kwargs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I am a bit surprised. (I was in the process of trying it out but then wrote the review comment without finishing my testing.)
My concern remains that the default ought to be n_bootstraps=0 to initially have the same behavior.
src/alchemlyb/workflows/abfe.py
Outdated
"By default, n_bootstraps=50 is used to estimate the " | ||
"MBAR error. Supply n_bootstraps=0 to use analytic error." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not change the default without a deprecation. I'd issue a DeprecationWarning to state that the error estimate will change in release 2.3 to n_bootstraps=50.
Until then, keep
kwargs.setdefault('n_bootstraps', 0)
to keep the old behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also use a more informative status message (see other comment) that reports on n_bootstrap.
Use a DeprecationWarning for letting users know about coming changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think for this PR I will just add a DeprecationWarning without changing the behaviour.
@@ -20,6 +20,7 @@ The rules for this file: | |||
Enhancements | |||
- Add a parser to read serialised pandas dataframe (parquet) (issue #316, PR#317). | |||
- workflow.ABFE allow parquet as input (issue #316, PR#317). | |||
- Allow MBAR estimator to use bootstrap to compute error (issue #320, PR#322). | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a deprecation for using analytical error estimate as the default in ABFE, will change to using 50 bootstraps in 2.3 (see other comments)
ad50299
to
9cf90c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor things
CHANGES
Outdated
|
||
DeprecationWarning | ||
- The default MBAR error estimator in workflow.ABFE.estimate will change from | ||
analytic to bootstrap=50 (issue #320, PR#322). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
say when it will change (target release, current + 2 = 2.3 I think)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or 2.2, if you want to be aggressive, but state the target release.
src/alchemlyb/workflows/abfe.py
Outdated
@@ -403,6 +403,9 @@ def estimate(self, estimators=("MBAR", "BAR", "TI"), **kwargs): | |||
'MBAR']. Note that the estimators are in their original form where | |||
no unit conversion has been attempted. | |||
|
|||
.. versionchanged:: 2.1.0 | |||
DeprecationWarning for using analytic error for MBAR estimator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent
@@ -67,13 +73,15 @@ def __init__( | |||
relative_tolerance=1.0e-7, | |||
initial_f_k=None, | |||
method="robust", | |||
n_bootstraps=0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add a reminder comment
n_bootstraps=0, # release 2.2: change to 50 (see PR #322)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So my feeling is that for the MBAR estimator, we retain the current behaviour unless people manually turn it on as bootstrap does come with a computational cost.
It is only for the estimate method in workflow.ABFE where I know that the input is decorrelated and I'm prepared to pay the computational cost, I will set the default to 50.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's ok, we just can't make the default behave differently without a minimal warning period.
Implement #320