Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable MBAR to do bootstrap error estimation #322

Merged
merged 11 commits into from
Jun 6, 2023

Conversation

xiki-tempula
Copy link
Collaborator

Implement #320

@xiki-tempula xiki-tempula linked an issue Jun 3, 2023 that may be closed by this pull request
@codecov
Copy link

codecov bot commented Jun 3, 2023

Codecov Report

Merging #322 (b887659) into master (ef74784) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #322   +/-   ##
=======================================
  Coverage   98.75%   98.75%           
=======================================
  Files          27       27           
  Lines        1762     1767    +5     
  Branches      388      389    +1     
=======================================
+ Hits         1740     1745    +5     
  Misses          2        2           
  Partials       20       20           
Impacted Files Coverage Δ
src/alchemlyb/estimators/mbar_.py 100.00% <100.00%> (ø)
src/alchemlyb/workflows/abfe.py 99.67% <100.00%> (+<0.01%) ⬆️

@xiki-tempula xiki-tempula requested a review from orbeckst June 3, 2023 20:55
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looking good, by under semver we can't just change the way that ABFE computes the error. We need to deprecate the default (analytical) and then switch over to n_bootstraps=50 in, say, release 2.3.

The kwargs handling of n_bootstraps needs improving (and possibly a test — sorry).

src/alchemlyb/workflows/abfe.py Outdated Show resolved Hide resolved
"By default, n_bootstraps=50 is used to estimate the "
"MBAR error. Supply n_bootstraps=0 to use analytic error."
)
estimator_kwargs = {"n_bootstraps": 50, **kwargs}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail when I actually supply n_bootstraps as a kwarg because then the dict contains the same key twice ... a test would have caught it.

Instead, I would do

# use 0 as default to remain backwards-compatible
# change in release 2.3
kwargs.setdefault('n_bootstraps', 0)
...
...
if estimator == "MBAR":
  estimator_kwargs = kwargs
  self.logger.info("Run MBAR estimator")
  self.logger.info("Estimating MBAR error with n_bootstraps={n_bootstraps} (0 is analytical estimate)", estimator_kwargs)
elif estimator == "BAR":
  estimator_kwargs = kwargs.copy()
  estimator_kwargs.pop('n_bootstraps', None)
  ...
  self.estimator[estimator] = BAR(**kwargs).fit(u_nk)

i.e., modify the copy of kwargs as needed but keep everything in kwargs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, what do you mean this will fail?

>>> kwargs = {"n_bootstraps": 0}
>>> estimator_kwargs = {"n_bootstraps": 50, **kwargs}
>>> print(estimator_kwargs)
{'n_bootstraps': 0}

Gives the expected behaviour where kwargs overrides the default in estimator_kwargs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I am a bit surprised. (I was in the process of trying it out but then wrote the review comment without finishing my testing.)

My concern remains that the default ought to be n_bootstraps=0 to initially have the same behavior.

Comment on lines 448 to 449
"By default, n_bootstraps=50 is used to estimate the "
"MBAR error. Supply n_bootstraps=0 to use analytic error."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not change the default without a deprecation. I'd issue a DeprecationWarning to state that the error estimate will change in release 2.3 to n_bootstraps=50.

Until then, keep

kwargs.setdefault('n_bootstraps', 0)

to keep the old behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also use a more informative status message (see other comment) that reports on n_bootstrap.

Use a DeprecationWarning for letting users know about coming changes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think for this PR I will just add a DeprecationWarning without changing the behaviour.

@@ -20,6 +20,7 @@ The rules for this file:
Enhancements
- Add a parser to read serialised pandas dataframe (parquet) (issue #316, PR#317).
- workflow.ABFE allow parquet as input (issue #316, PR#317).
- Allow MBAR estimator to use bootstrap to compute error (issue #320, PR#322).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a deprecation for using analytical error estimate as the default in ABFE, will change to using 50 bootstraps in 2.3 (see other comments)

@xiki-tempula xiki-tempula force-pushed the 320-enable-bootstrap branch from ad50299 to 9cf90c7 Compare June 6, 2023 19:44
@xiki-tempula xiki-tempula requested a review from orbeckst June 6, 2023 20:33
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor things

CHANGES Outdated

DeprecationWarning
- The default MBAR error estimator in workflow.ABFE.estimate will change from
analytic to bootstrap=50 (issue #320, PR#322).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

say when it will change (target release, current + 2 = 2.3 I think)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or 2.2, if you want to be aggressive, but state the target release.

@@ -403,6 +403,9 @@ def estimate(self, estimators=("MBAR", "BAR", "TI"), **kwargs):
'MBAR']. Note that the estimators are in their original form where
no unit conversion has been attempted.

.. versionchanged:: 2.1.0
DeprecationWarning for using analytic error for MBAR estimator.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent

@@ -67,13 +73,15 @@ def __init__(
relative_tolerance=1.0e-7,
initial_f_k=None,
method="robust",
n_bootstraps=0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a reminder comment

 n_bootstraps=0,    # release 2.2: change to 50 (see PR #322)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my feeling is that for the MBAR estimator, we retain the current behaviour unless people manually turn it on as bootstrap does come with a computational cost.

It is only for the estimate method in workflow.ABFE where I know that the input is decorrelated and I'm prepared to pay the computational cost, I will set the default to 50.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's ok, we just can't make the default behave differently without a minimal warning period.

@xiki-tempula xiki-tempula requested a review from orbeckst June 6, 2023 21:22
@xiki-tempula xiki-tempula merged commit ae7f40b into master Jun 6, 2023
@xiki-tempula xiki-tempula deleted the 320-enable-bootstrap branch June 6, 2023 22:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

enable bootstrap
2 participants