-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
steps
keyword argument has surprising/ambiguous behavior
#130
Comments
Make sure session close() results in verifiably finalized log file. Make sure to propagate session closure down through API. * Destroy the gmx::Mdrunner in Session::close(). * Flush fplog file descriptor before returning control to API after mdrunner(). * When fplog managed by runner is closed, set the FILE* pointer to null. * clean up tests and handle the caveat also described in kassonlab/gmxapi#130
Relates to issue kassonlab#40 and to bug kassonlab#130 Resolves task in kassonlab#130 when merged.
Make sure session close() results in verifiably finalized log file. Make sure to propagate session closure down through API. * Destroy the gmx::Mdrunner in Session::close(). * Flush fplog file descriptor before returning control to API after mdrunner(). * When fplog managed by runner is closed, set the FILE* pointer to null. * clean up tests and handle the caveat also described in kassonlab/gmxapi#130
I think here we should replicate the normal mdrun behavior: if -nsteps is not specified, it uses the number of steps in the TPR (counting a checkpoint restart as a continuation). Appending or not doesn't affect this. This should be the desired gmxapi behavior also. |
I agree that unspecified steps should advance the trajectory to the point specified in the TPR file. Right now it doesn't do that if there is a checkpoint that has gone past the step specified in the TPR file. There are at least two cases, it seems, where the command line behavior can lead to infinite trajectories, and at least in the case described here, it seems like unintended behavior that could cause a user to accidentally burn up a lot of SUs, so I want to address this potentially surprising behavior and at least make sure it is well documented to disambiguate it a bit. I think this is achieved in 2be078d Setting an infinite number of steps could effectively be the way to replace the nsteps stop condition with some other, but it shouldn't be accidental the way it currently is. I don't think this behavior is affected by whether or not appending is happening or not. I was thinking that appending could not be enabled for the workaround of removing the checkpoint file, but it may not be necessary to explicitly turn it off. Note, though, that for automatic fault tolerance and reproducibility, we always restart from a checkpoint if one is found and we append to output files by default, but we don't have full workflow checkpointing yet, so these are implementation details that are very much open for discussion. But they are also likely to evolve naturally as we get better handling of simulation inputs and outputs, both with #79 and others. For more on appending / not appending to existing files, see #40 and #126 |
Several aspects of simulation continuation behavior were recently declared to be GROMACS bugs (see https://redmine.gromacs.org/issues/2717) and behavior has changed in recent GROMACS patches. We need to check again that our behavior is sensible and accurately documented. See, for example, https://gerrit.gromacs.org/c/8673/4 and https://gerrit.gromacs.org/c/8618/ |
resolves kassonlab#175 Behavior is well-defined as of GROMACS 2019 beta 3. If additional documentation updates are necessary, we will address them in a separate issue.
It looks like we should be fine in our tests and everything since we rely on `end_time` instead of `steps`, but upstream changes to GROMACS may cause a behavior change in GROMACS 2019 for simulations that use the argument to override the number of steps specified in the TPR file. The command line `nsteps` argument will be removed in GROMACS master after the GROMACS 2019 branch, so we need to prepare for removal of our version of it. Now issue a deprecation warning when "steps" or "nsteps" is specified to `gmx.workflow.from_tpr()` Resolves kassonlab#130
Initial support for the
steps
keyword argument ingmx.workflow.from_tpr()
has potentially surprising behavior (and will need to exhibit different behavior in the future as noted in #56 ).Namely, not specifying
steps
for a simulation that is using a previous simulation's output as input can result in a target step of infinity. (Note: forgmx mdrun
,-nsteps -1
is described as having this behavior, and only-1
.) This is potentially more surprising to a user because-cpi
is implicitly set for MD operations in gmxapi 0.0.6 in an attempt to have consistent behavior after an interruption and to make it easier to continue a trajectory in aI'm not sure if it is worth a complete resolution now, given longer-term plans. The workaround is to use a clean working directory, or at least to remove
state.cpt
and setappend=False
, and to be aware of the semantics.gmx.workflow.from_tpr()
steps
keyword argument.The text was updated successfully, but these errors were encountered: