Looping operations and subgraph management #205
Peter had offered the following pseudocode.
I think the intention would be to run a bunch of simulations, find the most distinct conformations, then use those (via the `-clndx` or `-cl` option) to launch another set of simulations, repeating until something about the clustering indicates convergence.
We can do clever tricks to make this expressible in Python. Proposed syntax for Peter's example, assuming all of the frames from all of the trajectories are clustered. I don't have a clear notion of how an operation might change width during execution, so assume we run the same number of simulations in each loop, and that we have enough clusters to do so.

```python
# Add a TPR-loading operation to the default work graph (initially empty) that produces
# the standard simulation input data bundle (parameters, structure, topology, state).
initial_input = gmx.load_tpr([file1, file2])

# Get a placeholder object that can serve as a sub-context / work graph owner,
# and which can be used in a control operation.
subgraph = gmx.subgraph(input={'conformation': initial_input})

# As an alternative to specifying the context or graph in each call, intuiting the graph,
# or requiring the user to manage globally switching, we could use a context manager.
with subgraph:
    modified_input = gmx.modify_input(input=initial_input, structure=subgraph.input.conformation)
    md = gmx.mdrun(input=modified_input)
    # Assume the existence of a more integrated gmx.trajcat operation.
    cluster = gmx.command_line('gmx', 'cluster', input=gmx.reduce(gmx.trajcat, md.output.trajectory))
    condition = mymodule.cluster_analyzer(input=cluster.output.file["-ev"])
    subgraph.next_input.conformation = cluster.output.conformation

# In the default work graph, add a node that depends on `condition` and wraps `subgraph`.
# It makes sense that a convergence-checking operation is initialized such that
# `is_converged() == False`.
# (Spelled `while_loop` here because `while` is a reserved word in Python.)
my_loop = gmx.while_loop(gmx.logical_not(condition.is_converged), subgraph)
gmx.run()
```

I think we should consider what this would look like in TensorFlow, though. I think the subgraph would be a "layer" and the inputs / next_input would be Variable(s). Along with some aspect of the […]
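To make the loop semantics concrete, here is a hedged, plain-Python sketch (none of these names are gmxapi or TensorFlow API) of how a `while_loop` control operation might repeatedly evaluate a subgraph body, with the `next_input` of one iteration becoming the `input` of the following one, analogous to how TensorFlow carries Variables through `tf.while_loop`.

```python
# Hypothetical sketch (not gmxapi API): a minimal model of a while-loop
# control operation driving a subgraph whose outputs feed back as the
# inputs of the next iteration.

class Subgraph:
    """Holds loop-carried state and a body callable."""
    def __init__(self, state, body):
        self.state = dict(state)
        self.body = body  # maps the current state dict to the next state dict

def while_loop(condition, subgraph, max_iterations=100):
    """Evaluate subgraph.body until condition(state) is False."""
    iterations = 0
    while condition(subgraph.state) and iterations < max_iterations:
        subgraph.state = subgraph.body(subgraph.state)
        iterations += 1
    return subgraph.state, iterations

# Toy usage: "conformation" is just a number we refine toward 1.0.
graph = Subgraph(state={'conformation': 16.0},
                 body=lambda s: {'conformation': s['conformation'] / 2})
final_state, n = while_loop(lambda s: s['conformation'] > 1.0, graph)
```

The `max_iterations` guard is a design choice worth keeping in any real control operation, so a non-converging condition cannot hang the work graph.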
Suggest a modification as follows: […]
In the above comment from @peterkasson, it is up to the implementation of […]
Question about step number next: say we were doing adaptive MSMs, weighting by uncertainty rather than restarting uniformly across ensemble members. Then we would not say […]
Clarification to @jmhays' comment: in […]. For convenience, I have been neglecting the additional layer of […]. Comment: I don't think that we should allow an assignment of the form […]. Maybe replace with […]
I think the answer to "Question about step number next" is that […]. A raw implementation of such an Operation and helper function will be clearer as I finish up #85, but the plan ought to be to make it easy to generate appropriate wrappers. This is the set of updates that we've been talking about for the C++ plugin development environment, as well as probably a Python superclass or helper function that would allow you to declare (a) something stateful, (b) named input hooks, and (c) named output hooks, and connect them with whatever calculation you need. I'll follow up with a gist or something for a possible example.
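As a rough illustration of points (a)–(c), here is a hedged sketch of what such a Python superclass could look like. Every name here (`StatefulOperation`, `inputs`, `outputs`, `RunningMean`) is hypothetical and not part of gmxapi.

```python
# Hypothetical sketch: a tiny base class letting a subclass declare
# (a) persistent state, (b) named input hooks, and (c) named output hooks.

class StatefulOperation:
    inputs = ()   # names of declared input hooks
    outputs = ()  # names of declared output hooks

    def __init__(self, **kwargs):
        unexpected = set(kwargs) - set(self.inputs)
        if unexpected:
            raise TypeError('unexpected inputs: {}'.format(sorted(unexpected)))
        self.input = dict(kwargs)   # named inputs
        self.output = {}            # named outputs, filled by run()
        self.state = {}             # persistent state across evaluations

    def run(self):
        raise NotImplementedError

class RunningMean(StatefulOperation):
    """Example subclass: accumulates a running mean of its 'value' input."""
    inputs = ('value',)
    outputs = ('mean',)

    def run(self):
        self.state['n'] = self.state.get('n', 0) + 1
        self.state['total'] = self.state.get('total', 0.0) + self.input['value']
        self.output['mean'] = self.state['total'] / self.state['n']
        return self.output['mean']

# Usage: state persists between calls to run().
op = RunningMean(value=2.0)
op.run()
op.input['value'] = 4.0
op.run()
```

A decorator or helper function could generate such classes from a plain callable, which seems closer to the "make it easy to generate appropriate wrappers" goal.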
Here Eric, let's try this. I'll write the code and you can tell me what I'm doing wrong.

```python
import gmx
# other imports

class cluster_analyzer(gmx_analyzer):
    """Analyzes results of the gmx cluster command line tool."""
    def __init__(self, input: list):
        super().__init__()
        self.input = input  # List of files to do things with

    def other_function(self):
        return other_stuff

    def is_converged(self):
        """Calculates relative entropy of two MSMs."""
        # TODO: relative entropy calculation
        return relative_entropy < tolerance
```
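For reference, here is a hedged, self-contained variant of the sketch above that actually runs. The relative entropy is computed as the discrete KL divergence between two probability distributions (e.g., stationary distributions of two MSM estimates); the class name, `tolerance` default, and argument handling are assumptions, not the proposed gmxapi interface.

```python
import math

def relative_entropy(p, q):
    """Discrete KL divergence D(p || q) in nats, for equal-length distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

class ClusterAnalyzer:
    """Self-contained stand-in for the cluster_analyzer sketch above."""
    def __init__(self, input: list, tolerance: float = 1e-2):
        self.input = input        # files (or data) to analyze
        self.tolerance = tolerance

    def is_converged(self, p, q) -> bool:
        """True when D(p || q) between two MSM distributions falls below tolerance."""
        return relative_entropy(p, q) < self.tolerance
```

Note that KL divergence is asymmetric; if a symmetric convergence criterion is wanted, something like the Jensen–Shannon divergence would be a natural substitute.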
I assume there is a typo in your init definition. I was thinking that […]. So the simplest change to what you've written would probably be to define a class and use a helper from the […]. I don't know how quickly I can come up with a plausible from-scratch implementation, since I still haven't finished designing the solutions to #85 or #190. The trick is that we need […]
I don't think this is what you're looking for, but just as a reminder: the way gmxapi operations are currently defined and launched is by creating a gmx.workflow.WorkElement and by defining a corresponding operation in the Context itself. I've been working on making this more modular for a while, but it is an ongoing process.

Right now, operations like 'md' and 'load_tpr' are defined in a map ( https://github.com/kassonlab/gmxapi/blob/master/src/gmx/context.py#L698 ) that is initialized with some static internal function mappings ( https://github.com/kassonlab/gmxapi/blob/master/src/gmx/context.py#L26 and https://github.com/kassonlab/gmxapi/blob/master/src/gmx/context.py#L59 ), extensible with Context.add_operation, and with a protocol for automatic extensibility that is contained in Context.enter ( https://github.com/kassonlab/gmxapi/blob/master/src/gmx/context.py#L761 ). Solidifying these protocols is the subject of several issues, particularly in […].

So that is how to add Python modules to gmxapi in 0.0.7, but not what I want for #190, or by the time #205 is resolved.
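The dispatching pattern described above can be illustrated with a short sketch. This is not the actual gmxapi 0.0.7 code; the structure of the map, the key format, and the factory signature are all simplifying assumptions made for illustration.

```python
# Illustrative sketch of an operation map extensible via add_operation:
# a Context maps (namespace, operation) keys to factory callables.

class Context:
    def __init__(self):
        self._operations = {}

    def add_operation(self, namespace, operation, factory):
        """Register a factory for a named operation; reject duplicates."""
        key = (namespace, operation)
        if key in self._operations:
            raise ValueError('operation already registered: {}'.format(key))
        self._operations[key] = factory

    def launch(self, namespace, operation, **params):
        """Look up and invoke the registered factory."""
        return self._operations[(namespace, operation)](**params)

context = Context()
context.add_operation('gmxapi', 'load_tpr',
                      lambda **p: {'op': 'load_tpr', 'params': p})
element = context.launch('gmxapi', 'load_tpr', filename='topol.tpr')
```

The point of a protocol like Context.enter would then be to let a third-party module register its own (namespace, operation) entries automatically, rather than requiring manual add_operation calls.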
Refinements: we would like to use Python context managers ( […] ). We can use TensorFlow Variable scoping as an example, but we might find more user-friendly syntax.
relates to #190
relates to #84
Control flow vs. data flow
Branching and conditional logic could be handled in gmxapi as data flow, either by (1) allowing operations to rewrite their encapsulating work graph, or (2) letting data or signals passed between operations determine what work is executed.

We already do a bit of (2) with the simulation stop condition that can be issued by an MD plugin. We have not needed to allow operations to rewrite their encapsulating work graph (1) yet, and doing so would make scheduling and execution graph management harder, so I think we should avoid that if we can. But allowing rewrite or optional execution of nested work graphs seems straightforward.
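A hedged sketch of approach (2), the stop-signal style already used by MD plugins: conditional behavior expressed purely as data flow, with no rewriting of the work graph. The class and function names here are invented for illustration and do not match the gmxapi plugin interface.

```python
# Hypothetical sketch: a plugin publishes a stop signal that the
# simulation operation polls each step, instead of anything mutating
# the encapsulating work graph.

class StopSignal:
    def __init__(self):
        self._stop = False
    def issue(self):
        self._stop = True
    def is_set(self):
        return self._stop

def run_md(nsteps, signal, plugin):
    """Run up to nsteps, letting the plugin decide when to stop early."""
    step = 0
    while step < nsteps and not signal.is_set():
        plugin(step, signal)  # the plugin may issue the stop signal
        step += 1
    return step

signal = StopSignal()
# Toy plugin: declare convergence once some observable settles at step 5.
steps_run = run_md(100, signal,
                   lambda step, sig: sig.issue() if step >= 5 else None)
```

Because the signal is just data flowing between operations, the scheduler never has to re-plan the graph mid-execution, which is exactly why (2) is easier to support than (1).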
Looping operations, like `for` or `while`, map well to nested work graphs, conceptually. Since work graphs are representable in a data structure that is compatible with the generic `params` dictionary in workspec 1, we could put entire work graphs in the parameters for new control operations. The representation would need to be updated for workspec 2, but we should be able to make a user interface that could stay the same.
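To make the nesting idea concrete, here is a hedged illustration of a control element carrying an entire nested work graph inside its `params` dictionary. The field names follow the spirit of workspec 1 but are not the exact schema, and the version string is a placeholder.

```python
import json

# A nested work graph is just data, so it can live inside the params
# of a hypothetical 'while_loop' control element.
subgraph = {
    'elements': {
        'modified_input': {'namespace': 'gmxapi', 'operation': 'modify_input',
                           'depends': []},
        'md': {'namespace': 'gmxapi', 'operation': 'md',
               'depends': ['modified_input']},
    }
}

work = {
    'version': 'gmxapi_workspec_1_0',  # placeholder, not the real schema tag
    'elements': {
        'loop': {
            'namespace': 'gmxapi',
            'operation': 'while_loop',
            'params': {'subgraph': subgraph, 'max_iterations': 10},
        }
    }
}

# Because everything is plain dicts and lists, the nested representation
# round-trips through JSON like any other workspec.
serialized = json.dumps(work)
restored = json.loads(serialized)
```

Serializability is the key property here: it means a control operation with a nested graph needs no new transport or storage machinery beyond what workspec already provides.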