
Add save/load #16

Closed
olliesilvester opened this issue Sep 14, 2023 · 21 comments · Fixed by #697
Labels: design (Discussion about the internal design of the repo), enhancement (New feature or request)

Comments

@olliesilvester
Contributor

As a follow-up to bluesky/ophyd-epics-devices#28, a method should be added to save all PVs of a device to a YAML file, and a load method should be added to set the device to a state specified in the YAML file.

@olliesilvester self-assigned this Sep 14, 2023
@rosesyrett added the enhancement (New feature or request) label Sep 14, 2023
@callumforrester added the design (Discussion about the internal design of the repo) label Feb 13, 2024
@callumforrester
Contributor

Co-opting this into a design issue. Writeup of a discussion with @coretl, @tpoliaw and @abbiemery:

Malcolm's current setup is a large bank of settings saved to a YAML file in order and restored in reverse order. Then, on top of that, particular per-device logic may be invoked that sets specific signals in a specific order. In other words, a broad-stroke pass followed by a fine-grained approach where needed. This is useful for approximately capturing the desired state without too much code overhead, but it must always be best-effort as the starting state is unknown. It also presents difficulties if the user wishes to leave a particular setting untouched, perhaps because they are tweaking it manually and experimenting. The question is how close ophyd-async's load/save should be to Malcolm's.
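For concreteness, a minimal sketch of that broad-stroke-then-fine-grained pattern (restore_settings and its arguments are illustrative names, not Malcolm's actual API):

```python
# Illustrative sketch only; hypothetical names, not Malcolm's actual API.
from typing import Any, Callable, Dict, Sequence


def restore_settings(
    saved: Dict[str, Any],
    put: Callable[[str, Any], None],
    fine_grained_hooks: Sequence[Callable[[], None]] = (),
) -> None:
    # Broad-stroke pass: blanket, best-effort restore of the saved PV values,
    # applied in reverse of the order they were saved in
    for pv in reversed(list(saved)):
        put(pv, saved[pv])
    # Fine-grained pass: per-device logic that sets specific signals in a specific order
    for hook in fine_grained_hooks:
        hook()
```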

There was some discussion about an inverse system: rather than capture all possible state, only add explicit logic to set the state we know we care about, i.e. state that makes a difference to the scan we want to run. Over time, as edge cases are discovered that add more constraints on the state, we add these to the plan/prepare method/YAML file/whatever is responsible for defining the state. The principal disadvantage of this is the extra code overhead that must be manually maintained, but hopefully that maintenance is spread out over time.

There was also discussion about decoupling load/save from flyers, so we provide a default implementation that they can use but don't have to, leaving open the possibility of others writing their own version that looks different or not using load/save at all.

@coretl
Collaborator

coretl commented Feb 20, 2024

We talked a lot about requirements that we can't currently check, like "Eiger trigger connected to PandA1 TTLOUT1", that would enable checks like "Sequencer has some connection to TTLOUT1". This is a good idea, but can be put in as a future enhancement.

The decision we need to make now is how to get from "I've poked my EPICS system into a working state and it runs a scan" to "my scan runs when the beamline has been doing something else". There are some options for this:

  1. Change settings in Python, either in the plan or in the Device
  2. Have an intelligent save, that structures the YAML file into phases, and excludes things based on the plan that will be running
  3. Have an intelligent load, that takes a blob of settings as requirements, and knows how to apply them to the device

We were going for 2. because of the ease of saving and re-saving the same things, but I'm wondering now if moving towards 3. would be better. This would mean that we can do a generic "save_device" plan with no saving into phases, but have an intelligent "load_device" that would only put the signals that had changed. We could still allow user ignoring of specific parameters in the save file, but the saved file looks a lot more like the requirements file with the trigger wiring in it, which moves us in the right direction.
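For illustration, a rough sketch of option 3's "load only what changed" as a plan stub (load_device is a hypothetical name here, not an existing ophyd-async plan):

```python
from typing import Any, Dict

import bluesky.plan_stubs as bps


def load_device(required: Dict[Any, Any], current: Dict[Any, Any]):
    """Only put the signals whose required value differs from the current one."""
    changed = {sig: val for sig, val in required.items() if current.get(sig) != val}
    for signal, value in changed.items():
        yield from bps.abs_set(signal, value, group="load_device")
    if changed:
        yield from bps.wait(group="load_device")
```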

@olliesilvester
Contributor Author

olliesilvester commented Feb 20, 2024

If you went for intelligent load, it would be useful for the generic save to store a list of settings which still need changing after the load, such as the PandA sequencer table's values.

Also, if it's helpful for discussion, Hyperion is currently using option 2 in https://github.com/DiamondLightSource/hyperion/blob/697cfffd78481312cccf4baad0c0358b560a54da/src/hyperion/device_setup_plans/setup_panda.py#L106

@callumforrester
Contributor

I think we should do 1. because it is the most flexible option, but I also think it does not exclude 2. or 3. If we want 3., we can just use the Python function that, in practice, is used to change the settings in most cases, but the door is still left open for special cases.

@coretl 3. looks interesting but I'm not sure how it would handle setting up a device differently for different types of scan?

@coretl
Collaborator

coretl commented Jun 14, 2024

Discussions with @callumforrester @olliesilvester @dperl-dls led me to think a hybrid of 1 and 3 makes the most sense:

  • Provide a generic save_device_yaml(device, yaml_path) plan that writes all Signals (even read-only ones) to a YAML file
  • This is used as a baseline for a working Device, regardless of settings
  • Provide a series of setup_x_to_do_y(device, yaml_path, ...) intelligent load plans that will take the current device Signals, the baseline values, and a set of programmatically defined values, and apply them to the device in the right order, avoiding unnecessary puts

For example:

```python
from dataclasses import dataclass
from typing import Dict, TypeVar

from bluesky.plan_stubs import abs_set, wait
from bluesky.utils import new_uid
from ophyd_async.core import SignalW

T = TypeVar("T")


@dataclass
class ChangeSet:
    current: Dict[SignalW[T], T]
    required: Dict[SignalW[T], T]

    def require(self, signal: SignalW[T], value: T):
        self.required[signal] = value

    def changes(self) -> Dict[SignalW[T], T]:
        # Only the signals whose required value differs from their current one
        return {signal: value for signal, value in self.required.items() if self.current[signal] != value}


def changeset_to_restore(baseline: Dict[SignalW[T], T]) -> ChangeSet:
    # Sketch: locate all the baseline signals to find their current setpoints
    current_values = yield ("locate", *baseline)
    return ChangeSet(
        current={signal: location["setpoint"] for signal, location in zip(baseline, current_values)},
        required=baseline,
    )


def write_changes_to_device(changes: Dict[SignalW[T], T]):
    group = new_uid()
    for signal, value in changes.items():
        yield from abs_set(signal, value, group=group)
    yield from wait(group)


def write_changes_to_panda(changes: Dict[SignalW[T], T]):
    # Remove the units from the changes dict as they need to be done first
    units = {signal: changes.pop(signal) for signal in list(changes) if signal.name.endswith("units")}
    yield from write_changes_to_device(units)
    # Do everything else
    yield from write_changes_to_device(changes)


def setup_panda_for_repeated_triggers(baseline: Dict[SignalW[T], T], seq: SeqBlock, repeated_trigger_info):
    # SeqBlock comes from the ophyd-async PandA support
    changeset = yield from changeset_to_restore(baseline)
    changeset.require(seq.table, repeated_trigger_info.generate_table())
    changeset.require(seq.repeats, 1)
    changeset.require(seq.enable, 0)
    yield from write_changes_to_panda(changeset.changes())
```

Thoughts?

@callumforrester
Contributor

...maybe...

The issue, I think, is that this is a best-effort solution, meaning you have to make sure it doesn't cause more problems than it solves. The way I see it there are a few possible cases:

  1. The user is unaware that restoring saved state would overwrite a PV they have deliberately tweaked
  2. The user is unaware that tweaking some not-directly-related PVs will cause problems with their measurement
  3. The user is unaware that a PV they expect to be restored is actually ignored
  4. The user is unaware that the beamline is in the wrong state when they save
  5. A special procedure is required to get to the desired state (i.e. more logic than just load())
  6. ...I've probably missed some

What I want is a pie chart showing how common each case is, so we know where best to direct the best-effort.

@coretl
Collaborator

coretl commented Jun 18, 2024

> ...maybe...
>
> The issue, I think, is that this is a best-effort solution, meaning you have to make sure it doesn't cause more problems than it solves. The way I see it there are a few possible cases:
>
> 1. The user is unaware that restoring saved state would overwrite a PV they have deliberately tweaked
> 2. The user is unaware that tweaking some not-directly-related PVs will cause problems with their measurement
> 3. The user is unaware that a PV they expect to be restored is actually ignored
> 4. The user is unaware that the beamline is in the wrong state when they save
> 5. A special procedure is required to get to the desired state (i.e. more logic than just `load()`)
> 6. ...I've probably missed some
>
> What I want is a pie chart showing how common each case is, so we know where best to direct the best-effort.

We cannot make a pie chart yet because we do not have an exhaustive save mechanism at present. We have GDA, which gives us situation 2 quite a lot when someone prods an areaDetector PV that GDA is not setting and breaks a scan. We have Malcolm, which gives us 1 when we switch to using a saved design for a scan for the first time, and 3 because the set of PVs to be restored is not exhaustive. I suspect that we also get 4 with Malcolm, because it is easy to do with a PandA, and the saved design for a PandA is used in many scan designs. We have never encountered 5 yet, because the logic has been wrapped up in the scan phase, not the load phase.

I suggest we widen the discussion to @tomtrafford and @EdWarrick to see if they have any experience of which is more common and where we should be putting our effort.

@olliesilvester
Contributor Author

olliesilvester commented Jun 18, 2024

> Thoughts?

This seems good to me, at least for a PandA. Is it up to a user to get their baseline for a device? To me, there are two separate baselines:

  1. Reset device to factory default settings
  2. Reset device to a plan-specific state. E.g., configure and wire up a PandA's blocks, but don't touch things that need to be changed on every collection, like the sequencer table's values.

@coretl
Collaborator

coretl commented Jun 18, 2024

> > Thoughts?
>
> This seems good to me, at least for a PandA. Is it up to a user to get their baseline for a device? To me, there are two separate baselines:
>
> 1. Reset device to factory default settings
> 2. Reset device to a plan-specific state. E.g., configure and wire up a PandA's blocks, but don't touch things that need to be changed on every collection, like the sequencer table's values.

Yes, I imagine that you poke a PandA until the plan works, then save a baseline, then incorporate it into a setup_* plan stub like the one above. I also imagine we would only store baselines for plan-specific state.

@coretl
Collaborator

coretl commented Jun 18, 2024

There's one other option we didn't think of: we could always do the generic save part for the scan and emit it as part of the document stream, but never actually do the load-from-YAML-file part. Then we would only ever change settings programmatically by writing them in Python and never actually load a YAML file, but we could always look back on a working scan and diff the two sets of settings to see if we can work out what changed.

I know @callumforrester initially favoured this route, but I don't know how well it works for PandA.
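As a rough sketch of that diffing idea (assuming the saved settings end up as plain name-to-value mappings in the documents; diff_settings is a hypothetical helper):

```python
from typing import Any, Dict, Tuple


def diff_settings(good: Dict[str, Any], bad: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {signal_name: (good_value, bad_value)} for every setting that differs."""
    return {
        name: (good.get(name), bad.get(name))
        for name in sorted(set(good) | set(bad))
        if good.get(name) != bad.get(name)
    }
```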

@olliesilvester
Contributor Author

There are so many settings you can change and forget about on the PandA that I think doing a load at some point is very useful, especially if you've been playing around with the config on the web GUI.

@callumforrester
Contributor

Looking at these again:

> 1. The user is unaware that restoring saved state would overwrite a PV they have deliberately tweaked
> 2. The user is unaware that tweaking some not-directly-related PVs will cause problems with their measurement

@olliesilvester I agree with you if 2. is more common than 1., because then the precise state of all the possibly-related minutiae will be restored and the user doesn't have to worry. If 1. is more common, then restoring a blanket state creates more problems than it solves.

@coretl added this to the 1.0 milestone Jul 19, 2024
@coretl
Collaborator

coretl commented Nov 27, 2024

This is the last remaining breaking change for 1.0 so I will start work on this now. The general consensus is that we cannot work out if 1 or 2 is more common, so we will give users the option to choose and make a system that works for both.

The plan:

  • Add a class Settings which is essentially a Dict[SignalRW, Any] with typed __setitem__, and the ability to | them with more settings
  • Add a plan settings_from_device(device: Device) -> MsgGenerator[Settings] to walk any device for SignalRWs and locate all SignalRWs from it
  • Add a plan settings_to_yaml(device: Device, yaml_path: str) -> MsgGenerator[None] that uses the above to store those settings to a YAML file
  • Add a function settings_from_yaml(device: Device, yaml_path: str) -> Settings that loads a YAML file for values, walks a Device for SignalRWs, and creates a Settings object from them
  • Add a plan settings_to_change(device: Device, settings: Settings) -> MsgGenerator[Settings] that discards settings that are already at the right value, erroring if they aren't in the Device
  • Add per-device plans apply_settings_to_x(device: Device, settings: Settings) -> MsgGenerator[Settings] that calls settings_to_change and sets the signals that come back in the right phases, then returns the settings that would be required to restore it
  • Add utility functions x_settings(...) -> Settings that make settings to do particular jobs

Put it all together for a PandA example (opt-out):

```python
# Interactively
RE(settings_to_yaml(panda, "/path/to/panda1_time_based.yaml"))
# Then later on
panda1_time_based = settings_from_yaml(panda, "/path/to/panda1_time_based.yaml")


def my_plan():
    settings = (
        panda1_time_based
        | time_based_seq_settings(panda.seq1, ...)
        | repeated_pulse_settings(panda.pulse1, ...)
    )
    yield from apply_settings_to_panda(panda, settings)
    yield from some_flyscan(panda, dets, ...)
```

Or, in an opt-in example like an areaDetector, we don't do the initial settings_from_yaml, but compose the Settings from scratch.
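For illustration, a minimal sketch of what the Settings object described in the list above could look like (not the final API, just enough to show the typed mapping and the | composition where the right-hand side wins):

```python
from __future__ import annotations

from typing import Any, Dict, Iterator, Tuple


class Settings:
    """A mapping of SignalRW -> value that can be composed with |."""

    def __init__(self, values: Dict[Any, Any] | None = None):
        self._values: Dict[Any, Any] = dict(values or {})

    def __setitem__(self, signal: Any, value: Any) -> None:
        self._values[signal] = value

    def items(self) -> Iterator[Tuple[Any, Any]]:
        return iter(self._values.items())

    def __or__(self, other: "Settings") -> "Settings":
        # Later settings win on conflicts, mirroring dict union semantics
        return Settings({**self._values, **other._values})
```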

@coretl
Collaborator

coretl commented Nov 27, 2024

Comments:

  • Make sure the YAML file only has relative signal paths

@coretl assigned coretl and unassigned olliesilvester Nov 27, 2024
@callumforrester
Contributor

A minor suggestion: it's relatively easy to avoid locking yourself into YAML as a storage format in case you change your mind later.

```python
RE(store_settings(settings, backend=Yaml("/path/to/file.yaml")))

# or

set_settings_backend(type=YAML, path_template="path/to/%s.yaml")
RE(store_settings(settings))
```

There are other ways of doing it, but you get the idea.

@coretl
Collaborator

coretl commented Nov 28, 2024

Hmm, maybe we need a SettingsProvider...
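One possible shape for that (a sketch only; the real interface may differ, and YamlSettingsProvider here is just one hypothetical backend):

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict

import yaml


class SettingsProvider(ABC):
    """Owns where and how settings are serialised, so YAML becomes just one backend."""

    @abstractmethod
    def store(self, name: str, data: Dict[str, Any]) -> None: ...

    @abstractmethod
    def retrieve(self, name: str) -> Dict[str, Any]: ...


class YamlSettingsProvider(SettingsProvider):
    def __init__(self, directory: Path):
        self._directory = Path(directory)

    def store(self, name: str, data: Dict[str, Any]) -> None:
        with open(self._directory / f"{name}.yaml", "w") as f:
            yaml.safe_dump(data, f)

    def retrieve(self, name: str) -> Dict[str, Any]:
        with open(self._directory / f"{name}.yaml") as f:
            return yaml.safe_load(f)
```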

@coretl
Collaborator

coretl commented Nov 29, 2024

More discussions with @callumforrester where we went back and forth between plans and device verbs. It seemed too difficult to get everything in prepare as a plan, especially as we want to avoid attaching plans to Devices, so the current proposal is:

  • TriggerInfo will have a settings: Settings attribute, with settings for the entire PandA or Detector
  • TriggerInfo will also have a put_back: bool attribute, saying whether to put it back the way it was found before prepare
  • In StandardDetector.prepare:
    • It will check that all settings have self somewhere in the parent tree
    • DetectorWriter.partition_settings(settings: Settings) -> tuple[Settings, Settings] will split into settings it will handle, and settings the controller should handle (a rough sketch follows this list)
    • DetectorWriter.open will be split into prepare(trigger_info: TriggerInfo) -> Sequence[Settings] and describe() -> dict[str, DataKey]
    • DetectorWriter.prepare and DetectorController.prepare will be passed a TriggerInfo with their own Settings, and will be responsible for setting their own settings in the right order, using Settings.apply(). The returned Settings from each apply are the original values of the signals, and should be returned from the prepare function.
  • In StandardDetector.unstage, if put_back is set, call Settings.apply() on those returned Settings
  • There will be a plan stub store_settings(provider: SettingsProvider, name: str, device: Device) for storing settings from an existing device
  • There will be a function retrieve_settings(provider: SettingsProvider, name: str, device: Device) -> Settings for retrieving settings that have been stored
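As a rough illustration of the partition_settings split described above (a hypothetical helper, with ownership decided by a simple membership test rather than whatever rule the real implementation uses):

```python
from typing import Any, Dict, Set, Tuple


def partition_settings(
    settings: Dict[Any, Any], writer_signals: Set[Any]
) -> Tuple[Dict[Any, Any], Dict[Any, Any]]:
    """Split settings into (handled by the writer, handled by the controller)."""
    writer_part = {s: v for s, v in settings.items() if s in writer_signals}
    controller_part = {s: v for s, v in settings.items() if s not in writer_signals}
    return writer_part, controller_part
```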

@callumforrester
Contributor

More thoughts:

  • Presumably partition_settings is only an optimization? Without it, Settings.apply might set the file path to something superfluous, and then the actual logic in prepare would set it again. The partition just allows us to save a useless put, but it never affects functionality (right?)
  • Why are we only partitioning on the writer?
  • This should work if there are no settings, and that should be the default case; that way we can onboard new users gradually, first getting them to set up their detectors manually and then later introducing the concept of automated loading/saving of state
  • Why is retrieve_settings a function rather than a plan stub?

@coretl
Collaborator

coretl commented Dec 2, 2024

> • Presumably partition_settings is only an optimization? Without it, Settings.apply might set the file path to something superfluous, and then the actual logic in prepare would set it again. The partition just allows us to save a useless put, but it never affects functionality (right?)
> • Why are we only partitioning on the writer?

partition_settings is there to divide the settings between the ones the writer should handle (just the ones for the writer) and the ones the controller should handle (all the rest). Without this we might not only get a useless put, but the useless one might actually happen after the one we really wanted.

> • This should work if there are no settings, and that should be the default case; that way we can onboard new users gradually, first getting them to set up their detectors manually and then later introducing the concept of automated loading/saving of state

Yes

> • Why is retrieve_settings a function rather than a plan stub?

Because it doesn't do any Device I/O, only file I/O.

@coretl
Collaborator

coretl commented Dec 2, 2024

Another discussion with @callumforrester and we're preferring something that looks more like #16 (comment)

The issues with the prepare(Settings) method:

  • It adds overhead to the first prepare method called which has to do load as well as the other things it usually does
  • It means it's difficult to call load as a separate plan, then a bunch of flyscans, then known_good_state, as known_good_state might not want to call prepare

I tried to go right back the other way to a barebones prepare method and everything in plans, but it was difficult to do that while still keeping a bare count method working. We still need the prepare(N), [kickoff, complete]xN pattern combined with moving to the next outer scan point to make sure that before each fly segment we can checkpoint having produced all the frames of the previous segment.

I think we still prefer an apply_settings_to_panda(panda, settings) as it is easier to swap out for a custom case if we want. We can add settings for excluding fields by setting them to None as part of the |. It also means that we can add the excludes gradually when they annoy us rather than forcing the detector to know about them from the start.
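A tiny sketch of how the None-as-exclude idea could behave when working out what to apply (illustrative semantics only, with a hypothetical helper name):

```python
from typing import Any, Dict


def changes_to_apply(required: Dict[Any, Any], current: Dict[Any, Any]) -> Dict[Any, Any]:
    """Skip fields explicitly excluded with None, as well as values that are already correct."""
    return {
        signal: value
        for signal, value in required.items()
        if value is not None and current.get(signal) != value
    }
```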

@jwlodek
Member

jwlodek commented Dec 2, 2024

> Another discussion with @callumforrester and we're preferring something that looks more like #16 (comment)
>
> The issues with the prepare(Settings) method:
>
> * It adds overhead to the first `prepare` method called which has to do load as well as the other things it usually does
> * It means it's difficult to call `load` as a separate plan, then a bunch of flyscans, then `known_good_state`, as `known_good_state` might not want to call `prepare`
>
> I tried to go right back the other way to a barebones prepare method and everything in plans, but it was difficult to do that while still keeping a bare count method working. We still need the prepare(N), [kickoff, complete]xN pattern combined with moving to the next outer scan point to make sure that before each fly segment we can checkpoint having produced all the frames of the previous segment.
>
> I think we still prefer an apply_settings_to_panda(panda, settings) as it is easier to swap out for a custom case if we want. We can add settings for excluding fields by setting them to None as part of the |. It also means that we can add the excludes gradually when they annoy us rather than forcing the detector to know about them from the start.

I think the approach in #16 (comment) is what I'd prefer as well.
