Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nexus::select_runtime_change_action's final match needs tuning #6390

Open
gjcolombo opened this issue Aug 19, 2024 · 0 comments
Open

Nexus::select_runtime_change_action's final match needs tuning #6390

gjcolombo opened this issue Aug 19, 2024 · 0 comments
Labels
nexus Related to nexus virtualization Propolis Integration & VM Management

Comments

@gjcolombo
Copy link
Contributor

Specifically this bit:

};
// The instance has an active sled. Allow the sled agent to decide how
// to handle the request unless the instance is being recovered or the
// underlying VMM has been destroyed.
//
// TODO(#2825): Failed instances should be allowed to stop. See above.
let allowed = match requested {
InstanceStateChangeRequest::Run
| InstanceStateChangeRequest::Reboot
| InstanceStateChangeRequest::Stop => match effective_state {
InstanceState::Creating
| InstanceState::Starting
| InstanceState::Running
| InstanceState::Stopping
| InstanceState::Stopped
| InstanceState::Rebooting
| InstanceState::Migrating => true,
InstanceState::Repairing | InstanceState::Failed => false,
InstanceState::Destroyed => false,
},
InstanceStateChangeRequest::Migrate(_) => match effective_state {
InstanceState::Running
| InstanceState::Rebooting
| InstanceState::Migrating => true,
InstanceState::Creating
| InstanceState::Starting
| InstanceState::Stopping
| InstanceState::Stopped
| InstanceState::Repairing
| InstanceState::Failed
| InstanceState::Destroyed => false,
},

This match is a little hard to parse. At this point, Nexus already believes the instance of interest has a running Propolis and is just deciding whether to send a state change request there for disposition. I think the general idea should be to say something like

  • If this is a request to start, reboot, or stop, and the Propolis is in a non-terminal state (i.e. it's not Failed or Destroyed), and it doesn't look like we're migrating, forward the request to Propolis. (We don't currently transmit the state change request queue from source to target during a migration, so allowing a reboot/stop request on a migrating instance might result in a request being queued to the source that won't be picked up by the target.)
  • If this is a request to start Propolis via migration in, and the target Propolis hasn't reported that it's started migrating yet, permit the request; otherwise the VMM has already started and there's nothing left to do

For start/stop/reboot, this is pretty close to what we have today, but the handling for requests to migrate could stand to be tightened up a bit.

It's also worth noting that this match is in a path where we already know that we've got an active Propolis. We should consider whether it'd be clearer to look at the VMM's state directly instead of looking at it as interpreted through InstanceAndActiveVmm::determine_effective_state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
nexus Related to nexus virtualization Propolis Integration & VM Management
Projects
None yet
Development

No branches or pull requests

1 participant