A VMM in state 'starting' can receive requests #7398
If a VMM is in state 'starting' and cannot activate a given disk, it will hang. Region replacement or region snapshot replacement may alter the VCR of that disk to one that *could* be activated, but the current drive saga will not attempt to send requests to a VMM in 'starting'. Fix this: any time a Propolis is expected to be there, it should be OK for it to receive these requests. Otherwise a repair can get stuck, and the VMM could fail to stop if it gets stuck deactivating too!
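As a rough illustration of the rule described above, the saga-side decision could be sketched as a single match on the VMM state. This is only a sketch under stated assumptions: the `VmmState` enum here is a simplified stand-in with an illustrative subset of variants (`Stopped` in particular is included only for contrast), and `propolis_expected` is a hypothetical helper name, not a function from this PR. It also says nothing about whether Propolis itself can service the request in that state.

```rust
/// Simplified, hypothetical stand-in for the VMM state enum.
/// The real type has more variants; this subset is for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum VmmState {
    Creating,
    Starting,
    Running,
    Rebooting,
    Stopped,
}

/// Hypothetical helper: returns true when a Propolis server is expected to
/// exist for a VMM in this state, and so should be sent replacement requests.
pub fn propolis_expected(state: VmmState) -> bool {
    match state {
        // Propolis is expected to be there (eventually, in the case of
        // `Starting`), so the request should be attempted.
        VmmState::Running | VmmState::Rebooting | VmmState::Starting => true,
        // Assumption: no Propolis server to talk to in these states.
        VmmState::Creating | VmmState::Stopped => false,
    }
}
```

Writing the match exhaustively, with no wildcard arm, means that adding a new state forces a deliberate decision about whether requests should be sent in it.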
Have we verified empirically that region snapshot/replacement work as intended on a Propolis that hasn't yet reached Running? I'm not certain they will (and would need to go look carefully at the relevant Propolis code to check).
I agree with @gjcolombo, it's worth making sure that `propolis-server` can actually handle region replacement requests in that state.
    | VmmState::Rebooting
    | VmmState::Starting => {
        // Propolis server is expected to be there
        // (eventually, in the case of "Starting"), and
I believe that a VMM isn't in the `Starting` state unless the Propolis server process does exist, FWIW. If memory serves, that's the distinction between `Starting` and `Creating`. It may not yet be incarnating an instance, though.
I looked at this a bit. In Propolis there are at least two things we need to consider:
It might be possible to fix the latter issue, but it's probably going to take a fair amount of effort on the Propolis side of the house. (We'd also need to be sure that if we have a Crucible upstairs that's stuck in a
Filed oxidecomputer/propolis#841 to track the relevant Propolis enhancement.