want Crucible VCR replacement to be able to proceed before a VM has fully started #841
See oxidecomputer/omicron#7398 for more background.
Background
Instances with Crucible disks need to "activate" their connections to their Crucible downstairs servers before those servers will accept I/O requests. Each Crucible backend needs to be able to reach all three downstairs specified in the instance spec in order to activate. If a server is unreachable, the startup process hangs until it becomes reachable.
The control plane provides a solution to this problem. If a downstairs becomes degraded, it can set up a new downstairs to take its place and amend the volume configurations (i.e. the VolumeConstructionRequests, or VCRs) of any affected disks so that they refer to the new downstairs. When a Crucible client connects to these three downstairs, they will observe that repair is needed, and the two healthy servers will send their data over to the newly-created server.
This process requires a Crucible client to coordinate the repair. When a disk is not attached to an instance, or its instance is stopped, the control plane spins up a Crucible pantry job that spawns such a client. But when a disk is attached to an active instance, there's already a Crucible client in the instance's Propolis process, so the control plane sends a request directly to Propolis to ask it to send new volume configuration to its existing upstairs.
The problem
Propolis queues requests to change a VCR to its state driver task. This is meant to minimize our exposure to tricky concurrency problems: there's no need to worry about, say, changing Crucible configuration during a migration if migrations and Crucible configuration changes are serialized by the state driver queue!
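To make the serialization argument concrete, here is a minimal sketch of a single-consumer request queue. It is not Propolis's actual state driver: the `Request` enum, its variants, and `run_state_driver` are hypothetical names chosen for illustration, and a standard-library channel stands in for whatever queue Propolis really uses.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical request types; the names are illustrative, not Propolis's.
#[derive(Debug, Clone, PartialEq)]
enum Request {
    Start,
    Migrate,
    ReplaceVcr { disk: String },
}

/// Feeds `requests` to a single worker thread and returns the order in which
/// the worker handled them.
fn run_state_driver(requests: Vec<Request>) -> Vec<Request> {
    let (tx, rx) = mpsc::channel::<Request>();

    let worker = thread::spawn(move || {
        let mut handled = Vec::new();
        // Only this thread acts on requests, so two handlers can never
        // run at the same time.
        for req in rx {
            handled.push(req);
        }
        handled
    });

    for req in requests {
        tx.send(req).expect("worker is alive");
    }
    // Closing the sending side lets the worker's loop end.
    drop(tx);

    worker.join().expect("worker panicked")
}
```

Because one thread drains the queue, a migration and a VCR replacement are always handled one after the other in arrival order, which is the property the paragraph above relies on.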
The problem here is that VM startup is also a state driver operation. While Crucible's Volume abstraction supports replacing a VCR on a volume that is still activating, if the state driver is wedged waiting for an activation to complete, there's no one available to issue that request.[1]
The request here is for a better approach to handling VCR replacement that avoids concurrency pitfalls but still allows VCR replacements to go through when an instance is starting or stopping.
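The wedge can be reproduced in miniature. The sketch below is an assumption-laden model rather than Propolis code: `Request`, `replacement_handled_before_activation`, and the `activated` channel (a stand-in for Crucible activation completing) are all hypothetical. It also ignores the rejection behavior mentioned in the footnote and simply lets the replacement sit on the queue.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical request types for this model.
enum Request {
    Start,
    ReplaceVcr,
}

/// Returns true if any request finished while activation was still pending.
fn replacement_handled_before_activation() -> bool {
    let (req_tx, req_rx) = mpsc::channel::<Request>();
    let (activated_tx, activated_rx) = mpsc::channel::<()>();
    let (done_tx, done_rx) = mpsc::channel::<&'static str>();

    let driver = thread::spawn(move || {
        for req in req_rx {
            match req {
                Request::Start => {
                    // Stand-in for waiting on Crucible activation: this
                    // blocks the only thread that drains the queue.
                    let _ = activated_rx.recv();
                    let _ = done_tx.send("start");
                }
                Request::ReplaceVcr => {
                    let _ = done_tx.send("replace");
                }
            }
        }
    });

    req_tx.send(Request::Start).unwrap();
    req_tx.send(Request::ReplaceVcr).unwrap();

    // Activation has not completed, so the driver is stuck in `Start` and
    // the replacement behind it cannot be handled.
    let handled_early = done_rx.recv_timeout(Duration::from_millis(100)).is_ok();

    // Complete activation and close the queue so the thread exits cleanly.
    activated_tx.send(()).unwrap();
    drop(req_tx);
    driver.join().unwrap();

    handled_early
}
```

In this model the function returns `false`: the replacement that would let activation succeed is queued behind the very operation that is waiting for activation.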
Note that this is not a problem for disk snapshot requests, because those don't get queued to the state driver. This is because snapshot requests don't mutate the VM's configuration (in a way that might conflict with a consumer of that configuration, like VM startup or VM live migration), whereas VCR replacement requests do change how the VM is configured.
Footnotes
[1] In fact the state driver's queuing logic currently rejects requests to queue a VCR replacement when a VM is Starting, but if it didn't do that the requests would just get stuck on its queue in this scenario.
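The rejection described in this footnote might look roughly like the following. This is a sketch under stated assumptions: `VmState`, `QueueError`, and `try_queue_vcr_replace` are hypothetical names, only two states are modeled, and accepting the request while Running is an assumption made for illustration rather than something the issue states.

```rust
/// Hypothetical subset of VM states; the real state machine has more.
#[derive(Debug, Clone, Copy, PartialEq)]
enum VmState {
    Starting,
    Running,
}

/// Hypothetical error returned when a request cannot be queued.
#[derive(Debug, PartialEq)]
enum QueueError {
    VmStarting,
}

/// Decides whether a VCR replacement may be queued in the given state.
fn try_queue_vcr_replace(state: VmState) -> Result<(), QueueError> {
    match state {
        // Per the footnote: replacements are rejected while Starting.
        VmState::Starting => Err(QueueError::VmStarting),
        // Assumption for this sketch: a running VM accepts the request.
        VmState::Running => Ok(()),
    }
}
```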