-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gNOI OS - service that provides an interface for OS installation on a Target #19
Conversation
Hi Sam, do you know how this compares against the SetPackage RPC? https://github.com/openconfig/gnoi/blob/master/system/system.proto#L59 I'm just curious if instead we should be adding RPCs or adding proto fields to existing places that at first glance have the same intent. |
Hi Eric, the gNOI OS Service prescribes in detail what is expected from the Target for OS installation and Activation. It also bestows the Target with authority over the file system and its management as opposed to having the Client managing it. The SetPackage RPC in contrast is quite generic in its prescription. Operational processes like OS installation deserve better treatment, having this interface well defined saves complexity both at the platform implementation and operator workflows. |
// this service, component firmware upgrade or OS patching must be embedded | ||
// within an OS upgrade. | ||
service OS { | ||
// Install transfers an OS package into the Target. No concurrent Install RPCs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a common way that a failure due to not allowing concurrent requests should be signaled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, line 232: InstallResponse->InstallError->INSTALL_IN_PROGRESS
// | ||
// The Target manages its own persistent storage, and OS installation process. | ||
// It stores a set of distinct OS packages, and always proactively frees up | ||
// space for incoming new OS packages. It is guaranteed that the Target always |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, to be clear the expectation is that apart from the current running OS version, any other packages may be arbitrarily deleted by starting a transfer request? Just want to make sure that that is the expectation and that the loss of control is ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. The Operator must not have the expectation that the Target is keeping every installed package besides the last installed one, or the active one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So when installing a package, the last installed package (if it's not the running one) can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. However, (and this is up to the platform implementation) if the size of the storage is orders of magnitude larger than the typical size of the package, there is value in keeping a buffer of packages. The logic for removing installed packages could for example be FIFO. All of this is up to the platform implementer and out of the scope of this service. All the Client must expect is that the last Installed package is always available. I will make this clearer in the wording.
// | ||
// The Target manages its own persistent storage, and OS installation process. | ||
// It stores a set of distinct OS packages, and always proactively frees up | ||
// space for incoming new OS packages. It is guaranteed that the Target always |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So when installing a package, the last installed package (if it's not the running one) can be removed.
os/os.proto
Outdated
// first before accepting the transfer from the Client. The syncing progress | ||
// is reported to the client with InstallResponse->SyncProgress messages. | ||
// | ||
// If a switchover is triggered by an external event during the Install RPC, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By external event, you mean triggered by a user? A switchover e.g. because of a h/w failure will still go through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything that triggers a switchover that is not related with this service. A user, hardware or software fault or bug. I will update the text to make it clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification. But with some types of failure, there can be no response (I know I'm stating the obvious).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally agree, the principal idea here is whenever possible, attempt at notifying the client before giving up the gNOI connection.
// system, it will respond with InstallResponse->SyncProgress. In this, | ||
// latter, case - the client does not need to transfer the OS image. This | ||
// value can also be set empty, in which case the OS package is forced | ||
// transferred to the Target. The Target MUST never validate that this value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the version is empty, the target generates a version/name for the package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version string is always a well known OS version string builtin in the OS package and provided by the vendor. Will update text. thanks!
os/os.proto
Outdated
// Service. The Install RPC applies to the Active Supervisor unless | ||
// InstallRequest->TransferRequest->standby_supervisor is set, in which case | ||
// it applies to the Standby Supervisor. One Install RPC is required for each | ||
// Supervisor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it can be installed on the standby supervisor first in scenario 4 below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. I have updated the text to reflect that.
// then it replies with ActivateOK. If the Target has the OS package version | ||
// requested in ActivateRequest then it replies with ActivateOK and proceeds to | ||
// boot. In a Target with dual Supervisor, performing this RPC on the Active | ||
// Supervisor triggers a switchover before booting the (old)Active Supervisor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So with a dual supervisor, the requirement is to do an ISSU upgrade?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe ISSU upgrade is a bit more involved. This is just stating that if standby_supervisor = False, Then the Active Supervisor is the Target of the package activation, regardless of the active package in the Standby Supervisor. Because the Active Supervisor is rebooting, a switchover needs to happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Sam. So the image activation specified here doesn't require ISSU upgrade?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The platform should always attempt performing a failover with the least impact possible. I will update the text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this does not answer your question. The answer is out of of scope of this service. The switchover behavior in regards to non stop forwarding should be a customer to vendor conversation. ISSU is rather specific and packages upgrade and switchover behavior together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed discussion on this on and offline, LGTM.
No description provided.