Distro support #187

kfox1111 · 2018-10-15T18:37:14Z

Are there any plans to update this operator to work with RHEL/CentOS? Conceptually there doesn't seem to be much that is CoreOS specific about it. Perhaps it works already?

sdemos · 2018-10-15T20:05:14Z

CLUO does end up being pretty Container Linux specific, particularly in the way that it ties into update_engine to poll for updates. Since CLUO doesn't have any real control over the underlying update process, it's really just locksmith running as a DaemonSet in Kubernetes. In general, we prefer newer tools that have much more direct control over the update process, such as the machine config daemon (MCD), which ties into rpm-ostree directly to update the operating system. That one is specifically for Red Hat CoreOS right now.

There was some early exploratory work that integrated this codebase directly with rpm-ostree (https://github.com/ashcrow/container-linux-update-operator/tree/spike) but the focus has been on the MCD system. As far as I know, there is no equivalent tool that integrates with dnf or any other package management systems.

kfox1111 · 2018-10-15T20:47:59Z

What about a yum plugin that called 'locksmithctl send-need-reboot' on any change? It may reboot more than needed, but could work? Alternatively, could you just bypass locksmith and label the node directly? Would the rest of the reboot logic work in that case?

sdemos · 2018-10-15T21:14:20Z

Sorry for the confusion. I meant that it is architecturally and behaviorally like locksmith, not that it is literally locksmith. The CLUO agent hooks directly into update_engine through its exposed D-Bus API (see container-linux-update-operator/pkg/updateengine/client.go, line 61 in 4bb1486):

c.object = c.conn.Object("com.coreos.update1", dbus.ObjectPath(dbusPath))

Whenever update_engine applies a new update (entirely out of band, as on any Container Linux instance), the reboot coordinator component ensures that only one node gets rebooted at a time. The reboot logic might work, but again, there is nothing in CLUO that actually triggers an update, and it's not architected to do that.
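To illustrate the "only one node at a time" behaviour described above, here is a minimal sketch in Go. It is not CLUO's actual code: the NodeState type, its fields, and the nextToReboot function are hypothetical names invented for this example, and the alphabetical tie-break is an assumption made only to keep the choice deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// NodeState is a hypothetical, simplified view of what a reboot coordinator
// needs to know about each node. It is not CLUO's real data model.
type NodeState struct {
	Name        string
	NeedsReboot bool // assumed to be set by the node agent after an update is applied
	Rebooting   bool // assumed to be set by the coordinator while a reboot is in flight
}

// nextToReboot returns the name of the single node allowed to reboot now,
// or "" if a reboot is already in progress or no node needs one.
func nextToReboot(nodes []NodeState) string {
	var candidates []string
	for _, n := range nodes {
		if n.Rebooting {
			return "" // another node is mid-reboot, so nobody else may start
		}
		if n.NeedsReboot {
			candidates = append(candidates, n.Name)
		}
	}
	if len(candidates) == 0 {
		return ""
	}
	// Pick deterministically (alphabetical order is an assumption of this sketch).
	sort.Strings(candidates)
	return candidates[0]
}

func main() {
	nodes := []NodeState{
		{Name: "node-b", NeedsReboot: true},
		{Name: "node-a", NeedsReboot: true},
		{Name: "node-c"},
	}
	fmt.Println(nextToReboot(nodes)) // prints "node-a"
}
```

Note that nothing in this selection step triggers an update; it only reacts to state that something else has already set.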

dghubble · 2018-10-15T21:35:23Z

I think we always intended that Fedora/RHEL support would be designed quite differently, as a separate reboot coordinator app.

kfox1111 · 2018-10-16T00:31:36Z

Do you see the logic around picking nodes, draining, rebooting, and uncordoning as being distro specific? I could see the node agent being specific. Does the reboot manager pay attention to any state other than "needs upgrading"?

I was thinking of setting up Ansible to point yum at the new version repo (we version mirror snapshots), run yum upgrade, and trigger locksmith so that the operator reboots things safely. CI/CD would trigger Ansible to upgrade the nodes, and the operator would then reboot them safely as needed? Alternatively, it could maybe skip locksmith entirely and just set node labels directly?
