Skip to content
This repository has been archived by the owner on May 30, 2023. It is now read-only.

Create default grub config files on install/upgrade #373

Closed
wants to merge 2 commits into from
Closed

Create default grub config files on install/upgrade #373

wants to merge 2 commits into from

Conversation

Itxaka
Copy link
Contributor

@Itxaka Itxaka commented Nov 14, 2022

By default we load some files in our grub.cfg from the state partiton in order to allow derivatives to overrides several grub configs.

This files not existing is usually not an issue, just a small message that grub failed to found them during boot, nothing else.

Unfortunately on some machines this can lead to grub hanging out for 30 minutes while trying to search for those files.

This patch fixes that by create any missing files during install and upgrade with empty contents, so grub can find them instantly and avoids the errors and the hang.

The files are formatted in the proper grub format so this does not prevent from editing them afterwards.

The missing files func is done at the end of the rebrand in both install and upgrade, by that time any files from the install/upgrade sources should already be there, so it should not override anything custom that its coming from the install/upgrade sources.

Fixes #372

Signed-off-by: Itxaka [email protected]

@Itxaka Itxaka requested a review from a team November 14, 2022 10:35
By default we load some files in our grub.cfg from the state partiton in
order to allow derivatives to overrides several grub configs.

This files not existing is usually not an issue, just a small message
that grub failed to found them during boot, nothing else.

Unfortunately on some machines this can lead to grub hanging out for 30
minutes while trying to search for those files.

This patch fixes that by create any missing files during install and
upgrade with empty contents, so grub can find them instantly and avoids
the errors and the hang.

The files are formatted in the proper grub format so this does not
prevent from editing them afterwards.

The missing files func is done at the end of the rebrand in both install
and upgrade, by that time any files from the install/upgrade sources
should already be there, so it should not override anything custom that
its coming from the install/upgrade sources.

Signed-off-by: Itxaka <[email protected]>
Copy link
Contributor

@frelon frelon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small comment on elemental error

pkg/action/upgrade.go Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Nov 14, 2022

Codecov Report

Base: 76.33% // Head: 76.33% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (0a44e0d) compared to base (3b2e929).
Patch coverage: 61.53% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #373   +/-   ##
=======================================
  Coverage   76.33%   76.33%           
=======================================
  Files          59       59           
  Lines        5941     5967   +26     
=======================================
+ Hits         4535     4555   +20     
- Misses       1092     1096    +4     
- Partials      314      316    +2     
Impacted Files Coverage Δ
pkg/constants/constants.go 95.23% <ø> (ø)
pkg/action/install.go 84.23% <25.00%> (-1.20%) ⬇️
pkg/action/upgrade.go 58.40% <25.00%> (-0.61%) ⬇️
pkg/elemental/elemental.go 87.58% <60.00%> (-0.63%) ⬇️
pkg/utils/grub.go 56.64% <100.00%> (+1.39%) ⬆️
pkg/utils/fs.go 62.00% <0.00%> (+0.66%) ⬆️
tests/mocks/runner_mock.go 100.00% <0.00%> (+4.91%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@Itxaka Itxaka requested a review from frelon November 14, 2022 13:12
Copy link
Contributor

@frelon frelon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@mjura mjura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@davidcassany davidcassany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are a couple of issues to address:

  1. There is mix of config files and grub2 env files. Note the first ones are sourced and the latter ones are loaded. Config files are text files, grub2 env files are kind of binaries. We can't treat them with the same code.

  2. With this change we remove the possibility of adding grub data from other block devices (which is an arguable and probably not intended feature). If we agree to not providing the opportunity to add extra grub config stored in any device/partition then there is no need to call search file for all of them and make the grub.cfg way more resilient by simply testing for the existence of the file. In that case we don't have to create it too.

So I think we need to agree on the behavior first. If we assume these files are fixed in known locations I'd be in favor of simply a sentinel file or something equivalent on the partitions just to get the block devices to vars in grub.cfg. Then use these block devices vars to just test if files are there or not. This way we remove many calls to search file.

Probably this is a bit harsh, but what about providing our own grub config in elemental teal, so we de-attach from elemental-toolkit and then explore how to push some backward compatible improvements upstream if possible? I am not a fan of it as this is almost like abandoning grub2 config package from elemental-toolkit, but otherwise it is damn difficult applying compatible changes in there.

@@ -354,3 +354,15 @@ func (g Grub) SetPersistentVariables(grubEnvFile string, vars map[string]string)
}
return nil
}

// CreateConfigFile will create the config file given if it doesn't exist
func (g Grub) CreateConfigFile(file string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd rename the method to something else (probably CreateGrubEnvFile?) otherwise it looks like it is used to create grub.cfg

// CreateDefaultGrubConfigFiles creates the expected files that the grub.cfg is going to try to load if they don't exist
func (e Elemental) CreateDefaultGrubConfigFiles(partMountPoint string) error {
grub := utils.NewGrub(e.config)
for _, file := range []string{cnst.GrubEnv, cnst.GrubCustom, cnst.GrubMenu} {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm GrubCustom and GrubMenu won't be loaded, they will be source, because they really are config files. Grub env, is not.

I am wondering if the result of grub2-editenv to create a new file is actually functional for source <file> . I'd bet it will fail. For any sourced file I'd simply write a text file. We could even eventually write it with a default comment header explaining what is this file. The header could be simply constant in constants package.

I am wondering if it would even make sense to merge custom and menu, after all they simply source something at the same level. The contents of both could be swapped and the result would be same 😱 Anyway this is an unrelated issue I just realized 😓

Also after another thought, I realized we probably could remove almost every single search file by a simple test file existance. In grub2 the search file should be used to get the block device containing the file. Then, knowing the block device, I'd say we can simply check if the file exists. By reading grub2 docs it looks like we could do something like:

if [ -e "${oem_env_file}" ] ; then                                                                                                                                                                                                                         
  load_env -f "(${state_blk})${oem_env_file}"                                                                                                                                                                                                        
fi

Finally I would say the /grub_oem_env seams to be missing from the list.

Copy link
Contributor Author

@Itxaka Itxaka Nov 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally I would say the /grub_oem_env seams to be missing from the list.

Yes, because we DO create that one as part of the install/upgrade on the rebrand :)

We store it in the state partition thought, not in OEM as our toolkit examples :D

@@ -142,5 +142,8 @@ const SetDefaultGrubEntry = 45
// Error occurred during selinux relabeling
const SelinuxRelabel = 46

// Error creating default grub config files
const GrubConfigFiles = 47
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd call them different not to be confused with grub.cfg. So something like GrubEnvFiles?

@Itxaka
Copy link
Contributor Author

Itxaka commented Nov 14, 2022

I think there are a couple of issues to address:

1. There is mix of config files and grub2 env files. Note the first ones are sourced and the latter ones are loaded. Config files are text files, grub2 env files are kind of binaries. We can't treat them with the same code.

Good point, I didnt even notice!

2. With this change we remove the possibility of adding grub data from other block devices (which is an arguable and probably not intended feature). If we agree to not providing the opportunity to add extra grub config stored in any device/partition then there is no need to call search file for all of them and make the grub.cfg way more resilient by simply testing for the existence of the file. In that case we don't have to create it too.

So I think we need to agree on the behavior first. If we assume these files are fixed in known locations I'd be in favor of simply a sentinel file or something equivalent on the partitions just to get the block devices to vars in grub.cfg. Then use these block devices vars to just test if files are there or not. This way we remove many calls to search file.

I think its a bug, it should not be intended to have a random order to load override files (do we know what device grub will find first if there is more than one of those files in the system?) and something that can lead to wrongly reported issues due to the overrides.

IMO as stated before, only one place of truth should be available for the overrides. Its possible to use yip to generate those files, but htere should be one place to store them. I do not care if its OEM, STATE or even persistent, but it should eb clarified that is the only possible place to add the overrides. I mean, we already set the path /oem/grubenv in the toolkit docs, so that seems a good place as any.

And yeah, that would cut most of the search+load from the default grub.cfg! Only one search then a load for those files as they should be in the same place.

Probably this is a bit harsh, but what about providing our own grub config in elemental teal, so we de-attach from elemental-toolkit and then explore how to push some backward compatible improvements upstream if possible? I am not a fan of it as this is almost like abandoning grub2 config package from elemental-toolkit, but otherwise it is damn difficult applying compatible changes in there.

That is good, but that doesnt solve the core issue. It does in terms of Teal but not for Harvester or anything using elemental-cli to manage OS :)

BTW, I think this is mostly backwards compatible or we could probably do it backwards compatible by searching in the OEM partition for those files and if they dont exists we can fallback to just search everywhere. That would make it backwards compatible. That mixed with a new elemental-cli that creates those files if they dont exist should make it easy to use a newer cli with an older system.

In any case, we can always state that the cli version X is only compatible with grub2-config version Y and upwards so no need to maintain compatibility there, you use an older versions (which is probably associated to an older grub2-confg version thanks to the vVERSION-repository.yaml luet repos) and call it a day

If you are already using the vVERSION-repository.yaml then you get the given deps. If you use the main repository.yaml then you get both updated packages. Not sure we can do anything else here, we have pinned repos for that exact reason in our luet packages :D

@Itxaka Itxaka marked this pull request as draft November 15, 2022 10:17
@Itxaka
Copy link
Contributor Author

Itxaka commented Nov 15, 2022

According to our docs, you can put the file anywhere ¬_¬

I kind of hate that, its not conclussive, there should be just one place for the grub files and that is it so you know where to look for those.

IMHO, we should move that into supporting 1 partition (STATE would be nice in my opinion) and use that to search for the files. Also lets have them inside the grub2 directory that is already there instead of scattered around if possible?

In fact, boot assesment package already makes it clear that only the STATE partition is supported as its what its used to enable it. WE should probably just go witht he state partition and call it a day, otherwise we migth have issues. We can just bump the packages as required

cc @davidcassany

@davidcassany
Copy link
Contributor

I am not sure about having all them in STATE partition. It is supposed to be an immutable system, hence I would not store in there runtime states like next boot menu option. Also that pushes us to keep doing rw remounts of the partition to add extra values and event having to document so... Then why not documenting how to hack grub.cfg directly? I am not a fan keep mutating files in state partition beyond installation or upgrade procedures.

What if we handle all this in oem partition? Probably we need to clarify usage of all of them first.

By looking at the current grub.cfg we have:

  • /grubenv, is a variables file. This used to be anywhere and the documented to example is to set next default boot in OEM partition.
  • /grub_oem_env, is a variables file. This is used to keep default grub menu entry set at install/upgrade time and it is currently hardcoded in STATE partition. So I'd say this docs are at least misleading, since this variable is already defined in another file during install, isn't it?
  • /grubcustom, is a sourced configuration file. This is is used somehow within the boot assessment logic managed via cloud-init files and stored in STATE partition. I am wondering if we could get rid of all this hacky rw remounts and move it to a writable area.
  • /grubmenu, is a sourced configuration file sourced at the same level as the /grubcustom and it is documented to be loaded from anywhere.

Would it be actually bad moving all variables to into a single file under OEM partition (merge /grubenv and /grub_oem_env)?
And the same for the config files, would it be bad merging /grubcustom and /grubmenu all in in a single palce under OEM?

At a glance I don't see issues with it and I believe it can make our life easier. So additional extra features like the boot assessment (provided by cloud-init files) can be simply enable/disabled or reset by managing OEM state files and user defined customizations too. Any data in there should be easy to be recreated at any time.

@Itxaka
Copy link
Contributor Author

Itxaka commented Nov 15, 2022

I am not sure about having all them in STATE partition. It is supposed to be an immutable system, hence I would not store in there runtime states like next boot menu option. Also that pushes us to keep doing rw remounts of the partition to add extra values and event having to document so... Then why not documenting how to hack grub.cfg directly? I am not a fan keep mutating files in state partition beyond installation or upgrade procedures.

What if we handle all this in oem partition? Probably we need to clarify usage of all of them first.

To be honest, I did not want to do that as OEM is not really touched as part of upgrade/install so its adding an extra piece in there.

Also, I dont see that much movement in the grub custom config files to be honest. Im not sure who is even using the next_boot (other than our tests) and I think than those grub changes are not supposed to be done once the install is already done. Any grub changes/configs should be done via yip files and hooked to the after_install/after_upgrade so its clear where it goes and its part of the immutability.

Moving them to OEM means that at any point the grub config can be modified at will, even by error when testing stuff, which is ugly and goes against the whole immutability. IF someone wants to modify the grub config files on a running system, it has to be a conscious effort, i.e. Add an extra yip config in the system (its tracked and can find the file afterwards, no matter how much time has passed) OR via mounting the system RW and throwing the whole immutability thing out of the window.

IMHO having the grub.config in the state dir is the way of maintaining the immutability of the SYSTEM (which includes the booting) intact. Having it on OEM is risky. In fact I would have never go for those grub overrides and instead opted for the after_install/after_upgrade hooks to modify the existing functionality.

Also, to see the efficiency of our overrides, harvester just forces a different grub IIRC and kairos has those after_install/after_upgrade hooks to override those. Not touching the config/env stuff. I think we did a mistake with those, but probably didnt had any better at that point.

@davidcassany
Copy link
Contributor

All right then what about this:

  • Keep a single variables file (/grubenv or any other name/path) in STATE partition and adapt docs to reflect its usage in a post-install/upgrade hook.
  • Keep a single configuration file (/grubcustom or any other name/path) in STATE partition, adapt the boot assessment to not take exclusive control of it at boot time (probably by appending to it as part of a post-install hook) and document its usage in post-install/upgrade hook (e.g. adding an extra menu block)
  • During installation and reset an empty file is created for the config file (/grubcustom) and an empty variables file (/grubenv) is created to (note it should be created despite the rebrand, a this step does not necessarily write any file)
  • In grub.cfg we discover the STATE partition only once with the same criteria as in grub_efi config, so we are consistent and assure the same behavior in both sides.

As a result most, if not all, the grub2 customization docs would be translated in two sections, one for /grubenv and one for /grubcustom files. We could document them using current documented customizations uses cases as example, so we do not drop features from docs.

Then, as bonus follow up, we could also consider removing the weird logic around /etc/os-release and /grubenv in elemental-cli and place it as a cloud-init for backward compatibility if needed or even set as part of elemental-cli install config options to provide /grubcustom and /grubenv contents. This way we could eventually make it really easy to ass at install time calling extra modules (e.g squashfs) or extra kernel flags (e.g selinux).

@Itxaka
Copy link
Contributor Author

Itxaka commented Nov 16, 2022

All right then what about this:

* Keep a single variables file (`/grubenv` or any other name/path) in STATE partition and adapt docs to reflect its usage in a post-install/upgrade hook.

+1

* Keep a single configuration file (`/grubcustom` or any other name/path) in STATE partition, adapt the boot assessment to not take exclusive control of it at boot time (probably by appending to it as part of a post-install hook) and document its usage in post-install/upgrade hook (e.g. adding an extra menu block)

+1 but I would need first to understand the boot assessment stuff. To be honest I havent looked at it nor used it so I have no idea of what it does or how :D

* During installation and reset an empty file is created for the config file (`/grubcustom`) and an empty variables file (`/grubenv`) is created to (note it should be created despite the rebrand, a this step does not necessarily write any file)

+1

* In `grub.cfg` we discover the STATE partition only once with the same criteria as in grub_efi config, so we are consistent and assure the same behavior in both sides.

+1

As a result most, if not all, the grub2 customization docs would be translated in two sections, one for /grubenv and one for /grubcustom files. We could document them using current documented customizations uses cases as example, so we do not drop features from docs.

+1

Then, as bonus follow up, we could also consider removing the weird logic around /etc/os-release and /grubenv in elemental-cli and place it as a cloud-init for backward compatibility if needed or even set as part of elemental-cli install config options to provide /grubcustom and /grubenv contents. This way we could eventually make it really easy to ass at install time calling extra modules (e.g squashfs) or extra kernel flags (e.g selinux).

Ummm. You mean the GRUB_ENTRY_NAME in os-release? I think that was done so you can always set the proper grub name for entries during install/upgrade. Maybe its time to revisit that, ideally that was set as part of grub but we had to adapt it later to use that, cant remember why. Makes sense to recheck that.

@Itxaka
Copy link
Contributor Author

Itxaka commented Nov 16, 2022

closing this in favour of changes first in toolkit, will resend once done.

@Itxaka Itxaka closed this Nov 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create default grub files
4 participants