-
Notifications
You must be signed in to change notification settings - Fork 908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Boothook not executing for cloud-init 'ubuntu/23.3.1-0ubuntu1_20.04.1' #4572
Comments
@supershal , thanks for the bug report. I'm having a hard time understanding exactly what the root issue is, and given your reproducer, there's no difference in behavior between 23.3 and any earlier versions. The reproducer fails because you're referencing a file that does not exist yet, but this should be expected. It would be helpful to get either the exact userdata that was used or the tarball resulting from |
@TheRealFalcon We found the root cause. We are running cloud-init We were able to run cloud-init successfully by directly setting Can you please provide there is any instructions to set feature overrides when creating an instance. |
There is no supported way. The documentation could be more clear, but these flags are not meant to be user-modifiable. They are exclusively for distro packagers (i.e., people who will support cloud-init on their OS distribution) of cloud-init to ease patching the code. By modifying the features, you are essentially modifying the source of cloud-init and are outside the supported path.
I think the failure you highlighted is actually highlighting a root cause further up the chain. Cloud-init will raise the error that you're seeing if it cannot unzip some gzipped userdata, or if it cannot download/process an
When you do this, do find WARNINGs in |
The current implementation is relying on
This mechanism was working before feature flag override was removed.
Yes. we do get the warning in the log about non existence of |
By "restart cloud-init", are you referring to this line?
This basically hacks a whole custom datasource definition into a boothook. That is unexpected, and I'm honestly surprised that this ever worked. |
I think that what you really want is a custom datasource definition. Cloud-init's datasources are responsible for getting the user-data from the platform that cloud-init is running on. Doing this in the expected way (defining a datasource) means that in the long term you will experience less broken behavior, since what you are doing in that boothook is really what a datasource definition is for. This will prevent you from having to hack cloud-init to override userdata or restart cloud-init or anything like that. With a custom datasource it will "just work" during boot on the first try. To do this you will need to:
Once you do this, at runtime cloud-init will discover the datasource and then use it to get the user data. How to define a datasource python module:Unfortunately, this isn't well documented. We have a todo item for that, but these are the core requirements for a datasource, and some pointers on how to get started:
Example DataSourceSSM.py from cloudinit.sources import (
DEP_FILESYSTEM, DEP_NETWORK, DataSource, list_from_depends
)
class DataSourceSSM(DataSource):
def _get_data(self):
"""get configuration data from platform
Puts configuration data in the following properties:
self.metadata
self.userdata_raw
Returns True on success, False on failure
"""
pass
# Return a list of data sources that match this set of dependencies.
def get_datasource_list(depends):
return list_from_depends(
depends,
[(DataSourceSSM, (DEP_FILESYSTEM, DEP_NETWORK))],
) [1] I assume your ansible role would do this. |
Thank you @holmanb for providing the instructions for custom python module. We will look into it for long term. Currently we decided to keep the boothook which will run and place the Do you see any issue with this approach? |
Cloud-init has serveral boot stages and by restarting only See the documentation on how to re-run cloud-init. |
@supershal I agree with @TheRealFalcon's statements, he explains well the broken behavior that I was alluding to. This is why I asked about the line where you restart the cloud-init.service. |
Thank you @TheRealFalcon and @holmanb for your inputs and links to the docs. We will followup the guide to come up with better solution as you suggested. we can close this issue now. |
Bug report
Cluster API Provider AWS uses boothook to run a script that fetches userdata from the metadata server and save it to a file.
Which is executed next to initialize the instance.
The Ubuntu 20.04 AMI comes with cloud-init 'ubuntu/23.3.1-0ubuntu1_20.04.1' installed with it. The cloud-init fails to run
#boothook
on this version which fails the bootstrap process. The related bug is filed at kubernetes-sigs/image-builder#1333Please suggest if there is a better way to test or simulate boothook execution.
Steps to reproduce the problem
Create a base AMI with ERROR_ON_USER_DATA_FAILURE=false to disable the feature.
Without the above settings the instance does not initialize and I am unable to ssh to it.
Feature Info: https://cloudinit.readthedocs.io/en/20.3/topics/hacking.html#cloudinit.features.ERROR_ON_USER_DATA_FAILURE
Instruction to override it: https://cloudinit.readthedocs.io/en/20.3/topics/hacking.html#module-cloudinit.features
Create an AWS instance using the AMI created above and following sample user-data.
user-data.txt:
/var/log/cloud-init-output.log
. log content posted belowEnvironment details
ubuntu/23.3.1-0ubuntu1_20.04.1
Ubuntu-20.04
AWS
. Base AMI:ami-04bad3c587fe60d89
(name:ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20230112
)cloud-init logs
The text was updated successfully, but these errors were encountered: