Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a role to collection converter tool lsr_role2collection.py #13

Merged
merged 2 commits into from
Sep 29, 2020

Conversation

nhosoi
Copy link
Contributor

@nhosoi nhosoi commented Aug 10, 2020

Usage:

lsr_role2collection.py [--namespace NAMESPACE]
                         [--collection COLLECTION]
                         --dest-path DEST_PATH
                         --src-path SRC_PATH
                         --role ROLE | --molecule
                         [--replace-dot REPLACE_DOT]
                         [-h]
  optional arguments:
    -h, --help       show this help message and exit
    common arguments:
    --namespace NAMESPACE
                     Collection namespace; default to fedora
    --collection COLLECTION
                     Collection name; default to system_roles
    --src-path SRC_PATH
                     Path to linux-system-role
    --dest-path DEST_PATH
                     Path to parent of collection where role should be migrated
    roles arguments:
    --role ROLE      Role to convert to collection
    --replace-dot REPLACE_DOT
                     If sub-role name contains dots, replace them with the given value; default to "_"
    molecule arguments:
    --molecule   If set, molecule is copied from SRC_PATH/template, which must exist. In that case "--role
                        ROLE" is ignored.

  example:
    python lsr_role2collection.py --dest-path /tmp/collections \
            --src-path /path/to/lsr ~/linux-system-roles --role certificate
    or
    COLLECTION_SRC_PATH=/path/to/lsr \
    COLLECTION_DEST_PATH=/path/to/collections \
    COLLECTION_ROLE=certificate \
    python lsr_role2collection.py
  1. Role - copy subdirectories, tasks, defaults, vars, etc., in the system role to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/roles/ROLE.
    A string "{{ role_path }}/roles/ROLE" in the copied files is relaced with
    NAMESPACE.COLLECTION.ROLE.

  2. Tests - copy files and dirs in the tests to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/tests/ROLE.
    In the test playbook files, the roles or include_role values, ROLE or
    linux-system-roles.ROLE, are replaced with NAMESPACE.COLLECTION.ROLE.
    With the replacement done, symlinks in tests/roles are removed.

  3. Docs copy docs, design_docs, examples and README.md files to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/docs/ROLE.
    Generate a top level README.md which contains links to docs/ROLE/README.md.

  4. Plugins - copy library, module_utils, plugins to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/plugins.
    Files library are copied to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/plugins/modules/.
    Directories in plugins are copied to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/plugins/.
    Files in module_utils are copied to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/plugins/module_utils/ROLE/.
    Directories in module_utils are copied to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/plugins/module_utils/.
    The import, from ~ import in the python codes in plugins/{modules,module_utils}
    are adjusted to the collections structre.

  5. Extras - copy extra files and directories including the sub-roles.
    If a sub-role is found in the system role SUBROLE, copy the contents to the
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/roles/SUBROLE.
    README.md in the sub-role is copied to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/docs/SUBROLE.
    Test playbooks are in the sub-role are copied to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/tests/SUBROLE.
    If READEM-something.md exists, it's copied to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/docs/ROLE and
    the link to the file is added the main README.md.
    Any other extra files are copied to
    DEST_PATH/ansible_collections/NAMESPACE/COLLECTION/. If the file has an extension,
    the it's renamed to filename-ROLE.extension. If not, it is to filename-ROLE.

@nhosoi nhosoi requested review from pcahyna, richm and t-woerner August 10, 2020 16:57
type=str,
default=os.environ.get("COLLECTION_NAMESPACE", "redhat"),
help='Collection namespace; default to redhat',
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pcahyna

how will this work in the Collection world? Will there be a single namespace for the whole collection? Can we do this in a way that it prepares for future work on collections? @nhosoi

Not sure this answers your question, but when converting a role to a collection with the tool lsr-role2collection.py, you could set an option --namespace YOURNAMESPACE. If you don't give the option, default to redhat (for now, we could change it to something else, e.g., fedora).

FROM_RE = re.compile(
br'(\bfrom) (ansible\.module_utils\.?)(\S+)? import (\(*(?:\n|\r\n)?)(.+)$',
flags=re.M
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pcahyna, regarding the change made in linux-system-roles/storage#142,

- from ansible.module_utils.size import Size
+ from ansible.module_utils.storage.size import Size

Both from ~ import ~ lines match the FROM_RE regular expression.

Pre pr/142: the from ~ import ~ is converted to:

plugins/modules/bsize.py:from ansible_collections.redhat.system_roles.plugins.module_utils.size import Size
plugins/modules/find_unused_disk.py:from ansible_collections.redhat.system_roles.plugins.module_utils.size import Size

and size.py is located at:
plugins/module_utils/size.py

Post pr/142: the from ~ import ~ is converted to:

plugins/modules/bsize.py:from ansible_collections.redhat.system_roles.plugins.module_utils.storage.size import Size
plugins/modules/find_unused_disk.py:from ansible_collections.redhat.system_roles.plugins.module_utils.storage.size import Size

and size.py is located at:
plugins/module_utils/storage/size.py

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@t-woerner , is there a way to keep the same module_utils import path with and without collections and not have to rewrite the Python sources?

Copy link

@t-woerner t-woerner Sep 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a collection all plugins (modules, module_utils, action_plugins,...) need to be moved or cpoied into the collection global plugins directory. Plugins within role directories are not loaded. A prefix for plugins for normal roles is not supported as far as I know.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All plugins need to have a proper prefix and a unique name as they have to be located in the global plugins directory.

parser.add_argument(
'--namespace',
type=str,
default=os.environ.get("COLLECTION_NAMESPACE", "redhat"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use fedora as the default

@@ -0,0 +1,443 @@
#!/usr/bin/env python3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok - python3 should be available on all of the el7 platforms we might use

import fnmatch
from shutil import copytree, copy2, ignore_patterns

from pathlib import Path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these imports are ok - they are pretty standard and should be available in el7 python3-something packages.


DOCS = (
'docs',
'design_docs',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also the examples directory e.g. used by network - seems more documentation than anything else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The examples dir was treated as extra and placed in the top fedora/system_roles dir. Now, the files in the examples of each role are in the docs/ROLE.

$ ls fedora/system_roles/docs/
certificate/  logging/	metrics/  nbde_client/	nbde_server/  network/	timesync/

I noticed there is another non-standard directory treated as extra which is "scripts" from network. It contains just one python script print_all_options.py. And the script appears only here in the network code...
./.travis/config.sh: PYTHON2_EXCLUDES="tests/ensure_provider_tests.py,scripts/print_all_options.py"
How shall we handle it? (So far, this is the only an example that falls into the case, but in general, what'd be the best way for the role?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm - not sure how to handle the network/scripts/print_all_options.py - looks like it is used to generate the docs? @tyll

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it is not integrated into anything but I only used it manually to create an overview for some items in the README.md.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your input, @tyll! So, the script is needed in the original roles repository, but not in the generated collections. I'm going to skip copying it.

@nhosoi nhosoi changed the title Adding a role to collection converter tool lsr-role2collection.py [WIP] Adding a role to collection converter tool lsr-role2collection.py Aug 18, 2020
@nhosoi nhosoi force-pushed the role2collection branch 3 times, most recently from 9fa88d9 to 56649cc Compare August 25, 2020 00:54
@nhosoi nhosoi changed the title [WIP] Adding a role to collection converter tool lsr-role2collection.py Adding a role to collection converter tool lsr-role2collection.py Aug 25, 2020
find = r"( *name: | *- name: | *- | *roletoinclude: | *role: | *- role: )(linux-system-roles\.{0}\b|{0}\b)".format(role, role)
replace = r"\1" + namespace + "." + collection + "." + role
file_patterns = ['*.yml', '*.md']
file_replace(path, find, replace, file_patterns)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work with the network role. For example: https://github.com/linux-system-roles/network/blob/main/README.md

      route:
        - network: 198.51.100.128
          prefix: 26
...

is converted to

      route:
        - fedora.system_roles.network: 198.51.100.128
          prefix: 26

I think we should enforce that every place where the role name is mentioned, it is mentioned using the fully qualified name, and this script should only have to convert references to linux-system-roles.ROLENAME to NAMESPACE.system_roles.ROLENAME. Would that simplify the script?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richm, that'd be ideal. Indeed, in addition to solving the bug you found, I can get rid of |{0}\b from the pattern!

Just out of my curiosity, this is the snippet of the python doc. I was hoping I could use ^ and $ in the pattern. Although the doc says it should work in MULTIPLE mode, it did not solve my problem... (or may be I could use \n instead a in [0] to avoid a problem like in the network role?)

$
Matches the end of the string or just before the newline at the end of the string, and in MULTILINE mode also matches before a newline. foo matches both ‘foo’ and ‘foobar’, while the regular expression foo$ matches only ‘foo’. More interestingly, searching for foo.$ in 'foo1\nfoo2\n' matches ‘foo2’ normally, but ‘foo1’ in MULTILINE mode; searching for a single $ in 'foo\n' will find two (empty) matches: one just before the newline, and one at the end of the string.

[0] - find = r"( *name: | *- name: | *- | *roletoinclude: | *role: | *- role: )(linux-system-roles.{0}\n|{0}\n)".format(role, role)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work with the network role. For example: https://github.com/linux-system-roles/network/blob/main/README.md

      route:
        - network: 198.51.100.128
          prefix: 26
...

is converted to (...)

This occurrence of network is not in a place where one expect a role name, so the tool should not even attempt to convert this. Which highlights the importance of using a parser that properly understand the YAML structure.

@richm
Copy link
Contributor

richm commented Aug 31, 2020

@richm, that'd be ideal. Indeed, in addition to solving the bug you found, I can get rid of |{0}\b from the pattern!

I have submitted PRs for these

re: regex - not sure - it can be tricky to match on newlines or multiline matches.

There is also the case where a rolename can be specified in a meta/main.yml in the dependencies list, or in a tasks file in the roles section, like this:

roles:
  - ROLENAME

that will be really tricky to match with a regex

We might have to use a hybrid approach, for yaml files - use the yaml parser to find places where rolenames are referenced, then use regex to actually do the replacement - we don't want to yaml.load -> fix -> yaml.dump for each file, because that might not preserve the comments and might slightly alter the whitespace formatting, line wrapping, etc. That way we don't have to worry about r"( *name: | *- name: e.g. name: storage or name: linux-system-roles.storage being used for something other than a reference to the name of a role to include/import.

But for readme and other doc, we will still need to use regex - not sure how else we could do that, or maybe we should edit those files manually rather than scripted

@nhosoi
Copy link
Contributor Author

nhosoi commented Aug 31, 2020

I have submitted PRs for these

Thanks a lot, @richm! I'm going to remove |{0}\b from the pattern.

find = "{{ role_path }}/roles/(.*)"
replace = ".\\1"
file_patterns = ['*.yml', '*.md']
file_replace(role_dir, find, replace, file_patterns)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about cases where the test has something like this:

include_tasks: "{{ role_path }}/roles/rolename/vars/main.yml"

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a sub-role case as rsyslog in logging.
include_tasks: "{{ role_path }}/roles/rolename/tasks/main.yml"
So, rolename is promoted to the same level as the role that includes the task.
I think this'd work.
include_tasks: "NAMESPACE.COLLECTION.rolename/tasks/main.yml"
But how about without NAMESPACE.COLLECTION.? If it is guaranteed, it is simpler and better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the logging case, would it work if you replaced
include_tasks: "{{ role_path }}/roles/rsyslog/tasks/main.yml"
with
include_tasks: "{{ role_path }}/../rsyslog/tasks/main.yml"

or even

include_role:
  name: fedora.system_roles.rsyslog

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging role has only include_role which goes across the role boundary:

  include_role:
    name: "{{ role_path }}/roles/rsyslog"

It does not have include_tasks nor include_vars doing that. Maybe, I should add such test cases locally...

For include_role, I verified both of them work.

include_role:
  name: fedora.system_roles.rsyslog

include_role:
  name: rsyslog

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about cases where the test has something like this:

include_tasks: "{{ role_path }}/roles/rolename/vars/main.yml"

?

I suppose you meant include_vars, is there an example of this? The only one I recall was in kernel_settings and you fixed that. My suggested approach would be "don't do it then" (it caused problems even outside the context of collections and users should not do anything similar, so a test doing this is unrealisitic).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not have include_tasks nor include_vars doing that. Maybe, I should add such test cases locally...

@pcahyna, Ok. As I wrote above, there's no case to cross the role boundary for include_tasks and _vars. We are going to drop supporting them.

Just for record, collections recommend to use the FQCN in include_ and import_tasks.

https://docs.ansible.com/ansible/2.10/collections/ansible/builtin/import_tasks_module.html
https://docs.ansible.com/ansible/2.10/collections/ansible/builtin/include_tasks_module.html
This module (ansible.builtin.import_tasks,include_tasks) is part of ansible-base and 
included in all Ansible installations. In most cases, you can use the short module name 
import_tasks even without specifying the collection: keyword. Despite that, we recommend 
you use the FQCN for easy linking to the module documentation and to avoid conflicting 
with other collections that may have the same module name.

file_patterns = ['*.yml', '*.md']
file_replace(path, find, replace, file_patterns)

find = r"([ -] hosts: [\w\d\.]*\n)"
replace = r"\1 collections:\n - {0}.{1}\n".format(namespace, collection)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use the collections: keyword? Is there some reason we can't just use include_role: name: fedora.system_roles.ROLENAME everywhere?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only case I can think of is when a test playbook uses a module/plugin e.g. using blivet: in a storage test probably won't work unless the test playbook is inside the collection. I think you would have to use something like fedora.system_roles.storage.blivet: in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use the collections: keyword? Is there some reason we can't just use include_role: name: fedora.system_roles.ROLENAME everywhere?

No need to use the collections: keyword. I just thought we as a team are hoping to do so! I can revert the last change.

The same replacement is made on the example playbooks in docs/ROLE (and possibly, example codes in README). Do we want to revert it, as well? Or it is needed for them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just thought we as a team are hoping to do so!

We're following the guidance from Ansible - https://docs.ansible.com/ansible/latest/user_guide/collections_using.html#using-collections-in-a-playbook
"Once installed, you can reference a collection content by its fully qualified collection name (FQCN):"
"To avoid a lot of typing, you can use the collections keyword added in Ansible 2.8: .... Notice that you still need the FQCN for non-action or module plugins."

And we've been told by the Ansible folks on several occasions that it is better to use the FQCN.

The same replacement is made on the example playbooks in docs/ROLE ...

I'm still not sure what, if anything, we should say to users about how to consume these roles if using collections . . . I think the latest guidance that we discussed is that we should not mention galaxy or collections:, and just use the FQCN in our DOCS/examples - @pcahyna is this your understanding?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should prefer using the FQCN and not having to add collections: to the playbooks.

@pcahyna
Copy link
Member

pcahyna commented Sep 2, 2020

that will be really tricky to match with a regex

We might have to use a hybrid approach, for yaml files - use the yaml parser to find places where rolenames are referenced, then use regex to actually do the replacement - we don't want to yaml.load -> fix -> yaml.dump for each file, because that might not preserve the comments and might slightly alter the whitespace formatting, line wrapping, etc. That way we don't have to worry about r"( *name: | *- name: e.g. name: storage or name: linux-system-roles.storage being used for something other than a reference to the name of a role to include/import.

But for readme and other doc, we will still need to use regex - not sure how else we could do that, or maybe we should edit those files manually rather than scripted

For YAML files, have you or @nhosoi tried ruamel.yaml ? It is suposed to be a YAML parser that preserves formatting and whitespace. I don't think that a regexp-replace based approach will work reliably in the long term.

@pcahyna
Copy link
Member

pcahyna commented Sep 2, 2020

  • With the replacement done, symlinks in tests/roles are removed.

This would be a good candidate for homogenizing repositories before conversion. If we remove the symlinks already in the repos, we will not need to remove them during the conversion process. (It would require an adjustment for the test harness to run the test playbooks with role path set to a directory where the role would be found under its proper name, but this would have other benefits anyway.)

'semaphore',
'standard-inventory-qcow2',
'tuned_requirements.txt',
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of them will go away when we move the shared CI files into a shared git submodule, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of them, yes

@richm
Copy link
Contributor

richm commented Sep 2, 2020

  • With the replacement done, symlinks in tests/roles are removed.

This would be a good candidate for homogenizing repositories before conversion. If we remove the symlinks already in the repos, we will not need to remove them during the conversion process. (It would require an adjustment for the test harness to run the test playbooks with role path set to a directory where the role would be found under its proper name, but this would have other benefits anyway.)

How would this work? If you have in tests_defaults.yml

include_role:
  name: linux-system-roles.rolename

and you are in the rolename/tests directory running tests_default.yml, how would you set ANSIBLE_ROLE_PATHS to find this?

@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 2, 2020

For YAML files, have you or @nhosoi tried ruamel.yaml ? It is suposed to be a YAML parser that preserves formatting and whitespace. I don't think that a regexp-replace based approach will work reliably in the long term.

@pcahyna , I tried a bit in the beginning. I ran this simple code against in.yml [0]. And got this diff [1] between in.yml and out.yml. I thought it did not respect the ansible indentation and stopped my investigation then...

import ruamel
import ruamel.yaml

yaml = ruamel.yaml.YAML()
with open('in.yml') as stream:
    data = yaml.load(stream)
with open('out.yml', 'w') as stream:
    yaml.dump(data, stream=stream)

[0] in.yml

# Test to ensure the logging role successfully runs with the no given variables.
#
- name: Ensure that the role runs with default parameters
  hosts: all
  become: true

  tasks:
    - name: default run (NOOP)
      include_role:
        name: linux-system-roles.logging

[1] diff

--- in.yml  2020-07-24 13:53:50.067798424 -0700
+++ out.yml 2020-07-24 13:58:37.329608458 -0700
@@ -4,7 +4,7 @@
   hosts: all
   become: true

   tasks:
-    - name: default run (NOOP)
-      include_role:
-        name: linux-system-roles.logging
+  - name: default run (NOOP)
+    include_role:
+      name: linux-system-roles.logging

@richm
Copy link
Contributor

richm commented Sep 2, 2020

that will be really tricky to match with a regex
We might have to use a hybrid approach, for yaml files - use the yaml parser to find places where rolenames are referenced, then use regex to actually do the replacement - we don't want to yaml.load -> fix -> yaml.dump for each file, because that might not preserve the comments and might slightly alter the whitespace formatting, line wrapping, etc. That way we don't have to worry about r"( *name: | *- name: e.g. name: storage or name: linux-system-roles.storage being used for something other than a reference to the name of a role to include/import.
But for readme and other doc, we will still need to use regex - not sure how else we could do that, or maybe we should edit those files manually rather than scripted

For YAML files, have you or @nhosoi tried ruamel.yaml ? It is suposed to be a YAML parser that preserves formatting and whitespace. I don't think that a regexp-replace based approach will work reliably in the long term.

Or, as I suggested here #13 (comment) some sort of hybrid approach, where we use a yaml parser to identify the exact location in the file that needs to be changed, but use a string replacer to replace just that occurrence in the file - that way, we have the benefit of something that understands yaml as a parser, but we can do the replacement without altering the structure of the file, so we don't have to worry about a yaml parser/writer that understands how to preserve comments, whitespace, line wrapping, etc.

One concern is that we need something that understands the schema of ansible yaml files - that is - how do we identify every place where a role name could be referenced, and the exact syntax?

@pcahyna
Copy link
Member

pcahyna commented Sep 2, 2020

  • With the replacement done, symlinks in tests/roles are removed.

This would be a good candidate for homogenizing repositories before conversion. If we remove the symlinks already in the repos, we will not need to remove them during the conversion process. (It would require an adjustment for the test harness to run the test playbooks with role path set to a directory where the role would be found under its proper name, but this would have other benefits anyway.)

How would this work? If you have in tests_defaults.yml

include_role:
  name: linux-system-roles.rolename

and you are in the rolename/tests directory running tests_default.yml, how would you set ANSIBLE_ROLE_PATHS to find this?

First of all, the test harness needs to clone the role repository as a directory called linux-system-roles.rolename and not some random name (unlike now). Then it can set the role path to the parent of that directory.

@pcahyna
Copy link
Member

pcahyna commented Sep 2, 2020

For YAML files, have you or @nhosoi tried ruamel.yaml ? It is suposed to be a YAML parser that preserves formatting and whitespace. I don't think that a regexp-replace based approach will work reliably in the long term.

@pcahyna , I tried a bit in the beginning. I ran this simple code against in.yml [0]. And got this diff [1] between in.yml and out.yml. I thought it did not respect the ansible indentation and stopped my investigation then...

@nhosoi that's a bit disappointing indeed, but perhaps not a showstopper. The documentation says: "ruamel.yaml doesn’t preserve individual indentations of block sequence items" https://yaml.readthedocs.io/en/latest/detail.html#indentation-of-block-sequences . But you can apparently configure it to conform to our convention: mapping=2, sequence=4, offset=2. It will cause a change in files that do not conform to the convention, but one might call the enforcement of our convention a feature, instead of a bug.

@richm
Copy link
Contributor

richm commented Sep 2, 2020

First of all, the test harness needs to clone the role repository as a directory called linux-system-roles.rolename and not some random name (unlike now). Then it can set the role path to the parent of that directory.

so git clone https://github.com/linux-system-roles/logging linux-system-roles.logging
we should do the same for local development purposes
then we could do

cd linux-system-roles/linux-system-roles.logging/tests
ANSIBLE_ROLES_PATH=~/linux-system-roles TEST_SUBJECTS=file.qcow2 ansible-playbook -vv -i /usr/share/ansible/inventory/standard-inventory-qcow2 tests_defaults.yml

@richm
Copy link
Contributor

richm commented Sep 2, 2020

It will cause a change in files that do not conform to the convention, but one might call the enforcement of our convention a feature, instead of a bug.

We might still have some older roles that have yaml files that cannot pass yamllint because of this - we could use this feature to clean up those old roles.

@nhosoi nhosoi force-pushed the role2collection branch 2 times, most recently from 2fbe6b1 to 2f7492e Compare September 18, 2020 15:44
@nhosoi nhosoi changed the title Adding a role to collection converter tool lsr-role2collection.py Adding a role to collection converter tool lsr_role2collection.py Sep 18, 2020
@nhosoi nhosoi force-pushed the role2collection branch 8 times, most recently from ed35bfa to e1d8698 Compare September 22, 2020 05:54
tox.ini Outdated
@@ -63,7 +63,7 @@ deps =
ansible
jmespath
commands =
bash -e -c 'if [ -x lsr_roles2collections.sh ]; then COLLECTION_SRC_PATH=/var/tmp/src COLLECTION_DEST_PATH=/var/tmp/collections ROLES="certificate kdump kernel_settings logging metrics nbde_client storage timesync tlog tuned" {toxinidir}/lsr_roles2collections.sh; cd /var/tmp/collections/ansible_collections/fedora/system_roles/tests; ANSIBLE_COLLECTIONS_PATHS=/var/tmp/collections:/usr/share/ansible/collections ansible-playbook --become --become-user root -e 'ansible_python_interpreter=/usr/bin/python3' -vvvv tests_default.yml; rm -rf /var/tmp/src /var/tmp/collections; fi'
bash -e -c 'if [ -x lsr_roles2collections.sh ]; then COLLECTION_SRC_PATH=/var/tmp/src COLLECTION_DEST_PATH=/var/tmp/collections ROLES="certificate kdump kernel_settings logging metrics nbde_client timesync tlog tuned" {toxinidir}/lsr_roles2collections.sh; cd /var/tmp/collections/ansible_collections/fedora/system_roles/tests; ANSIBLE_COLLECTIONS_PATHS=/var/tmp/collections:/usr/share/ansible/collections ansible-playbook --become --become-user root -e 'ansible_python_interpreter=/usr/bin/python3' -vvvv tests_default.yml; rm -rf /var/tmp/src /var/tmp/collections; fi'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the problem here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is a well known fact (although I did not know...); Fedora/RHEL/CentOS is not running on the travis machine, but something else (debian?) which runs apt for installing packages. Thus, roles which require redhat specific packages or contain include_var fallbacks with no default.yml have no luck to pass the basic test...

ASK [fedora.system_roles.tlog : install session recording packages] ***********
task path: /var/tmp/collections/ansible_collections/fedora/system_roles/roles/tlog/tasks/main.yml:2
Running apt
<<snip>>

@nhosoi nhosoi force-pushed the role2collection branch 4 times, most recently from 0715e03 to 22e7264 Compare September 22, 2020 16:58
tox.ini Outdated
ansible
jmespath
commands =
bash -e -c 'if [ -x lsr_roles2collections.sh ]; then COLLECTION_SRC_PATH=/var/tmp/src COLLECTION_DEST_PATH=/var/tmp/collections ROLES="certificate kdump kernel_settings logging metrics nbde_client tuned" {toxinidir}/lsr_roles2collections.sh; cd /var/tmp/collections/ansible_collections/fedora/system_roles/tests; ANSIBLE_COLLECTIONS_PATHS=/var/tmp/collections:/usr/share/ansible/collections ansible-playbook --become --become-user root -e 'ansible_python_interpreter=/usr/bin/python3' -vvvv tests_default.yml; rm -rf /var/tmp/src /var/tmp/collections; fi'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be doing this here - tox/travis tests for the auto-maintenance repo should not be pulling in roles repos and running ansible-playbook. That should be done in each individual role.

If we really want to have a test in this repo that does an actual role2collection conversion of an actual role, then perhaps some sort of dummy role in the tests directory - but I think you already did that in the unit tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current travis env, roles: [certificate kdump kernel_settings logging metrics nbde_client tuned] passed the test. Is it ok to keep it? Or do you think there's no value added to the unit test?

Regarding the integration test in each individual role, I agree we should add there to make sure the changes made on the role do not break the collection format or to test the conversion tool if it handles the roles correctly or not in the roles pr. I'm wondering if it's done in travis, do we have the same non-redhat os issue there, too? Or do we have a good workaround?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to keep it? Or do you think there's no value added to the unit test?

I would prefer not to keep it - even though it is working now, it will be a burden to maintain in the future.

Regarding the integration test in each individual role

There are two types of tests that will be run against each individual role:

  • tox/travis test e.g. cd linux-system-roles/logging ; tox -e collection - this test will be developed in the template/ repo and synced to the other roles (or we will use the new https://github.com/linux-system-roles/meta_test/ that @i386x is working on)
  • integration tests - we'll need some way to build ROLENAME into a collection then run ROLENAME/tests/tests*.yml against the collection.

dr = "__" + sr.name.replace(".", replace_dot)
# convert nested subroles to prefix name with subrole_prefix.
if sr.name.startswith("_"):
sr.name = re.sub("^_*", "", sr.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you trying to handle the case where a role may have a sub-role name that begins with _? If so, we should assume the user knows what they are doing - do not transform it to have the subrole_prefix, and add it to the no_readme_link list - that is, assume the user is trying to denote the sub-role as a non-public sub-role.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could there be a case some sub-roles start with _ and others do with __, and we want to make them consistent either starting with _ or __?

Public sub-role vs. non-public sub-role, could there be a good indication? Currently, just metrics and logging have sub-role. Metrics is moving pcp to a separate collection. So, I thought it's better to be a non-public since it's going to go away. For rsyslog, we have no plan to make it public.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be quite rare for role developers to use sub-roles, and the system roles team should provide guidance/pr reviews/etc. about naming. If someone really, really wants to create a sub-role name that begins with _ or __ for the purpose of denoting that it is private, and they are not persuaded by our arguments to the contrary, then we should allow it, and not attempt to rename it when we convert the role to a collection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Then, is it ok to add subrole_prefix (_ or __) only when the sub-role name does not start with _?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Then, is it ok to add subrole_prefix (_ or __) only when the sub-role name does not start with _?

Yes.

@nhosoi nhosoi force-pushed the role2collection branch 3 times, most recently from ad968f5 to 725b99b Compare September 24, 2020 17:20
@@ -889,7 +907,11 @@ def add_rolename(filename, rolename):
if extra.name == "roles":
for sr in extra.iterdir():
# If a role name contains '.', replace it with replace_dot
dr = sr.name.replace(".", replace_dot)
# convert nested subroles to prefix name with subrole_prefix.
if sr.name.startswith(subrole_prefix):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if subrole_prefix and sr.name.startswith(subrole_prefix):

Copy link
Contributor

@richm richm Sep 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm - what about this instead:

dr = sr.name.replace(".", replace_dot)
if subrole_prefix and not dr.startswith(subrole_prefix):
    dr = subrole_prefix + dr

@nhosoi nhosoi force-pushed the role2collection branch 2 times, most recently from 17d20e3 to c80746f Compare September 29, 2020 21:00
…at to collections format

[Conversion script]
usage: lsr_role2collection.py [-h] [--namespace NAMESPACE] [--collection COLLECTION]
                              [--dest-path DEST_PATH] [--src-path SRC_PATH] [--role ROLE]
                              [--replace-dot REPLACE_DOT] [--subrole-prefix SUBROLE_PREFIX]

optional arguments:
  -h, --help            show this help message and exit
  --namespace NAMESPACE
                        Collection namespace; default to fedora
  --collection COLLECTION
                        Collection name; default to system_roles
  --dest-path DEST_PATH
                        Path to parent of collection where role should be migrated
  --src-path SRC_PATH   Path to linux-system-roles
  --role ROLE           Role to convert to collection
  --replace-dot REPLACE_DOT
                        If sub-role name contains dots, replace them with the specified value;
                        default to '_'
  --subrole-prefix SUBROLE_PREFIX
                        If sub-role name does not start with the specified value, change the name to
                        start with the value; default to an empty string

[Helper script]
lsr_roles2collections.sh
Copy link
Contributor

@richm richm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@nhosoi nhosoi merged commit 7100d25 into linux-system-roles:master Sep 29, 2020
@nhosoi
Copy link
Contributor Author

nhosoi commented Sep 29, 2020

Thank you, @richm. Merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants