Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Support container_registry in modules #2291

Closed
wants to merge 21 commits into from

Conversation

edmundmiller
Copy link
Contributor

@edmundmiller edmundmiller commented Oct 13, 2022

Motivated by: https://aws.amazon.com/blogs/hpc/biocontainers-are-now-available-in-amazon-ecr-public-gallery/

They don't have mulled images being pushed yet however, but this should support other urls as well so it's a nice change.

modules/nf-core/bowtie/align/main.nf Outdated Show resolved Hide resolved
modules/nf-core/bowtie/align/main.nf Outdated Show resolved Hide resolved
@edmundmiller
Copy link
Contributor Author

  Unable to find image 'public.ecr.aws/biocontainers/mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3:676c5bcfe34af6097728fea60fb7ea83f94a4a5f-0' locally
  docker: Error response from daemon: repository public.ecr.aws/biocontainers/mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3 not found: name unknown: The repository with name 'mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3' does not exist in the registry with id 'biocontainers'.
  See 'docker run --help'.

:(

@grst
Copy link
Member

grst commented Oct 13, 2022

Is this accessible from outside aws?

@apeltzer
Copy link
Member

When looking at the ECR gallery, @emiller88 and I were unable to find the mulled containers there - need to ask AWS folks on whether this is not mirrored (yet?).

@delagoya - can you share some insights on this?

@edmundmiller edmundmiller changed the title Aws biocontainers refactor: Support docker_url in modules Oct 13, 2022
@mahesh-panchal
Copy link
Member

mahesh-panchal commented Oct 13, 2022

My suggestion is to use a profiles to set a parameter to supply the container registry and then simplify the ternary expression.

profiles {
    singularity {
         singularity.enabled = true
         params.container_registry = 'https://depot.galaxyproject.org/singularity'
     }
     docker {
         docker.enabled = true
         params.container_registry = 'quay.io/biocontainers'
     }
     aws {
          params.container_registry = 'public.ecr.aws'
     }
}

and then the process tag looks like:

process TOOL_SUBTOOL {
    container "${ (params.container_registry ?: 'quay.io' ) + 'image:version--build' }"
}

@edmundmiller
Copy link
Contributor Author

edmundmiller commented Oct 13, 2022

My suggestion is to use a profiles to set a parameter to supply the container registry and then simplify the ternary expression.

This is definitely the cleanest. I think it still keeps that the modules "just work" but if they want to support singularity in a non-nf-core pipeline you'll have to add the singularity profile, if you don't want to convert.

@edmundmiller edmundmiller marked this pull request as ready for review October 13, 2022 10:43
@mahesh-panchal
Copy link
Member

mahesh-panchal commented Oct 13, 2022

Also one can just override from the command-line or -params-file if they want to use the docker registry from singularity or use a private registry.

@edmundmiller
Copy link
Contributor Author

Also one can just override from the command-line or -params-file if they want to use the docker registry from singularity or use a private registry.

Great point! I think that solves it.

@edmundmiller
Copy link
Contributor Author

@mahesh-panchal Thoughts on:

"${ ... }"

vs. just a closure? {}

@mahesh-panchal
Copy link
Member

@mahesh-panchal Thoughts on:

"${ ... }"

vs. just a closure? {}

Container needs a string in this context I think.

@edmundmiller
Copy link
Contributor Author

Container needs a string in this context I think.

98f8156

This works 🤷

@mahesh-panchal
Copy link
Member

If the string quotes are not needed, then you don't need the { } either.

container (params.container_registry ?: 'quay.io' ) + 'mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3:676c5bcfe34af6097728fea60fb7ea83f94a4a5f-0' 

@edmundmiller
Copy link
Contributor Author

If the string quotes are not needed, then you don't need the { } either.

No dice

Unknown process directive: `plus`

Did you mean of these?
        cpus

 -- Check script 'tests/modules/nf-core/bowtie/align/../../../../../modules/nf-core/bowtie/align/main.nf' at line: 6 or see '.nextflow.log' file for more details

@edmundmiller
Copy link
Contributor Author

    def image = "mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3:676c5bcfe34af6097728fea60fb7ea83f94a4a5f-0"
    conda (params.enable_conda ? 'bioconda::bowtie=1.3.0 bioconda::samtools=1.15.1' : null)
    container { (params.container_registry ?: 'quay.io' ) + image }

Thoughts? I think this is clean.

@mahesh-panchal
Copy link
Member

Maybe then

If the string quotes are not needed, then you don't need the { } either.

No dice

Unknown process directive: `plus`

Did you mean of these?
        cpus

 -- Check script 'tests/modules/nf-core/bowtie/align/../../../../../modules/nf-core/bowtie/align/main.nf' at line: 6 or see '.nextflow.log' file for more details

Hmm. Maybe then just ( ) is needed instead of { }. This may be a parsing context thing. I don't see why a closure is needed here.

@grst
Copy link
Member

grst commented Oct 13, 2022

If the string quotes are not needed, then you don't need the { } either.

I think one of the two is needed when using custom config files with -c custom.config to delay the evaluation until the params are ready. But I heard -c is discouraged nowadays anyway?
And is there any difference between "${}" compared to { } as far as the effect is concerned?

@mahesh-panchal
Copy link
Member

    def image = "mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3:676c5bcfe34af6097728fea60fb7ea83f94a4a5f-0"
    conda (params.enable_conda ? 'bioconda::bowtie=1.3.0 bioconda::samtools=1.15.1' : null)
    container { (params.container_registry ?: 'quay.io' ) + image }

Thoughts? I think this is clean.

I'm alright with it. The issue now is to think what this breaks. like pre-image download in nf-core tools.

@apeltzer
Copy link
Member

apeltzer commented Oct 13, 2022

Opened an issue here too - it will also "break" e.g. nf-core modules create in terms of that the biocontainers API will return a queried string with a full path to the container to use --> we can fix that to make it conveniently split base_url and the image id matching what is implemented in this PR, but should not forget to do so once this is merged ;-) -- see: nf-core/tools#1948

@grst
Copy link
Member

grst commented Oct 13, 2022

container { (params.container_registry ?: 'quay.io' ) + image }

Slash after quay.io missing?

@edmundmiller edmundmiller changed the title refactor: Support docker_url in modules refactor: Support container_registry in modules Oct 13, 2022
@edmundmiller
Copy link
Contributor Author

container { (params.container_registry ?: 'quay.io' ) + image }

Slash after quay.io missing?

It's in the test config for now. Idk which place is better to put it, probably in the module?

Also thoughts on quay.io vs quay.io/biocontainers as a default?

@mahesh-panchal
Copy link
Member

If the string quotes are not needed, then you don't need the { } either.

I think one of the two is needed when using custom config files with -c custom.config to delay the evaluation until the params are ready. But I heard -c is discouraged nowadays anyway? And is there any difference between "${}" compared to { } as far as the effect is concerned?

The "${ }" is a groovy expression inside a GString, { } is a closure/anonymous function (and in the context above a function that's just returning a string). The deferring of evaluation is only relevant from configuration files, so since this is from the script context, it does not have an effect.

@apeltzer
Copy link
Member

If we do this change entirely, that means also everyone has to use a separate singularity profile to keep pipelines from converting docker -> singularity images. Something to document to tell users upfront what is happening as well as to bear in mind for potential crashes when some (unaware) users now have to rely on the docker->singularity path again. Had quite a number of people with e.g. squashfs not installed issues in the past 👍🏻

Otherwise looking good to me in general (didn't go through 600+ modules).

pinin4fjords pushed a commit to pinin4fjords/nf-core-modules that referenced this pull request Oct 17, 2022
pinin4fjords pushed a commit to pinin4fjords/nf-core-modules that referenced this pull request Oct 17, 2022
@edmundmiller
Copy link
Contributor Author

They could just manually set the container image if they wanted the docker image right?

@mahesh-panchal
Copy link
Member

Not necessarily. params.container_registry could also be set in institutional profiles as well as common profiles like singularity. So both common use cases are covered. In cases where using neither profile are used then the user would likely be specifying their own config anyway.

@apeltzer
Copy link
Member

Should we then set the default of the singularity profile in nf-core to the depot.galaxy... url? I'm happy with the changes and agree that most of this should not cause issues also giving @mahesh-panchal 's comment above now too 👍🏻

@mahesh-panchal
Copy link
Member

Yes, I think that should be the plan, but also any institutional configs that enable containers need to also be updated to use the parameter too.

  • update template profiles with params.container_registry
  • update institutional configs with params.container_registry.

@apeltzer
Copy link
Member

Would vote for merging this then after some sanity checks? 🤠

@maxulysse
Copy link
Member

Would vote for merging this then after some sanity checks? 🤠

I vote for updating template and releasing tools and then merging this PR

@apeltzer
Copy link
Member

Would vote for merging this then after some sanity checks? 🤠

I vote for updating template and releasing tools and then merging this PR

Or that way around ;-)

@edmundmiller
Copy link
Contributor Author

I updated the template, and then realized I need to update the linter as well. Woof. 🙃

edmundmiller and others added 2 commits October 18, 2022 08:16
find modules -name 'main.nf' -exec sh -c 'cat -s $0 > $0_new' {} \;
find modules -name 'main.nf' -exec rm {} \;
fd main.nf_new modules -x rename '_new' '' -v \;
pinin4fjords pushed a commit that referenced this pull request Oct 25, 2022
* Add test data to config

* Add shinyngs app creation module

* Add secrets

* Complete meta.yml

* Remove trailing whitespace

* prettify

* Point at documentation for optional arguments

* Modernise container definition assuming #2291 goes ahead

* Revert "Modernise container definition assuming #2291 goes ahead"

This reverts commit f1c966b.

* Fix comment

* comment secrets until I figure out how to make them optional (if it's even possible)

* Document secrets usage for shinyapps.io

* Update version of r-shinyngs

* Appease eclint

* Fix mistake in docker container string

* Head off meta map comment from reviewers

* Rename repeated metas

* output app files as single tuple

* Describe meta fields

* appease eclint

* Simplify input channels, shift to shinyngs with bug fixed, fix tests

* spacing tweak

Co-authored-by: nvnieuwk <[email protected]>

* Be explicit with --contrast_stats_assay 1 in ext.args

* Update module meta.yaml

* Correct sample sheet variable name

Co-authored-by: nvnieuwk <[email protected]>
@mahesh-panchal
Copy link
Member

... (and in the context above a function that's just returning a string). The deferring of evaluation is only relevant from configuration files, so since this is from the script context, it does not have an effect.

So I just discovered this is not quite true, and oddly enough from the person I was explaining it to.
https://grst.github.io/bioinformatics/2020/11/28/low-level-nextflow-hacking.html

@ewels
Copy link
Member

ewels commented Nov 3, 2022

Some potential problems:

  • Assumes that BioContainers always has the same base for docker + singularity. This would now be set at pipeline level and not module level, so if they start doing piecemeal different providers in the future, everything will break.
    • eg. if BioContainers moves just FastQC to DockerHub, we have no way of dealing with this.
  • Need to update tools in a way that it will work with both old versions of the syntax and new. It will take a long time for pipelines to update to this syntax (years).

I do not think that we should rush into merging this refactoring. It's a big change and it will cause a lot of pain if it goes ahead before we're fully ready.

@ewels ewels added the WIP Work in progress label Nov 3, 2022
@mahesh-panchal
Copy link
Member

mahesh-panchal commented Nov 3, 2022

Assumes that BioContainers always has the same base for docker + singularity. This would now be set at pipeline level and not module level, so if they start doing piecemeal different providers in the future, everything will break.

* eg. if BioContainers moves _just_ FastQC to DockerHub, we have no way of dealing with this.

The solution would be to change the default path in the module, but that would also mean needing to write a custom config with process specific container overwrite too as well as the params if needed to change it.

@edmundmiller
Copy link
Contributor Author

I do not think that we should rush into merging this refactoring. It's a big change and it will cause a lot of pain if it goes ahead before we're fully ready.

Agreed!

I don't think we're going to be able to cover every possible case before they come up, so I vote we cross that bridge when we get there on the first point. But feel free to drop concerns of things that could come up!

I think we're waiting on nextflow-io/nextflow#3340 just so we're on the same page, so this doesn't sit for a year or two in limbo

I think for the moment if anyone really wants ECR support is to add a pipeline specific aws_tower config. Since the mulled containers aren't on there, I don't know how much benefit they'd get anyway.

@muffato
Copy link
Member

muffato commented Nov 3, 2022

Assumes that BioContainers always has the same base for docker + singularity. This would now be set at pipeline level and not module level, so if they start doing piecemeal different providers in the future, everything will break.

  • eg. if BioContainers moves just FastQC to DockerHub, we have no way of dealing with this.

And the fact is that BioContainers are already across quay.io and DockerHub. Legitimate modules may well refer to DockerHub for BioContainers

@edmundmiller
Copy link
Contributor Author

Closing for now in favor of something along the lines of https://github.com/adamrtalbot/nextflow-customregistry

Leaving the branch just in case.

anoronh4 pushed a commit to anoronh4/add_submodule that referenced this pull request Apr 27, 2023
* Add test data to config

* Add shinyngs app creation module

* Add secrets

* Complete meta.yml

* Remove trailing whitespace

* prettify

* Point at documentation for optional arguments

* Modernise container definition assuming nf-core/modules#2291 goes ahead

* Revert "Modernise container definition assuming nf-core/modules#2291 goes ahead"

This reverts commit f1c966b93553ea47122e0581bdf6b5c0ae2ea8d1.

* Fix comment

* comment secrets until I figure out how to make them optional (if it's even possible)

* Document secrets usage for shinyapps.io

* Update version of r-shinyngs

* Appease eclint

* Fix mistake in docker container string

* Head off meta map comment from reviewers

* Rename repeated metas

* output app files as single tuple

* Describe meta fields

* appease eclint

* Simplify input channels, shift to shinyngs with bug fixed, fix tests

* spacing tweak

Co-authored-by: nvnieuwk <[email protected]>

* Be explicit with --contrast_stats_assay 1 in ext.args

* Update module meta.yaml

* Correct sample sheet variable name

Co-authored-by: nvnieuwk <[email protected]>
@edmundmiller edmundmiller deleted the aws-biocontainers branch August 29, 2023 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
update module WIP Work in progress
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

8 participants