Use Biocontainers API for creating modules #875

Closed
ewels opened this issue Mar 14, 2021 · 2 comments · Fixed by #1110
Labels
command line tools Anything to do with the cli interfaces

Comments

ewels commented Mar 14, 2021

Putting down an idea into an issue for nf-core create so I don't forget (but probably too much work to get into PR #869).

Biocontainers itself has quite a nice API that we can use. It's documented here: https://api.biocontainers.pro/ga4gh/trs/v2/ui/#/GA4GH/tools_id_get

For example, we can query MultiQC:

curl -X GET "https://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc" -H  "accept: application/json"
JSON Response
{
  "contains": [],
  "description": "Multiqc aggregates results from multiple bioinformatics analyses across many samples into a single report. it searches a given directory for analysis logs and compiles a html report. i is a general use tool, perfect for summarising the output from numerous bioinformatics tools.",
  "id": "multiqc",
  "identifiers": [
    "biotools:multiqc",
    "PMID:27312411"
  ],
  "license": "GPL-3.0",
  "name": "multiqc",
  "organization": "biocontainers",
  "pulls": 3602004,
  "tool_tags": [
    "High-Throughput Nucleotide Sequencing",
    "Quality Control",
    "Computational Biology",
    "Sequencing",
    "Bioinformatics",
    "RNA-Seq",
    "Transcriptomics"
  ],
  "tool_url": "https://github.com/ewels/MultiQC",
  "toolclass": {
    "description": "CommandLineTool",
    "id": "0",
    "name": "CommandLineTool"
  },
  "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc",
  "versions": [
    {
      "id": "multiqc-1.0",
      "meta_version": "1.0",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.0"
    },
    {
      "id": "multiqc-1.5",
      "meta_version": "1.5",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.5"
    },
    {
      "id": "multiqc-1.4",
      "meta_version": "1.4",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.4"
    },
    {
      "id": "multiqc-0.9.1a0",
      "meta_version": "0.9.1a0",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-0.9.1a0"
    },
    {
      "id": "multiqc-1.3",
      "meta_version": "1.3",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.3"
    },
    {
      "id": "multiqc-1.6a0",
      "meta_version": "1.6a0",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.6a0"
    },
    {
      "id": "multiqc-1.5a",
      "meta_version": "1.5a",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.5a"
    },
    {
      "id": "multiqc-1.2",
      "meta_version": "1.2",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.2"
    },
    {
      "id": "multiqc-1.1",
      "meta_version": "1.1",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.1"
    },
    {
      "id": "multiqc-1.7",
      "meta_version": "1.7",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.7"
    },
    {
      "id": "multiqc-1.6",
      "meta_version": "1.6",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.6"
    },
    {
      "id": "multiqc-1.8",
      "meta_version": "1.8",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.8"
    },
    {
      "id": "multiqc-1.9",
      "meta_version": "1.9",
      "name": "multiqc",
      "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.9"
    }
  ]
}

Using this API call gives us several things in a single shot:

  • Version information
  • Tool description
  • Tool homepage
  • Biocontainers URL
  • Tool identifiers
  • Licence
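
As a rough sketch of how those fields could be pulled out in Python: the field names below come straight from the JSON response above, but `fetch_tool` and `summarise_tool` are hypothetical helper names (not existing nf-core code), and the sample dict is an abridged copy of the MultiQC response so that the snippet runs offline.

```python
import json
import urllib.request

# Base URL taken from the curl example above
TRS_BASE = "https://api.biocontainers.pro/ga4gh/trs/v2"


def fetch_tool(tool_id: str) -> dict:
    """GET /tools/<tool_id> and return the parsed JSON (needs network access)."""
    req = urllib.request.Request(
        f"{TRS_BASE}/tools/{tool_id}", headers={"accept": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def summarise_tool(tool: dict) -> dict:
    """Pick out the fields listed above from a tool response."""
    return {
        "versions": [v["meta_version"] for v in tool.get("versions", [])],
        "description": tool.get("description"),
        "homepage": tool.get("tool_url"),
        "biocontainers_url": tool.get("url"),
        "identifiers": tool.get("identifiers", []),
        "licence": tool.get("license"),
    }


# Abridged copy of the MultiQC response, used here instead of a live request
sample = {
    "id": "multiqc",
    "identifiers": ["biotools:multiqc", "PMID:27312411"],
    "license": "GPL-3.0",
    "tool_url": "https://github.com/ewels/MultiQC",
    "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc",
    "versions": [
        {"id": "multiqc-1.8", "meta_version": "1.8", "name": "multiqc"},
        {"id": "multiqc-1.9", "meta_version": "1.9", "name": "multiqc"},
    ],
}

summary = summarise_tool(sample)
print(summary["versions"], summary["licence"], summary["homepage"])
```

With network access, `summarise_tool(fetch_tool("multiqc"))` should give the same structure for the full response. Note that the versions come back in no particular order, so they would need sorting before being shown to a user.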

It also gives URLs for each version which we can query (NOTE: it lists http, but this doesn't work; it needs to be https).
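
One way to work around the http URLs would be to rewrite the scheme before querying. This is only a sketch: `force_https` is a made-up helper name, and it assumes that nothing but the scheme needs to change.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url: str) -> str:
    """Return the URL with an http scheme swapped for https; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


# Version URL exactly as listed in the API response above
listed = "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.9"
print(force_https(listed))
```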

For example, MultiQC 1.9:

curl -X GET "https://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.9" -H  "accept: application/json"
JSON Response
{
  "id": "multiqc-1.9",
  "images": [
    {
      "downloads": 48596,
      "image_name": "multiqc==1.9--pyh9f0ad1d_0",
      "image_type": "Conda",
      "registry_host": "http://anaconda.org/",
      "size": 862231,
      "updated": "2020-05-30T00:00:00Z"
    },
    {
      "downloads": 0,
      "image_name": "quay.io/biocontainers/multiqc:1.9--pyh9f0ad1d_0",
      "image_type": "Docker",
      "registry_host": "quay.io/",
      "size": 194294593,
      "updated": "2020-05-30T00:00:00Z"
    },
    {
      "image_name": "https://depot.galaxyproject.org/singularity/multiqc:1.9--pyh9f0ad1d_0",
      "image_type": "Singularity",
      "registry_host": "depot.galaxyproject.org/singularity/",
      "size": 189788160,
      "updated": "2020-05-31T04:44:00Z"
    },
    {
      "downloads": 48596,
      "image_name": "multiqc==1.9--py_1",
      "image_type": "Conda",
      "registry_host": "http://anaconda.org/",
      "size": 862231,
      "updated": "2020-05-30T00:00:00Z"
    },
    {
      "downloads": 0,
      "image_name": "quay.io/biocontainers/multiqc:1.9--py_1",
      "image_type": "Docker",
      "registry_host": "quay.io/",
      "size": 179981913,
      "updated": "2020-07-28T00:00:00Z"
    },
    {
      "image_name": "https://depot.galaxyproject.org/singularity/multiqc:1.9--py_1",
      "image_type": "Singularity",
      "registry_host": "depot.galaxyproject.org/singularity/",
      "size": 176119808,
      "updated": "2020-07-29T06:19:00Z"
    }
  ],
  "meta_version": "1.9",
  "name": "multiqc",
  "url": "http://api.biocontainers.pro/ga4gh/trs/v2/tools/multiqc/versions/multiqc-1.9"
}

This gives us:

  • Details of each build for a given version (as there can be many)
  • Conda, Docker and Singularity URLs in one shot (with no guessing)
  • Doesn't assume a given image registry (nearly everything is quay.io now, but that could change in the future)
  • Metadata: updated timestamps, size, downloads
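
To illustrate the "no guessing" point, here is a minimal sketch that groups the images of a version response by build. It assumes that the build tag is whatever follows the last `--` in `image_name`, which holds for the MultiQC 1.9 example above but is not something the API documents as far as I know. `build_tag` and `group_images_by_build` are hypothetical names.

```python
from collections import defaultdict


def build_tag(image_name: str) -> str:
    """Return the text after the last '--' (assumed build tag), e.g. 'py_1'."""
    return image_name.rsplit("--", 1)[-1]


def group_images_by_build(version: dict) -> dict:
    """Map build tag -> {image_type: image_name} for one version response."""
    builds = defaultdict(dict)
    for image in version.get("images", []):
        builds[build_tag(image["image_name"])][image["image_type"]] = image["image_name"]
    return dict(builds)


# Abridged copy of the MultiQC 1.9 response (only the fields used here)
sample_version = {
    "id": "multiqc-1.9",
    "images": [
        {"image_name": "multiqc==1.9--pyh9f0ad1d_0", "image_type": "Conda"},
        {"image_name": "quay.io/biocontainers/multiqc:1.9--pyh9f0ad1d_0", "image_type": "Docker"},
        {"image_name": "https://depot.galaxyproject.org/singularity/multiqc:1.9--pyh9f0ad1d_0", "image_type": "Singularity"},
        {"image_name": "multiqc==1.9--py_1", "image_type": "Conda"},
        {"image_name": "quay.io/biocontainers/multiqc:1.9--py_1", "image_type": "Docker"},
        {"image_name": "https://depot.galaxyproject.org/singularity/multiqc:1.9--py_1", "image_type": "Singularity"},
    ],
}

builds = group_images_by_build(sample_version)
for tag, images in builds.items():
    print(tag, images["Conda"], images["Docker"], images["Singularity"])
```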

My thought is that we could query this when running nf-core modules create instead of bioconda / quay.io. I think that this would be more accurate as well as giving us a bunch of additional information to put into meta.yml about the tool.

Ideally, we could either use an exact build tag provided on the command line (fail if not found) or use a questionary select list as done in nf-core launch. This would be very precise (select first from the available versions, then from the builds available for that version) and also super user-friendly.
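
A rough sketch of that selection logic, which could be called once for the version and again for the build. `choose_option` and its `exact` / `ask` parameters are made up for illustration and are not existing nf-core code. The interactive branch uses `questionary.select(...).ask()` and only imports questionary when it is actually needed.

```python
def choose_option(options, exact=None, ask=None):
    """Return one entry from options.

    If exact is given (e.g. a build tag from the command line) it must match
    an available option, otherwise we fail. If not, fall back to asking the user.
    """
    options = sorted(options)
    if exact is not None:
        if exact not in options:
            raise ValueError(f"'{exact}' not found. Available: {', '.join(options)}")
        return exact
    if ask is None:
        # Imported lazily so the non-interactive path has no extra dependency
        import questionary

        def ask(choices):
            return questionary.select("Select one:", choices=choices).ask()

    return ask(options)


# Non-interactive use with the two MultiQC 1.9 builds shown above
available = ["pyh9f0ad1d_0", "py_1"]
print(choose_option(available, exact="py_1"))
```

Passing `ask` as a plain function keeps the prompt swappable, which should also make this easier to test without a terminal.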

Phil

ewels added the "command line tools" and "DSL2" labels Mar 14, 2021
ewels mentioned this issue Mar 14, 2021
ewels commented Mar 21, 2021

The major issue here is that the BioContainers website / API seems to lag well behind reality and is missing many packages. We need to investigate why before we can use it.

KevinMenden mentioned this issue Jun 15, 2021
KevinMenden linked a pull request Jun 15, 2021 that will close this issue

KevinMenden commented

Closing this now. It should be re-opened, though, if we want a fancier way of selecting containers/versions in nf-core modules create. Or maybe open a new issue then.
