Add `sample_name` column in samplesheet compatibility #19

sgsutcliffe · 2024-11-06T21:31:34Z

Modified the template for the input samplesheet.csv file to include a sample_name column in addition to sample, in line with changes to IRIDA-Next, as seen with the speciesabundance pipeline and staramrnf. This means that the output files and the sample name will be changed to sample_name if a sample_name is provided. If fetchdatairidanext is being run locally, then sample_name can be left blank.

Made a few changes:
- If sample_name is provided it will be prefixed to reads file name
- If sample_name is provided it will also be included in failure report (if generated)
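
A rough sketch of the renaming behaviour described above, written in Python for illustration only (the pipeline itself is Nextflow). The function names, the underscore separator, and the choice of `_` as the replacement for non-alphanumeric characters are assumptions, not the actual implementation; the `rename` flag stands in for the `rename_with_samplename` parameter.

```python
import re


def sanitize_sample_name(sample_name: str) -> str:
    """Replace every character that is not a letter, digit, or underscore.

    Assumption: '_' is used as the replacement character.
    """
    return re.sub(r"[^A-Za-z0-9_]", "_", sample_name)


def output_read_name(read_file: str, sample_name: str = "", rename: bool = True) -> str:
    """Return the reads file name, prefixed with sample_name when one is given.

    A blank sample_name (e.g. a local run) or rename=False leaves the
    accession-based file name untouched.
    """
    cleaned = sample_name.strip()
    if not rename or not cleaned:
        return read_file
    # Assumption: prefix and original file name are joined with an underscore.
    return f"{sanitize_sample_name(cleaned)}_{read_file}"
```

With these assumptions, a blank sample_name keeps the accession-based name, and a name such as `my sample#1` is cleaned before being prefixed.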

PR checklist

sgsutcliffe · 2024-11-06T21:36:51Z

SOLVED

Wanted to push the changes to see if an issue I had testing locally would be replicated. It looks like there is an issue with fasterq-dump, or possibly prefetch: the command.sh runs, and running the command via nextflow run main.nf works, for test("Include sample_name in samplesheet").

sgsutcliffe · 2024-11-07T19:30:59Z

It turns out the issue was with fasterq-dump. The issue is documented here, and the solution is to roll back the version of fasterq-dump, as was done for fetchngs PR 261.

github-actions · 2024-11-07T19:51:53Z

`nf-core pipelines lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit da65a5a

+| ✅ 126 tests passed       |+
#| ❔  32 tests were ignored |#
!| ❗   5 tests had warnings |!

❗ Test warnings:

files_exist - File not found: conf/igenomes_ignored.config
nextflow_config - nf-validation has been detected in the pipeline. Please migrate to nf-schema: https://nextflow-io.github.io/nf-schema/latest/migration_guide/
nextflow_config - Config manifest.version should end in dev: 1.2.0
schema_lint - Schema $id should be https://raw.githubusercontent.com/phac-nml/fetchdatairidanext/master/nextflow_schema.json
Found https://raw.githubusercontent.com/phac-nml/fetchdatairidanext/main/nextflow_schema.json
nfcore_yml - nf-core version not set in .nf-core.yml

❔ Tests ignored:

files_exist - File is ignored: assets/nf-core-fetchdatairidanext_logo_light.png
files_exist - File is ignored: docs/images/nf-core-fetchdatairidanext_logo_light.png
files_exist - File is ignored: docs/images/nf-core-fetchdatairidanext_logo_dark.png
files_exist - File is ignored: .github/workflows/awstest.yml
files_exist - File is ignored: .github/workflows/awsfulltest.yml
files_exist - File is ignored: CODE_OF_CONDUCT.md
files_exist - File is ignored: lib/Utils.groovy
files_exist - File is ignored: lib/WorkflowMain.groovy
files_exist - File is ignored: lib/NfcoreTemplate.groovy
files_exist - File is ignored: lib/WorkflowFetchdatairidanext.groovy
nextflow_config - Config variable ignored: manifest.name
nextflow_config - Config variable ignored: manifest.homePage
nextflow_config - Config variable ignored: params.max_cpus
files_unchanged - File does not exist: CODE_OF_CONDUCT.md
files_unchanged - File ignored due to lint config: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md
files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/bug_report.yml
files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/feature_request.yml
files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
files_unchanged - File ignored due to lint config: .github/workflows/branch.yml
files_unchanged - File ignored due to lint config: .github/workflows/linting.yml
files_unchanged - File ignored due to lint config: assets/email_template.html
files_unchanged - File ignored due to lint config: assets/email_template.txt
files_unchanged - File ignored due to lint config: assets/sendmail_template.txt
files_unchanged - File does not exist: assets/nf-core-fetchdatairidanext_logo_light.png
files_unchanged - File does not exist: docs/images/nf-core-fetchdatairidanext_logo_light.png
files_unchanged - File does not exist: docs/images/nf-core-fetchdatairidanext_logo_dark.png
files_unchanged - File ignored due to lint config: docs/README.md
files_unchanged - File ignored due to lint config: .gitignore or .prettierignore
actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/fetchdatairidanext/fetchdatairidanext/.github/workflows/awstest.yml
actions_awsfulltest - actions_awsfulltest
pipeline_name_conventions - pipeline_name_conventions

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: modules.json
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: docs/images/nf-core-fetchdatairidanext_logo.png
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: Singularity
files_exist - File not found check: lib/nfcore_external_java_deps.jar
files_exist - File not found check: .travis.yml
nextflow_config - Found nf-validation plugin
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - nextflow.config contains configuration profile test
nextflow_config - Config default value correct: params.rename_with_samplename= true
nextflow_config - Config default value correct: params.max_cpus= 4
nextflow_config - Config default value correct: params.max_memory= 2.GB
nextflow_config - Config default value correct: params.max_time= 1.h
nextflow_config - Config default value correct: params.publish_dir_mode= copy
nextflow_config - Config default value correct: params.validate_params= true
nextflow_config - Config default value correct: params.max_jobs_with_network_connections= 1
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
readme - README Zenodo placeholder was replaced with DOI.
pipeline_todos - No TODO strings found
plugin_includes - No wrong validation plugin imports have been found
template_strings - Did not find any Jinja template strings (0 files)
schema_lint - Schema lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - assets/multiqc_config.yml found and not ignored.
multiqc_config - assets/multiqc_config.yml contains report_section_order
multiqc_config - assets/multiqc_config.yml contains export_plots
multiqc_config - assets/multiqc_config.yml contains report_comment
multiqc_config - assets/multiqc_config.yml follows the ordering scheme of the minimally required plugins.
multiqc_config - assets/multiqc_config.yml contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
base_config - conf/base.config found and not ignored.
base_config - CUSTOM_DUMPSOFTWAREVERSIONS found in conf/base.config and Nextflow scripts.
modules_config - conf/modules.config found and not ignored.
modules_config - CUSTOM_DUMPSOFTWAREVERSIONS found in conf/modules.config and Nextflow scripts.
modules_config - SRATOOLS_PREFETCH found in conf/modules.config and Nextflow scripts.
modules_config - SRATOOLS_FASTERQDUMP found in conf/modules.config and Nextflow scripts.
nfcore_yml - Repository type in .nf-core.yml is valid: pipeline

Run details

nf-core/tools version 3.0.2
Run at 2024-11-22 01:44:34

sgsutcliffe · 2024-11-14T21:34:33Z

It's working in IRIDA-Next and the parameter is working!

Example

kylacochrane

Amazing Job Steven! I love the option to rename the accession files - this will help users keep track of all their files in IRIDA Next!!

I have a few minor suggestions in the review.

I also have one comment about when I tested in IRIDA Next:
When I keep the --rename_with_samplename default set to true everything works perfectly.
However, when I set the --rename_with_samplename to false in IRIDA Next, I end up with three fastq files downloaded.

Which is odd and may cause downstream pipeline confusion (i.e. it becomes an option for selection in mikrokondo long reads input options ... )

README.md

workflows/fetchdatairidanext.nf

sgsutcliffe · 2024-11-19T18:15:45Z

Good eye @kylacochrane! The issue seems to be due to the fact that it is an accession with both paired and single-end reads (see the ENA), as @apetkau suggested. So the question remains, "Why is renaming the files with --output in fasterq-dump breaking?" Looking in the Nextflow work directory, the single-end reads get downloaded but without a *.fastq extension when using --output. It seems to be related to this sra-tools issue.

I decided to modify the module (we had already deviated from the original nf-core module) in dc85968.

apetkau

Thanks so much for all your great work @sgsutcliffe . This is amazing 😄 .

I just have a few in-line comments and I also am going to run this with some test data.

modules/local/sratools/fasterqdump/main.nf

modules/local/prefetchchecker/main.nf

tests/pipelines/fetchdatairidanext.nf.test

apetkau

Thanks so much for all your work on this Steven 😄

In response to Kyla's comment, the expected behaviour is to include all 3 fastq files both when renaming with sample names and when not renaming, since one of the fastq files contains unpaired reads.
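
To illustrate why three files are expected, here is a small Python sketch (not part of the pipeline) that sorts the outputs for one accession into paired mates and unpaired reads. It assumes the usual fasterq-dump naming, where paired mates end in `_1.fastq` and `_2.fastq` and unpaired reads have no mate suffix, optionally followed by `.gz`; the function name and the accession `ACC1` are hypothetical.

```python
import re


def classify_fastq(filenames, accession):
    """Group fastq files for one accession into forward, reverse, and unpaired."""
    groups = {"forward": [], "reverse": [], "unpaired": []}
    # Matches ACC_1.fastq, ACC_2.fastq, or ACC.fastq, each optionally gzipped.
    pattern = re.compile(rf"^{re.escape(accession)}(?:_([12]))?\.fastq(?:\.gz)?$")
    for name in filenames:
        match = pattern.match(name)
        if match is None:
            continue  # not a fastq file belonging to this accession
        mate = match.group(1)
        if mate == "1":
            groups["forward"].append(name)
        elif mate == "2":
            groups["reverse"].append(name)
        else:
            groups["unpaired"].append(name)
    return groups


files = ["ACC1_1.fastq.gz", "ACC1_2.fastq.gz", "ACC1.fastq.gz"]
print(classify_fastq(files, "ACC1"))
```

For an accession that mixes paired and single-end reads, all three groups end up with one file each.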

This is all working now, as shown in the screenshot below:

Everything is working great. Thanks again 😄

kylacochrane

Looks amazing Steven! Well done!

README.md

assets/schema_input.json

docs/usage.md

Add sample_name test is still not working locally

c964c80

sgsutcliffe added 3 commits November 7, 2024 14:37

Fixes isses with fasterq-dump

101bbe5

Added another test, test read names with sample_names

5185a75

Fix nfcore lint issues

46bb5fa

sgsutcliffe added 12 commits November 7, 2024 15:10

Modified linting with editorconfig-checker

0849cc7

Fix lint comment yml

28fa88f

Providing a NCBI user setting file for tests

f025b1b

Providing a NCBI user setting file for tests

201d5e4

Somewhere along the way prefetch got deleted

b55683d

Add sample_name removal of non-alphanumeric characters and nf-test fo…

9e1e92a

…r sample_name

update documenantation, and param to override sample_name rename

d6093db

Update usage.md

f1a50f3

Add new param to schema

0dd868a

Updating UI for IRIDA-Next

8206458

Made it pretty

c877cd4

Improved description for IRIDA-Next parameter

86bccf9

Update version for minor release

1238b14

sgsutcliffe requested review from apetkau, emarinier and kylacochrane November 15, 2024 15:20

kylacochrane requested changes Nov 19, 2024

README.md

README.md (outdated)

workflows/fetchdatairidanext.nf

apetkau requested changes Nov 19, 2024

modules/local/sratools/fasterqdump/main.nf (outdated)

modules/local/prefetchchecker/main.nf (outdated)

tests/pipelines/fetchdatairidanext.nf.test (outdated)

sgsutcliffe added 3 commits November 20, 2024 09:55

Addressing accessions with both pair and single end reads

dc85968

Updated tests for previous commit

3f54040

Clarify text

5ad42b5

sgsutcliffe added 4 commits November 20, 2024 12:28

Clean up conditional operator

02ecdb5

Fixed container URL

18f3eb3

Improved readibility

261f455

return the test, but remove section not used

09ea6f0

apetkau approved these changes Nov 20, 2024

kylacochrane approved these changes Nov 20, 2024

emarinier requested changes Nov 20, 2024

README.md (outdated)

assets/schema_input.json (outdated)

docs/usage.md

docs/usage.md (outdated)

docs/usage.md (outdated)

sgsutcliffe added 4 commits November 21, 2024 20:32

Claify parameter description

3ba740e

Sample_name description

dbe8bf6

Missing filepath

36f4bbb

Fixed wording

da65a5a

emarinier approved these changes Nov 22, 2024

sgsutcliffe merged commit aa8d37f into dev Nov 22, 2024
5 checks passed

sgsutcliffe deleted the add-sample-name branch November 22, 2024 19:25
