Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sample_name column in samplesheet compatibility #19

Merged
merged 28 commits into from
Nov 22, 2024
Merged

Conversation

sgsutcliffe
Copy link
Contributor

@sgsutcliffe sgsutcliffe commented Nov 6, 2024

Modified the template for input samplesheet.csv file to include the sample_name column in addition to sample in-line with changes to IRIDA-Next update as seen with the speciesabundance pipeline and staramrnf. What this means is that the output files and the sample name will be changed to sample_name if a sample_name is called. If ftechdatairidanext is being locally then the sample_name can be left blank.

Made a few changes:
- If sample_name is provided it will be prefixed to reads file name
- If sample_name is provided it will also be included in failure report (if generated)

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Add nf-test to test new feature
  • Usage Documentation in docs/usage.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).
  • Tested out in IRIDA-Next locally
  • Tested out in Azure

@sgsutcliffe
Copy link
Contributor Author

sgsutcliffe commented Nov 6, 2024

SOLVED

Wanted to push the changes to see if an issue I had testing locally would be replicated. It looks like there is an issue with fasterq-dump possibly prefetch where the command.sh runs, and running the command via nextflow run main.nf works for test("Include sample_name in samplesheet") .

@sgsutcliffe
Copy link
Contributor Author

It turns out the issue was with fasterq-dump. The issue is documented here and the solution is to roll back the version of fasterq-dump. As was done for fetchngs PR 261.

Copy link

github-actions bot commented Nov 7, 2024

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit da65a5a

+| ✅ 126 tests passed       |+
#| ❔  32 tests were ignored |#
!| ❗   5 tests had warnings |!

❗ Test warnings:

❔ Tests ignored:

  • files_exist - File is ignored: assets/nf-core-fetchdatairidanext_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-fetchdatairidanext_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-fetchdatairidanext_logo_dark.png
  • files_exist - File is ignored: .github/workflows/awstest.yml
  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • files_exist - File is ignored: CODE_OF_CONDUCT.md
  • files_exist - File is ignored: lib/Utils.groovy
  • files_exist - File is ignored: lib/WorkflowMain.groovy
  • files_exist - File is ignored: lib/NfcoreTemplate.groovy
  • files_exist - File is ignored: lib/WorkflowFetchdatairidanext.groovy
  • nextflow_config - Config variable ignored: manifest.name
  • nextflow_config - Config variable ignored: manifest.homePage
  • nextflow_config - Config variable ignored: params.max_cpus
  • files_unchanged - File does not exist: CODE_OF_CONDUCT.md
  • files_unchanged - File ignored due to lint config: LICENSE or LICENSE.md or LICENCE or LICENCE.md
  • files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md
  • files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/bug_report.yml
  • files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/feature_request.yml
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: .github/workflows/branch.yml
  • files_unchanged - File ignored due to lint config: .github/workflows/linting.yml
  • files_unchanged - File ignored due to lint config: assets/email_template.html
  • files_unchanged - File ignored due to lint config: assets/email_template.txt
  • files_unchanged - File ignored due to lint config: assets/sendmail_template.txt
  • files_unchanged - File does not exist: assets/nf-core-fetchdatairidanext_logo_light.png
  • files_unchanged - File does not exist: docs/images/nf-core-fetchdatairidanext_logo_light.png
  • files_unchanged - File does not exist: docs/images/nf-core-fetchdatairidanext_logo_dark.png
  • files_unchanged - File ignored due to lint config: docs/README.md
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/fetchdatairidanext/fetchdatairidanext/.github/workflows/awstest.yml
  • actions_awsfulltest - actions_awsfulltest
  • pipeline_name_conventions - pipeline_name_conventions

✅ Tests passed:

Run details

  • nf-core/tools version 3.0.2
  • Run at 2024-11-22 01:44:34

@sgsutcliffe
Copy link
Contributor Author

It's working in IRIDA-Next and the parameter is working!

Example

image

Copy link
Collaborator

@kylacochrane kylacochrane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing Job Steven! I love the option to rename the accession files - this will help users keep track of all their files in IRIDA Next!!

I have a few minor suggestions in the review.

I also have one comment about when I tested in IRIDA Next:
When I keep the --rename_with samplename default set to true everything works perfectly.
However, when I set the --rename_with_samplename to false in IRIDA Next, I end up with three fastq files downloaded.
image
Which is odd and may cause downstream pipeline confusion (i.e. it becomes an option for selection in mikrokondo long reads input options ... )

README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
workflows/fetchdatairidanext.nf Show resolved Hide resolved
@sgsutcliffe
Copy link
Contributor Author

sgsutcliffe commented Nov 19, 2024

Amazing Job Steven! I love the option to rename the accession files - this will help users keep track of all their files in IRIDA Next!!

I have a few minor suggestions in the review.

I also have one comment about when I tested in IRIDA Next: When I keep the --rename_with samplename default set to true everything works perfectly. However, when I set the --rename_with_samplename to false in IRIDA Next, I end up with three fastq files downloaded. image Which is odd and may cause downstream pipeline confusion (i.e. it becomes an option for selection in mikrokondo long reads input options ... )

Good eye @kylacochrane! The issue seems to be due to the fact that it is a accession with both paired and single-end reads (see th ENA) as @apetkau suggested. So the question remains, "Why is renaming the files with --output in fasterq-dump breaking?". Looking in the nextflow work directory the reads for the single-end reads get downloaded but without a *.fastq extension when using --output. It seems to be related to this sra-tools issue

I decided to modify the module (we already deviated away from the original nf-core module) here dc85968

Copy link
Member

@apetkau apetkau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for all your great work @sgsutcliffe . This is amazing 😄 .

I just have a few in-line comments and I also am going to run this with some test data.

modules/local/sratools/fasterqdump/main.nf Outdated Show resolved Hide resolved
modules/local/prefetchchecker/main.nf Outdated Show resolved Hide resolved
tests/pipelines/fetchdatairidanext.nf.test Outdated Show resolved Hide resolved
Copy link
Member

@apetkau apetkau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for all your work on this Steven 😄

In response to Kyla's comment, the expected behaviour is to include all 3 fastq files in the case of renaming with sample names and not renaming. Since one of the fastq files is unpaired reads.

This is all working now, as shown in the screenshot below:

image

Everything is working great. Thanks again 😄

Copy link
Collaborator

@kylacochrane kylacochrane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks amazing Steven! Well done!

README.md Outdated Show resolved Hide resolved
assets/schema_input.json Outdated Show resolved Hide resolved
docs/usage.md Show resolved Hide resolved
docs/usage.md Outdated Show resolved Hide resolved
docs/usage.md Outdated Show resolved Hide resolved
@sgsutcliffe sgsutcliffe merged commit aa8d37f into dev Nov 22, 2024
5 checks passed
@sgsutcliffe sgsutcliffe deleted the add-sample-name branch November 22, 2024 19:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants