Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Longread only functionality #718

Open
wants to merge 39 commits into
base: dev
Choose a base branch
from
Open

Conversation

muabnezor
Copy link
Contributor

@muabnezor muabnezor commented Nov 28, 2024

This PR adds long-read only functionality to mag.

closes #662, #659, #275

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/mag branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@muabnezor muabnezor added the WIP Work in progress label Nov 28, 2024
@muabnezor muabnezor requested a review from jfy133 November 28, 2024 12:28
Copy link

github-actions bot commented Nov 28, 2024

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.1.0.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

@muabnezor muabnezor changed the base branch from master to dev November 28, 2024 12:29
Copy link
Collaborator

@d4straub d4straub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only had a short look, sorry, but I just wanted to express my gratitude that you tackle long read assembly!

nextflow.config Show resolved Hide resolved
conf/modules.config Outdated Show resolved Hide resolved
@muabnezor
Copy link
Contributor Author

I only had a short look, sorry, but I just wanted to express my gratitude that you tackle long read assembly!

My pleasure hehe. Still WIP. I have to tidy up the code and run some validation on real data, but we're getting there!

@muabnezor
Copy link
Contributor Author

So far in this PR I have

  • changed the validation schema to allow for long read only input
  • Added a host removal track for long reads using minimap2 as aligner
  • Added Flye and metaMDBG as longread assemblers
  • refactoring of code so that all the assembly code is moved to a subworkflow

Downstream from the assembly, except for the binning preparation, the long read and short read assemblies are treated the same.

@muabnezor muabnezor removed the WIP Work in progress label Dec 12, 2024
@muabnezor
Copy link
Contributor Author

Me and some colleagues are running some tests on ont data, and I think this works as expected. We will continue testing and play around with the parameters, but if anyone wants to have a look through the code and suggest improvements, that would be awesome!

@jfy133 jfy133 linked an issue Jan 20, 2025 that may be closed by this pull request
Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I'm not seeing anything obvious! But I would still like to run a test (can't do that today).

Once commetns (mostly questions) addressed, I will start the manual tests :)

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CITATIONS.md Outdated Show resolved Hide resolved
CITATIONS.md Outdated Show resolved Hide resolved
BOWTIE2_ASSEMBLY_BUILD ( assemblies )
ch_versions = Channel.empty()
ch_multiqc_files = Channel.empty()
// multiple symlinks to the same assembly -> use first of sorted list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// multiple symlinks to the same assembly -> use first of sorted list
// multiple symlinks to the same assembly -> use first of sorted list

What is this comment referring to, it sounds a bit scary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a lot of copy-pasting here from original binning_preparation.nf, forgot to remove this comment :) the comment is referring to line 47 in shortread_binning_preparation.nf if you are interest. I did not write this line of code though.

ch_minimap2_input_idx = ch_minimap2_input
.map { meta_idx, index, meta, reads -> [ meta_idx, index ] }

MINIMAP2_ASSEMBLY_ALIGN ( ch_minimap2_input_reads, ch_minimap2_input_idx, true, 'bai', false, false )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the true/false/falses? something should be parameterasable by the user?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first true: if the output should be in bam-format. I think this can be fixed
first false: If the output is set to paf, then a cigar string is generated. This only happens if output format is not bam, so does not make sense for it to be changed by user
second false: to make the cigar string backward compatible with older tools, but no need for that here.

subworkflows/local/shortread_assembly.nf Outdated Show resolved Hide resolved
workflows/mag.nf Outdated Show resolved Hide resolved
@muabnezor
Copy link
Contributor Author

thank you @jfy133. I'm away this week, but I'll try to find time to go through your comments asap.

@muabnezor
Copy link
Contributor Author

Thanks for the review @jfy133. Sorry I have been away this week, but now I think I addressed all of your comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add separate Nanopore input option
4 participants