Explicit subsampling step in rnaseq pipeline #1097

ewallace · 2023-10-17T14:11:33Z

Description of feature

Subsampling sequencing data before running a pipeline is good practice for testing configurations and failing fast. Letting the user subsample the input data before running the entire pipeline would provide a quicker, in-line way to validate that the pipeline runs, troubleshoot, and check inputs.

I would like to request optional subsampling as a feature; I think it will save a lot of people a lot of time. Yes, users can manually subsample data and then feed it into the pipeline, but that seems to go against the Nextflow spirit. Having this option inline would let users test-run the pipeline with --subsample-reads 100000, test everything within minutes, and then edit that one parameter to run on all the input data.

Probably it's achievable with fq subsample.
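
To make the request concrete, here is a minimal Python sketch of what "keep N random reads" could mean. It is not how fq or the pipeline implements subsampling; it only illustrates the idea with standard reservoir sampling. The names read_fastq and subsample_reads are hypothetical, and the input is assumed to be uncompressed FASTQ with exactly four lines per record.

```python
import random
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: header, sequence, separator, quality (assumes 4-line records).
Record = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield 4-line FASTQ records from an iterable of text lines."""
    buf: List[str] = []
    for line in lines:
        buf.append(line.rstrip("\n"))
        if len(buf) == 4:
            yield (buf[0], buf[1], buf[2], buf[3])
            buf = []
    if buf:
        raise ValueError("truncated FASTQ record at end of input")


def subsample_reads(records: Iterable[Record], n: int, seed: int = 0) -> List[Record]:
    """Return up to n records chosen uniformly at random (reservoir sampling).

    The input is streamed once, so memory use is bounded by n records.
    """
    rng = random.Random(seed)
    reservoir: List[Record] = []
    for i, rec in enumerate(records):
        if i < n:
            # Fill the reservoir with the first n records.
            reservoir.append(rec)
        else:
            # Replace a random slot with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = rec
    return reservoir
```

In this sketch the choice of which positions to keep depends only on the seed and the record index, so running it with the same seed on R1 and R2 files of equal length keeps mates paired. With fewer than n records in the input, everything is returned unchanged.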

Note that the current (v3.12.0) "subsample" step does not do that; see issue #1095.

Issue #1096 suggests a different workaround, but only when using fastp for trimming.

drpatelh · 2024-05-29T10:57:48Z

Could be solved by #1096

ewallace added the enhancement label Oct 17, 2023

drpatelh added this to the 3.15.0 milestone May 13, 2024

pinin4fjords modified the milestones: 3.15.0, 3.16.0 May 29, 2024
