Skip MarkDuplicates when UMIs are used #891
Comments
@MatthiasZepper, what are your thoughts on this? The easiest option is to never run MarkDuplicates when
I did run some comparative analyses, but mostly by manually writing bash scripts, because I didn't trust myself to fully understand all the minute details of the pipeline. I get why the channel name
Since both,
Biologically, there is indeed little use in running MarkDuplicates after umi-tools dedup, although the two tools differ slightly in their strategies. Lacking UMI information, MarkDuplicates can't differentiate between biological and technical duplication, but it can specifically spot optical duplicates by their position when provided with appropriate parameters for the flow cell type and instrument via
In summary: weak agreement. I do not really see a use case for running both tools, and I agree that most users intuitively expect an either-or scenario between the two. On the other hand, putting that additional entry in the
Yes, if we had explicit names for all of these channels, tracing back the original input/output channels could get just as complicated. The current approach easily allows us to add or remove aligners or other processes with minimal effort. It's a double-edged sword, but I see where you are coming from.
Ok. I will hard-code the option to skip Picard MarkDuplicates if UMIs are present. It can always be run outside of the pipeline if required in edge-case scenarios.
Fixed in #911
Objection, your honor! dupRadar needs duplicates to be marked as a preprocessing step!
Description of feature
Hi,
I would suggest disabling Picard MarkDuplicates when UMIs are used for deduplication. For example, `--skip_markduplicates` could be enabled by default if `--with_umi` is also enabled.
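The suggested behaviour can be sketched as a small decision function. This is only an illustration in Python, not the pipeline's actual implementation: the `Params` class and `should_run_markduplicates` are hypothetical names, and only the two flags `--with_umi` and `--skip_markduplicates` come from this issue.

```python
from dataclasses import dataclass


@dataclass
class Params:
    """Hypothetical stand-in for the two pipeline flags discussed here."""
    with_umi: bool = False             # mirrors --with_umi
    skip_markduplicates: bool = False  # mirrors --skip_markduplicates


def should_run_markduplicates(params: Params) -> bool:
    """Return True if Picard MarkDuplicates should run (illustrative only)."""
    # An explicit skip request always wins.
    if params.skip_markduplicates:
        return False
    # Assumption from this issue: when UMIs are used for deduplication,
    # MarkDuplicates is skipped by default.
    return not params.with_umi


if __name__ == "__main__":
    for p in (Params(), Params(with_umi=True), Params(skip_markduplicates=True)):
        print(p, "->", should_run_markduplicates(p))
```

As noted in the comments, dupRadar expects duplicates to be marked beforehand, so a real implementation would also have to decide what happens to that step when MarkDuplicates is skipped.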