Feature Request: Add SNP and INDEL modes to VariantFiltration similar to ApplyVQSR #5527

Open
sooheelee opened this issue Dec 17, 2018 · 5 comments


@sooheelee
Contributor

sooheelee commented Dec 17, 2018

Feature request

It would be great if VariantFiltration could hard filter with -mode SNP and -mode INDEL like ApplyVQSR. These modes do not currently exist in VariantFiltration.

Currently, the hard-filtering recommendation is to subset the VCF into separate SNP-only and INDEL-only VCFs and then to hard-filter each with VariantFiltration using different thresholds for each type. When using SelectVariants to subset to either SNP or INDEL type variants, we lose the MIXED-type sites. It would be great to enable researchers to hard-filter without losing such records, and without losing the combined information in a record when a multiallelic site is split into multiple biallelic records.
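For reference, the split-then-filter workflow described above can be sketched roughly as follows. This is a dry run that only prints the commands. The file names, filter names, and the `FS` thresholds are placeholders chosen for illustration, and `split_then_filter` is a hypothetical helper, not part of GATK.

```shell
#!/bin/sh
# Dry-run sketch of the current workflow: subset by variant type, then
# hard-filter each subset with its own threshold. Nothing is executed.
set -eu

INPUT="${INPUT:-cohort.vcf.gz}"   # hypothetical input callset

split_then_filter() {
  # $1 = variant type (SNP or INDEL)
  # $2 = JEXL filter expression (illustrative threshold)
  # $3 = filter name written to the FILTER column
  vtype="$1"; expr="$2"; name="$3"
  printf '%s\n' "gatk SelectVariants -V ${INPUT} -select-type ${vtype} -O ${vtype}.vcf.gz"
  printf '%s\n' "gatk VariantFiltration -V ${vtype}.vcf.gz -filter '${expr}' --filter-name ${name} -O ${vtype}.filtered.vcf.gz"
}

split_then_filter SNP   'FS > 60.0'  SNP_FS60
split_then_filter INDEL 'FS > 200.0' INDEL_FS200
```

Records that are neither purely SNP nor purely INDEL never reach either subset, which is the loss this request is about.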

The ability to apply filtering thresholds to different variant types in a VCF already exists in the codebase in ApplyVQSR. Here is an example command for ApplyVQSR.

gatk --java-options "-Xmx5g -Xms5g" \
ApplyVQSR \
-V cohort_excesshet.vcf.gz \
--recal-file cohort_indels.recal \
--tranches-file cohort_indels.tranches \
--truth-sensitivity-filter-level 99.7 \
--create-output-variant-index true \
-mode INDEL \
-O indel.recalibrated.vcf.gz

Tool(s) or class(es) involved

VariantFiltration


@sooheelee
Contributor Author

@droazen would this be possible? Thanks.

@sooheelee
Contributor Author

In my opinion, this feature has become pertinent for VariantFiltration because our callers now output spanning deletions starting with v4.0.9.0 (see #4963). When we separate out SNP type and INDEL type variants, any associated spanning deletions are lost and cannot be recovered. This loss of information is undesirable and it would be great if we could hard-filter a callset without having to separate out records of different variant types.

@sooheelee sooheelee changed the title Add SNP and INDEL modes to VariantFiltration similar to ApplyVQSR Feature Request: Add SNP and INDEL modes to VariantFiltration similar to ApplyVQSR Jan 11, 2019
@thierrygrange

I find this feature request highly interesting, and the feature would indeed be very useful. It does not seem to have been implemented yet. Is there a chance it will be one day? I join Soo Hee Lee in supporting her feature request.

@brepley

brepley commented Jul 8, 2021

I would also find this feature helpful in terms of streamlining a pipeline I am working on. Any updates on this feature request?

@meganshand
Contributor

I believe you can already do this with JEXL expressions. For example -filter 'vc.isSNP() && QD<4.0' --filter-name 'SNP_low_QD'
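Extending that suggestion, a single-pass command with one type-specific expression per variant type might look like the sketch below. It assumes the `vc` variable exposes htsjdk's `VariantContext` methods (`isSNP()`, `isIndel()`) inside JEXL, as in the example above. The file names, filter names, and `FS` thresholds are placeholders, and `run` and `filter_by_type` are hypothetical helpers. By default the script only prints the command; set `RUN=1` to execute it.

```shell
#!/bin/sh
# Sketch: hard-filter SNPs and INDELs in one VariantFiltration pass,
# without first splitting the VCF by variant type.
set -eu

run() {
  # Print the command by default; execute it only when RUN=1.
  if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

filter_by_type() {
  # $1 = input VCF, $2 = output VCF (both hypothetical paths)
  # Each -filter expression is paired with the --filter-name that follows it.
  run gatk VariantFiltration \
    -V "$1" \
    -filter 'vc.isSNP() && FS > 60.0' --filter-name SNP_FS60 \
    -filter 'vc.isIndel() && FS > 200.0' --filter-name INDEL_FS200 \
    -O "$2"
}

filter_by_type "${INPUT:-cohort.vcf.gz}" "${OUTPUT:-cohort.filtered.vcf.gz}"
```

A further expression for mixed sites (for example one using `vc.isMixed()`, which `VariantContext` also provides) could be added the same way. Which thresholds suit such sites is left open here.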
