HaplotypeCaller GGA mode crashes with certain spanning deletions #5336
Comments
I've been taking a look this morning but am so far unable to reproduce the problem. I tried adding lines to the integration test file to mimic having a SNP at the last base of a spanning deletion as in the bug report:
But I'm not hitting this error. @gmagoon any chance you could provide any more information that would help us reproduce this? The exact command line you're using might help. Do you still get the error if you run with an alleles file consisting only of one of the pairs of given allele lines listed above?
Good question @cwhelan
On the other hand, I confirmed that the case with the original, full I don't think there is anything particularly special about the command line I'm using, except maybe the I will try to play around with the alleles input to get a small set where I can reproduce the issue...
OK, I think I've narrowed down the problem. Curiously, it seems to be related to the inclusion of genotypes in the
Good catch @gmagoon. As part of this fix we have code that replaces deletion alleles with star alleles in overlapping variants before genotyping; evidently, if the input variants have preexisting genotypes, the original alleles are still attached to the variants and are making it downstream through that filter. I'll try to get a fix in for this shortly. Thanks for your help in narrowing down the cause.
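To make the failure mode described above concrete, here is a minimal Python sketch. It is not GATK code: `ToyVariant`, `star_out_deletions`, and `stale_genotype_alleles` are hypothetical names invented for illustration, and the model only assumes what the comment states, namely that deletion alleles get swapped for `*` while any preexisting genotypes keep pointing at the original alleles.

```python
from dataclasses import dataclass, replace

STAR = "*"  # VCF symbol for an allele removed by an upstream (spanning) deletion


@dataclass(frozen=True)
class ToyVariant:
    """Hypothetical stand-in for a variant record; not a GATK class."""
    pos: int
    ref: str
    alts: tuple
    genotypes: tuple = ()  # each genotype is a tuple of allele strings


def star_out_deletions(v, drop_genotypes):
    """Replace every deletion ALT (shorter than REF) with the star allele.

    With drop_genotypes=False the old genotypes are carried along untouched,
    which is the situation the comment above describes.
    """
    new_alts = tuple(STAR if len(a) < len(v.ref) else a for a in v.alts)
    genotypes = () if drop_genotypes else v.genotypes
    return replace(v, alts=new_alts, genotypes=genotypes)


def stale_genotype_alleles(v):
    """Return genotype alleles that are no longer among the record's alleles."""
    known = {v.ref, *v.alts}
    return sorted({a for gt in v.genotypes for a in gt if a not in known})


# Made-up record: a 3-base deletion with one heterozygous genotype attached.
original = ToyVariant(pos=100, ref="ACGT", alts=("A",), genotypes=(("ACGT", "A"),))

kept = star_out_deletions(original, drop_genotypes=False)
dropped = star_out_deletions(original, drop_genotypes=True)

print(kept.alts, stale_genotype_alleles(kept))        # genotypes still name "A"
print(dropped.alts, stale_genotype_alleles(dropped))  # nothing stale remains
```

In this toy model, dropping the genotypes (or testing with a sites-only alleles file) removes the stale deletion allele; whether the actual fix takes that route is not stated in this thread.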
Great, thanks very much @cwhelan...that was fast!
Using GENOTYPE_GIVEN_ALLELES ("GGA") mode with HaplotypeCaller, I've encountered a couple instances of crashes that I've traced to spanning deletions (of the type considered in #4963).
One case involved the following in the --alleles input:
and it crashed with:
A second case involved --alleles input:
and crashed similarly, with:
Based on the discussion around #4963 and the test VCF, I gather that this is intended to work without error.
I was trying to figure out how these cases differed from the spanning deletion in the aforementioned test VCF. One thing I noticed was that these two problematic cases have the SNP at the very last base of the spanning deletion. I'm just speculating here, but maybe it is related to an off-by-one bug of some sort?
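As a way to check the "last base of the spanning deletion" observation on other records, here is a small Python sketch. It relies only on the VCF convention that a record's REF allele covers positions POS through POS + len(REF) - 1; the function names and the coordinates in the example are made up for illustration and are not taken from the crashing input.

```python
def deletion_span(pos, ref, alt):
    """Return (first, last) 1-based positions covered by a deletion's REF allele.

    Returns None when ALT is not shorter than REF, i.e. the record is not a
    simple deletion.
    """
    if len(alt) >= len(ref):
        return None
    return pos, pos + len(ref) - 1


def snp_on_last_base(del_pos, del_ref, del_alt, snp_pos):
    """True if a SNP at snp_pos sits on the final base spanned by the deletion."""
    span = deletion_span(del_pos, del_ref, del_alt)
    return span is not None and snp_pos == span[1]


# Made-up example: deletion ACGT -> A at position 100 spans 100..103.
print(deletion_span(100, "ACGT", "A"))          # (100, 103)
print(snp_on_last_base(100, "ACGT", "A", 103))  # True: SNP on the last base
print(snp_on_last_base(100, "ACGT", "A", 102))  # False: SNP inside the span
```

Running a check like this over the alleles file would show whether every crashing site shares the last-base pattern or whether the pattern is coincidental.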
I am testing with v. 4.0.9.0.
I also tried with v. 4.0.5.1 which does not crash, but rather prints the warnings discussed in #4963:
00:02:10.995 WARN HaplotypeCallerEngine - Multiple valid VCF records detected in the alleles input file at site 22:16137302-16137302, only considering the first record
00:03:08.220 WARN HaplotypeCallerEngine - Multiple valid VCF records detected in the alleles input file at site 22:16464051-16464051, only considering the first record