-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New qual doesn't count spanning deletions as support for variant qual #4801
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you confirm that both those tests fail in master? I pasted them into my branch and testPresenceOfUnlikelySpanningDeletionDoesntAffectResults passes, but it's possible some changes I made in my branch are affecting the results.
|
If the second test passed in master then I don't consider it to be a good test of your fix. Can you make a test where the PLs with and without span del would lead to different QUALs in the old version? Maybe a case with a confident deletion in sample1 and a low quality SNP in sample2 that causes a spanning deletion in sample1. You could probably do this by modifying your first test by adding a sample that has A or B called with low GQ with respect to the reference. |
The point of the second test is to make sure that the new code path involving the spanning deletion has no effect if the spanning deletion has low likelihood. You could imagine that I introduced a bug where you lose sensitivity whenever a spanning deletion appears in the alleles list even if there's very little evidence for it, just because its potential existence sends the code down some new, incorrect, logic. I think Now, if you think a multi-sample case would be useful that sounds good to me. However, a "confident deletion in sample1 and a low quality SNP in sample2" would not look very different since the deletion would only exist in an upstream VC, not the VC being tested, so basically we would have sample1 = REF/SPAN_DEL, sample2 = REF/SNP with low quality. The current test is the same thing with just sample1. Is this what you had in mind? |
Let's be rigorous and use two samples so that you're testing a case that could actually occur in the wild. It should be easy enough to add sample2 with PLs like [10,0,40,100,70,300]. Then can you add a comment about where the "magic number" in the assert statement comes from? Or maybe to be super duper rigorous you could compare the QUAL from that variant to the QUAL for sample3 and sample4 where 3 and 4 have the same PLs as 1 and 2, but without the spanning deletion. |
@ldgauthier I did all of those things. Note that to be precise, when you remove the spanning deletion the "het" REF / SPAN_DEL must be treated as haploid, which the new tests do. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beautiful new tests. I completely agree with the logic about the span-del het being like a haploid ref.
Codecov Report
@@ Coverage Diff @@
## master #4801 +/- ##
===============================================
+ Coverage 80.131% 80.479% +0.348%
- Complexity 17488 18032 +544
===============================================
Files 1085 1089 +4
Lines 63245 65070 +1825
Branches 10200 10630 +430
===============================================
+ Hits 50679 52368 +1689
- Misses 8579 8632 +53
- Partials 3987 4070 +83
|
@davidbenjamin I kicked Travis and now tests are green so you can merge. |
@ldgauthier The new qual was not doing the right thing for spanning deletions. This fixes it. Would you mind reviewing? And would you like me to re-run those GVCFs just in case before going ahead with #4614?