-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix to Enable SelectVariants to drop sites with * allele as only ALT #5129
Conversation
Working on completing a test in SelectVariantsIntegrationTest
Working on completing a test in SelectVariantsIntegrationTest Work in progress
Work in progress Including test to ensure alleles that are only monomorphic, in respect to samples included, will have lines deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix Katie! I like the expected behavior, but I think one of your tests is not what you intended.
@Test | ||
public void testRemoveMonomorphAfterSNSelect() throws IOException { | ||
final String testFile = getToolTestDataDir() + "spanning_deletion.vcf"; | ||
final String samplesFile = "NA1"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really this is a sample name rather than a samplesFile so can you change the variable name?
baseTestString(" -sn " + samplesFile + " --remove-unused-alternates --exclude-non-variants", testFile), | ||
Collections.singletonList(getToolTestDataDir() + "expected/" + "testSelectVariants_RemoveSingleSpanDelAlleleNoSpanDel.vcf") | ||
); | ||
spec.executeTest("test will not remove variant line where '*' is only ALT allele because none exist" + testFile, this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how this is different from the above test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you want to use a different input file where the * is in a heterozygous genotype?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake. They are the same. Turns out the test technically tests two things at once, I deleted the second instance.
@@ -569,8 +569,8 @@ public void apply(VariantContext vc, ReadsContext readsContext, ReferenceContext | |||
} | |||
final VariantContext filteredGenotypeToNocall = setFilteredGenotypesToNocall ? builder.make(): sub; | |||
|
|||
// Not excluding non-variants or subsetted polymorphic variants AND including filtered loci or subsetted variant is not filtered | |||
if ((!XLnonVariants || filteredGenotypeToNocall.isPolymorphicInSamples()) && (!XLfiltered || !filteredGenotypeToNocall.isFiltered())) { | |||
// Not excluding non-variants OR (subsetted polymorphic variants AND not spanning deletion) AND (including filtered loci OR subsetted variant) is not filtered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still really hard to read. Maybe give examples of things that will pass here?
@ldgauthier Thanks for the comments! I made the changes you requested and deleted the redundant test. Let me know if you have any more edits in mind. |
Codecov Report
@@ Coverage Diff @@
## master #5129 +/- ##
===============================================
+ Coverage 86.655% 86.668% +0.013%
- Complexity 29046 30619 +1573
===============================================
Files 1808 1834 +26
Lines 134686 140269 +5583
Branches 14938 15772 +834
===============================================
+ Hits 116712 121568 +4856
- Misses 12559 13160 +601
- Partials 5415 5541 +126
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
@vdauwera Is Robert on github? Is this (i.e. expected behavior, arg name review, etc.) going to be one of his responsibilities going forward? |
@sooheelee Could you review? Thanks! |
Sure @kvinter1, I can review. |
I just tested your branch @kvinter1 with some data at hand with the following command:
And when I grep for the spanning deletion with
I assume testing functionality is what you meant for review? In terms of updating the JavaDoc portion, is this something you can write a draft of, e.g. a summary of the functionality implemented. I noticed there were no changes to the doc portions. Your synopsis is something I can then review for clarity and style. P.S. I should mention otherwise your branch removed 511 such unwanted records. Nice. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left in the conversation thread.
@sooheelee can you share that input file with Katie so she can add the het-non-ref case to the integration tests? |
It is publically available in the GATK workshop bundle but here it is for convenience: |
Just learned |
Attempt to eliminate lines in vcf with only * as ALT when using SelectVariants (change to SelectVariants). Also includes addition of new tests in SelectVariantsIntegrationTest to ensure this is true over many variations of arguments / assignments.