(SV) slim down REF column for CPX variants #4970
Codecov Report
@@ Coverage Diff @@
## master #4970 +/- ##
================================================
+ Coverage 60.198% 80.791% +20.593%
- Complexity 12781 17962 +5181
================================================
Files 1095 1095
Lines 64604 64594 -10
Branches 10394 10392 -2
================================================
+ Hits 38890 52186 +13296
+ Misses 21482 8384 -13098
+ Partials 4232 4024 -208
final Allele anchorBaseRefAlleleRear = Allele.create(refBases[refBases.length - 2], true);
final SimpleInterval startAndStop = makeOneBpInterval(complexVC.getContig(), complexVC.getEnd());
final VariantContextBuilder rearIns = getInsFromOneEnd(false, firstRefSegmentIdx, startAndStop, anchorBaseRefAlleleRear, refSegmentLengths, altArrangement, true);
final int pos = complexVC.getEnd();
Why is it the end of the VC here and not the start?
It is the end because this code extracts the rear insertion, which is anchored at the end of the complex variant.
Example:
SEGMENTS=chr1:100-200,chr1:200-300,chr1:300-400,chr1:400-500
ALT_ARRANGEMENT=chrOther:100-158,1,-3,4,UINS-433
Here we would emit four records:
a front insertion of 59 bases at position chr1:100, a deletion of chr1:200-300, an inversion of chr1:300-400, and a rear insertion of 433 bases at position chr1:500.
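The decomposition in the example above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions, not the GATK implementation: the class and method names are hypothetical, only insertions at the very front or rear of the arrangement are handled, and a segment counts as deleted when its index appears in neither orientation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the GATK code under review.
final class CpxArrangementSketch {

    // Length of an interval written as "contig:start-end" (1-based, inclusive).
    static int intervalLength(String interval) {
        String[] range = interval.substring(interval.lastIndexOf(':') + 1).split("-");
        return Integer.parseInt(range[1]) - Integer.parseInt(range[0]) + 1;
    }

    // "UINS-433" is an unmapped insertion of 433 bases; otherwise the token is an interval.
    static int insertionLength(String token) {
        return token.startsWith("UINS-")
                ? Integer.parseInt(token.substring("UINS-".length()))
                : intervalLength(token);
    }

    // Tokens such as "1" or "-3" refer to reference segments (negative means inverted).
    static boolean isSegmentIndex(String token) {
        return token.matches("-?\\d+");
    }

    static List<String> describe(String[] segments, String[] altArrangement) {
        List<String> events = new ArrayList<>();
        String first = segments[0];
        String last = segments[segments.length - 1];
        String contig = first.substring(0, first.lastIndexOf(':'));
        String start = first.substring(first.lastIndexOf(':') + 1).split("-")[0];
        String end = last.substring(last.lastIndexOf(':') + 1).split("-")[1];

        // Front insertion: anchored at the start of the complex variant.
        if (!isSegmentIndex(altArrangement[0])) {
            events.add("INS len=" + insertionLength(altArrangement[0]) + " at " + contig + ":" + start);
        }
        // Each reference segment is either kept, inverted, or absent (deleted).
        for (int i = 1; i <= segments.length; i++) {
            boolean forward = false;
            boolean inverted = false;
            for (String token : altArrangement) {
                if (token.equals(String.valueOf(i))) forward = true;
                if (token.equals("-" + i)) inverted = true;
            }
            if (inverted) {
                events.add("INV " + segments[i - 1]);
            } else if (!forward) {
                events.add("DEL " + segments[i - 1]);
            }
        }
        // Rear insertion: anchored at the END of the complex variant, hence getEnd().
        String lastToken = altArrangement[altArrangement.length - 1];
        if (altArrangement.length > 1 && !isSegmentIndex(lastToken)) {
            events.add("INS len=" + insertionLength(lastToken) + " at " + contig + ":" + end);
        }
        return events;
    }

    public static void main(String[] args) {
        String[] segments = "chr1:100-200,chr1:200-300,chr1:300-400,chr1:400-500".split(",");
        String[] alt = "chrOther:100-158,1,-3,4,UINS-433".split(",");
        describe(segments, alt).forEach(System.out::println);
    }
}
```

Running `main` on the example lists the front insertion first and the rear insertion last, with the rear insertion positioned at the end coordinate of the last segment.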
Looks fine to me, just one little question about the start position of the variant intervals.
* previously all bases of the affected region were written to REF, leading to a bloated VCF; now only the anchor base is written
* also fixes the downstream CPX variant re-interpreter
31ac4af to 04a76a2
I previously made the poor decision to output all bases of the affected reference region in the REF column, which significantly increased the VCF file size.
Now we follow the convention for DEL variants, i.e. we put only the reference base at POS in the REF column.
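To make the size difference concrete, here is a minimal Java sketch contrasting the two REF choices. It is not the PR's code: the class and method names are hypothetical, the `<CPX>` symbolic ALT allele and the `END` INFO key are assumptions about how the record is written, and `refBases[0]` is assumed to be the base at POS.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; names and record layout are assumptions.
final class AnchorBaseRefSketch {

    // Old behaviour as described: REF holds every base of the affected region.
    static String fullRegionRef(byte[] refBases) {
        return new String(refBases, StandardCharsets.US_ASCII);
    }

    // New behaviour as described: REF holds only the base at POS (assumed to be refBases[0]).
    static String anchorBaseRef(byte[] refBases) {
        return new String(refBases, 0, 1, StandardCharsets.US_ASCII);
    }

    // Builds the eight fixed VCF columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO.
    static String vcfLine(String contig, int pos, int end, String ref) {
        return String.join("\t", contig, String.valueOf(pos), ".", ref, "<CPX>", ".", ".", "END=" + end);
    }

    public static void main(String[] args) {
        byte[] refBases = "ACGTACGTAC".getBytes(StandardCharsets.US_ASCII);
        System.out.println(vcfLine("chr1", 100, 109, fullRegionRef(refBases)));
        System.out.println(vcfLine("chr1", 100, 109, anchorBaseRef(refBases)));
    }
}
```

With the anchor-base form, the record length no longer grows with the size of the affected region, because the span is carried by the end coordinate rather than by the REF string.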