Negative values in NM field #130
Comments
@sergiigladchuk, thank you for reporting the issue! I reproduced the situation on your data. At first glance, it happens when a read has indels in its CIGAR but NM = 0. The mismatch value for a variant is currently calculated as mismatch = NM - I - D (lh3/minimap2#25), so it is negative in this case. Are you sure that the NM tag values in the BAM file are correct? I validated the file with Picard ValidateSamFile and it reported no errors. However, when I ran Picard SetNmMdAndUqTags, which can fix the NM value against a specific reference, the NM fields of the reads that cause this error were changed to larger values. We will also look closely at this situation and analyze whether this is an issue.
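To make the arithmetic concrete, here is a minimal Python sketch of the formula quoted above. It is not the VarDict code: the function names are hypothetical, and it assumes that I and D mean the total number of inserted and deleted bases in the read's CIGAR string.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_bases(cigar: str) -> int:
    """Total number of inserted plus deleted bases in a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "ID")


def mismatches(nm: int, cigar: str) -> int:
    """Mismatch count as NM - I - D (assumption: I and D are base counts)."""
    return nm - indel_bases(cigar)


# A consistent NM tag counts the 2 inserted bases plus 1 mismatch.
print(mismatches(3, "50M2I48M"))  # 1
# NM = 0 on a read with a 2-base insertion gives a negative result.
print(mismatches(0, "50M2I48M"))  # -2
```

Since NM is defined as the edit distance to the reference, a correct tag should be at least I + D, so a negative result points at an inconsistent NM value rather than at the subtraction itself.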
Thank you, @PolinaBevad, for the clarification.
@sergiigladchuk, hello! In both the VarDictJava and Perl versions, we now adjust the NM field to zero in such cases, to avoid negative values in the output when they appear due to alignment problems. Release 1.5.7 (https://github.com/AstraZeneca-NGS/VarDictJava/releases/tag/v1.5.7) contains this fix. I will close this issue; please re-open it if you have further questions on this topic.
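The idea behind the adjustment can be sketched as below. This is only an illustration in Python, not the code from release 1.5.7: the helper names and the sample reads are hypothetical, and it assumes the clamping is applied per read before the mean is taken.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clamped_mismatches(nm: int, cigar: str) -> int:
    """NM minus indel bases, floored at zero for inconsistent NM tags."""
    indels = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "ID")
    return max(0, nm - indels)


def mean_mismatches(reads) -> float:
    """Mean clamped mismatch count over (NM, CIGAR) pairs."""
    values = [clamped_mismatches(nm, cigar) for nm, cigar in reads]
    return sum(values) / len(values) if values else 0.0


# Hypothetical reads: the first one has NM = 0 despite a 2-base insertion.
reads = [(0, "50M2I48M"), (1, "100M"), (4, "30M3D70M")]
print(mean_mismatches(reads))
```

With the floor in place, the inconsistent first read contributes 0 instead of -2, so the mean cannot drop below zero.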
Good day,
I have noticed that, apart from the issue with DP being less than VD (which leads to strange AF calculations), which I hope will be fixed soon, the NM (mean mismatches in reads) field is getting negative values. In our RNA-seq data, about 2% of variants have a negative NM. Please fix this issue, since we would like to use this field for false-positive filtering.
Another observation: very few negative NM fields are observed in variants produced from DNA-seq data, so I would say that this issue is essentially RNA-seq specific and might be connected to #66.
Here is a small test case, which produces 14 variants with negative NM out of 35 variants called from the attached .bam file.
Command used:
Examples of variants with negative NM:
test_case.zip
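For anyone who wants to check their own calls for the same symptom, here is a small Python sketch that lists records with a negative NM. It assumes VCF-formatted output in which the INFO column carries an `NM=` key; the function names and the example records are made up for illustration and are not taken from the attached test case.

```python
def info_value(info: str, key: str):
    """Return the value of one key in a VCF INFO string, or None if absent."""
    for field in info.split(";"):
        name, _, value = field.partition("=")
        if name == key:
            return value
    return None


def negative_nm_records(vcf_lines):
    """Collect (chrom, pos, NM) for records whose INFO NM value is below zero."""
    hits = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        columns = line.rstrip("\n").split("\t")
        nm = info_value(columns[7], "NM")  # INFO is the 8th VCF column
        if nm is not None and float(nm) < 0:
            hits.append((columns[0], int(columns[1]), float(nm)))
    return hits


# Made-up records, only to show the expected input shape.
example = [
    "##fileformat=VCFv4.2",
    "chr1\t100\t.\tA\tT\t50\tPASS\tDP=40;VD=10;AF=0.25;NM=-0.5",
    "chr1\t200\t.\tG\tC\t60\tPASS\tDP=30;VD=6;AF=0.2;NM=1.2",
]
print(negative_nm_records(example))  # [('chr1', 100, -0.5)]
```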