Allele frequency can't be flipped for multi-allelic variants. #165
Comments
Hey!
I completely agree. This is something we are actively investigating in the lab, in particular the number of non-bi-allelic SNPs across different dbSNP builds. For example, see this issue. I do believe we will be heading towards keeping non-bi-allelic SNPs as the default, but this requires checking what effect it will have, for example on commonly used downstream analysis tools, which currently mostly expect bi-allelic SNPs only.
My advice to sensibly deal with this is to set ...
I'm open to suggestions if you think MSS could be modified to better deal with your issue, just let me know.
Alan.
Hi @Al-Murphy, thank you for the prompt reply. I agree that the problem of flipping allele frequencies for multi-allelics is not trivial. Three comments:
In this context, I believe that an easy solution for mungesumstats would be to add an option ...
best,
Hey,
Yes, I agree with this approach as long as it isn't set as the default. I have updated MSS to incorporate this (v1.9.16). Could you test it and let me know if it works as intended for you?
Cheers,
Closing because of inactivity. Reopen if the issue isn't resolved for you.
Alan.
Hi,
When running mungesumstats v1.9.6, I got the following error:
Error in check_allele_flip(sumstats_dt = sumstats_return$sumstats_dt, : Certain SNPs need to be flipped along with their effect columns and frequency column. However to flip the FRQ column, only bi-allelic SNPs can be considered. It is recommended to set bi_allelic_filter to TRUE so non-bi-allelic SNPs are removed. Otherwise, set allele_flip_frq to FALSE to not flip the FRQ column but note this could lead to incorrect FRQ values.
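To illustrate why the error restricts FRQ flipping to bi-allelic SNPs, here is a minimal Python sketch. It is not MungeSumstats code: the allele frequencies, the helper `flip_frq_biallelic`, and the assumption that FRQ refers to one allele of the reported pair are all hypothetical, chosen only to show the arithmetic.

```python
from math import isclose


def flip_frq_biallelic(frq):
    """Frequency of the other allele, valid only when the site has exactly two alleles."""
    return 1.0 - frq


# Hypothetical tri-allelic site: the frequencies of all alleles sum to 1.
site_freqs = {"A": 0.70, "C": 0.25, "T": 0.05}

# Assume a summary statistics row describes the A/C pair and reports FRQ for C.
frq_c = site_freqs["C"]

# Flipping as if the site were bi-allelic gives 1 - FRQ(C).
naive_flip = flip_frq_biallelic(frq_c)

# The real frequency of A is smaller, because T also exists at this site.
true_frq_a = site_freqs["A"]

# The naive flip overstates FRQ(A) by exactly the frequency of the third allele.
error = naive_flip - true_frq_a
print(f"naive flip: {naive_flip:.2f}, true FRQ(A): {true_frq_a:.2f}, error: {error:.2f}")
```

At a truly bi-allelic site the error term is zero, which is why `1 - FRQ` is safe there. At a multi-allelic site the error equals the combined frequency of the remaining alleles, so it stays small when those alleles are rare.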
With ever-increasing sample sizes, the majority of positions in the genome will have (rare) multi-allelic variants. It is theoretically possible that, with large enough sample sizes, all possible mutations for every single position in the genome will be detected and added to dbSNP. Thus, only keeping bi-allelics (as defined through dbSNP) is not really a viable option: more than half of any dataset, and eventually all of it, will be eliminated by such a filter.
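As a rough sketch of how such a filter shrinks a dataset, the Python snippet below counts the rows that survive a bi-allelic filter. The reference records and positions are made up, and "bi-allelic" is assumed to mean exactly one alternate allele listed in the reference for that position. This is not the MungeSumstats implementation.

```python
from collections import defaultdict

# Hypothetical reference records (chrom, pos, ref, alt), one per alternate allele.
reference = [
    ("1", 100, "A", "C"),
    ("1", 200, "G", "T"),
    ("1", 200, "G", "A"),  # second alternate allele: position 200 is multi-allelic
    ("1", 300, "C", "T"),
    ("1", 300, "C", "G"),  # second alternate allele: position 300 is multi-allelic
    ("1", 400, "T", "C"),
]

# Hypothetical summary statistics, reduced to their positions.
sumstats_positions = [("1", 100), ("1", 200), ("1", 300), ("1", 400)]

# Collect the distinct alternate alleles known at each position.
alts_by_site = defaultdict(set)
for chrom, pos, _ref, alt in reference:
    alts_by_site[(chrom, pos)].add(alt)


def is_bi_allelic(site):
    """True when the reference lists exactly one alternate allele at the site."""
    return len(alts_by_site.get(site, set())) == 1


kept = [site for site in sumstats_positions if is_bi_allelic(site)]
dropped_fraction = 1 - len(kept) / len(sumstats_positions)
print(f"kept {len(kept)} of {len(sumstats_positions)} rows, dropped {dropped_fraction:.0%}")
```

Every new alternate allele added to the reference can only move a position from the kept set to the dropped set, so the dropped fraction grows as the reference grows.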
In this context, I would like to know how the above error can be sensibly bypassed while flipping columns. Is the solution to do this procedure manually?
I would be glad for any info or feedback on the above issue/error.
best,
Thanos