WGSA annotation for noncoding variants in WGS studies #2587
Comments
Sergey;
Thanks Brad,
Sergey;
Thanks Brad! I had not noticed that the GERP conserved elements are already in the gemini bundle. Now I see! This works perfectly well for me, as I'm still on grch37 and a standalone bcbio installation. For grch38 I can only propose using the phastcons20way and phylop20way scores from the UCSC browser: Sergey
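To illustrate the idea of attaching a conservation score to a noncoding variant, here is a minimal sketch in Python. It assumes the scores have already been exported to bedGraph-style intervals (0-based, half-open, non-overlapping, sorted by start within each chromosome). The interval values and the names `INTERVALS`, `build_index` and `lookup_score` are hypothetical and are not part of bcbio, gemini or the UCSC tools. Real phastCons/phyloP tracks are normally distributed as bigWig files and would need a dedicated reader.

```python
from bisect import bisect_right

# Hypothetical bedGraph-style intervals: (chrom, start, end, score).
# Coordinates are 0-based, half-open; scores are made up for illustration.
INTERVALS = [
    ("chr1", 100, 200, 0.95),
    ("chr1", 300, 350, 0.10),
    ("chr2", 50, 60, 0.70),
]


def build_index(intervals):
    """Group intervals by chromosome, keeping a parallel list of starts."""
    index = {}
    for chrom, start, end, score in intervals:
        entry = index.setdefault(chrom, {"starts": [], "rows": []})
        entry["starts"].append(start)
        entry["rows"].append((start, end, score))
    return index


def lookup_score(index, chrom, pos):
    """Return the score covering a 1-based VCF position, or None if uncovered."""
    entry = index.get(chrom)
    if entry is None:
        return None
    pos0 = pos - 1  # VCF POS is 1-based; the intervals are 0-based
    # Find the last interval whose start is <= pos0.
    i = bisect_right(entry["starts"], pos0) - 1
    if i < 0:
        return None
    start, end, score = entry["rows"][i]
    return score if start <= pos0 < end else None


index = build_index(INTERVALS)
print(lookup_score(index, "chr1", 150))
print(lookup_score(index, "chr1", 250))
```

The binary search keeps each lookup logarithmic in the number of intervals per chromosome. Variants that fall outside every interval simply get no score.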
Thanks. @naumenko-sa, do you think this would still be useful?
Hello, bcbio community!
Thanks for the great framework!
Gnomad_genome frequencies help to prioritize variants in WGS studies.
However, it would be nice to have functional prediction and conservation scores for noncoding variants.
For now, many scores come from dbNSFP, but by definition this database covers only nonsynonymous (i.e. coding and not synonymous) variants and splice-site variants (it contains 83,189,732 records).
https://sites.google.com/site/jpopgen/dbNSFP
For noncoding variants the same group proposes using WGSA:
https://sites.google.com/site/jpopgen/wgsa
"For SNV-centric resources, WGSA integrated 12 sets of functional prediction scores (CADD, FATHMM-MKL, FATHMM-XF, Funseq, Funseq2, RegulomeDB, DANN, fitCons x 4, GenoCanyon, Eigen & Eigen-PC, GenoSkyline-Plus x 127, LINSIGHT), 9 conservation scores (bStatistic, GERP++, PhyloP x 3, phastCons x 3, SyPhy), allele frequencies from 5 large-scale re-sequencing studies (1000G, EP6500, ExAC, UK10K, gnomAD), variants in 4 disease related databases (ClinVar, COSMIC, GWAS_catalog, GRASP2), among others (see list of resources)."
Are there any plans to introduce WGSA to bcbio? The dataset is huge (1.4T, which is 2-3 times larger than most bcbio installations with human/mouse genomes), so a local installation of WGSA is probably not an option. But what about accessing it through Amazon Web Services? Does that look feasible (https://sites.google.com/site/jpopgen/wgsa/using-wgsa-via-aws)?
Thanks!
Sergey