combine multiple donor vcf #33
Comments
Hi, thanks for the question. In your case, as you have the genotype for each donor, you can specify it in the vireo command line by adding Yuanhua
Hi Yuanhua, thanks for your response. Thanks very much.
Thanks for clarifying. It sounds fine to genotype the donors using scRNA data, and you can still use multiple scRNA BAM files as bulk samples to get a combined donor VCF file. Your second question is more worrying, as the SNPs covered in scATAC may not be sufficient if the sequencing coverage is low. Usually, people find that scATAC coverage is adequate for demultiplexing. In your case, you could perform de novo deconvolution without a donor VCF, so that you can use most of the SNPs in your scATAC data, and then compare the result back to the scRNA-based donors with the notebook mentioned above. Y
Thanks Yuanhua. Could you also please suggest how to subset the 1000 Genomes VCF file you provide to remove chrX from it? I tried using vcftools --not-chr (not sure if this worked well). Thanks very much.
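One way to check or reproduce the chrX removal without relying on vcftools is a plain-text filter. The sketch below is a minimal example, not the maintainers' method: `drop_chrom` is a hypothetical helper name, the file names are placeholders, and it assumes the contig is literally named `chrX` (it may be `X` in other builds) and that `bgzip` is available for recompression. Header lines, including any `##contig` entry for chrX, are left untouched.

```shell
# drop_chrom CHROM
# Hypothetical helper: reads VCF text on stdin and writes it to stdout,
# keeping all header lines and every record whose CHROM column is not CHROM.
drop_chrom() {
    awk -v chrom="$1" 'BEGIN { FS = "\t" } /^#/ || $1 != chrom'
}

# Example usage (file names are placeholders):
#   gzip -dc input.vcf.gz | drop_chrom chrX | bgzip > no_chrX.vcf.gz
# Quick check of which chromosomes remain:
#   gzip -dc no_chrX.vcf.gz | grep -v '^#' | cut -f1 | sort -u
```

Listing the remaining chromosome names afterwards is also a simple way to confirm whether an earlier `vcftools --not-chr` run did what was intended.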
Hi, I actually encountered another problem. I am trying to merge a few GT_donors.vireo.vcf.gz files for some analysis. For that, I used bcftools reheader to rename the samples in each VCF, which works fine. However, bcftools merge needs indexed files, and bcftools index throws an error. I also tried bcftools sort to position-sort the VCF, which does not work either. How can I properly merge multiple GT_donors.vireo.vcf.gz files?
Thanks for sharing this. Unfortunately, the potential incompatibility is still an outstanding issue and may need further testing in future releases. For the moment, you could try running reheader first and then sort and/or index. YH
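The reheader, sort, index order suggested here could be sketched as below. This is only an illustration under stated assumptions: `make_rename_map` and `prepare_donor_vcf` are hypothetical helper names, the pool prefix is an arbitrary naming choice, and nothing in this thread confirms that these steps resolve the incompatibility for every GT_donors.vireo.vcf.gz file.

```shell
# make_rename_map PREFIX
# Hypothetical helper: reads sample names on stdin (one per line) and writes
# "old_name new_name" pairs, the format read by `bcftools reheader -s`.
make_rename_map() {
    awk -v prefix="$1" 'NF { print $1, prefix "_" $1 }'
}

# prepare_donor_vcf IN.vcf.gz PREFIX OUT.vcf.gz
# Renames the samples, then position-sorts and indexes the result.
# Intermediate files are written next to OUT.vcf.gz.
prepare_donor_vcf() {
    in_vcf=$1; prefix=$2; out_vcf=$3
    bcftools query -l "$in_vcf" | make_rename_map "$prefix" > "$out_vcf.rename.txt" &&
    bcftools reheader -s "$out_vcf.rename.txt" -o "$out_vcf.tmp.vcf.gz" "$in_vcf" &&
    bcftools sort -Oz -o "$out_vcf" "$out_vcf.tmp.vcf.gz" &&
    bcftools index -t "$out_vcf" &&
    rm -f "$out_vcf.tmp.vcf.gz"
}

# Example usage (file and pool names are placeholders):
#   prepare_donor_vcf poolA/GT_donors.vireo.vcf.gz poolA poolA.donors.vcf.gz
#   prepare_donor_vcf poolB/GT_donors.vireo.vcf.gz poolB poolB.donors.vcf.gz
```

Prefixing the sample names keeps them distinct across pools, so the prepared files do not collide on sample name when they are later given to `bcftools merge`.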
Hello, The files eventually merge anyway somehow; however, when I use vireoSNP.vcf.match_VCF_samples(), it gives this error: I tried sorting the merged file with bcftools sort. That gave me the following error: Writing to /tmp/bcftools-sort.woMldC I think the file does not get created properly. Please help. Thanks.
It looks like the ref alleles differ between samples. The ref allele from the cellsnp-vireo pipeline should be the same as in 1000 Genomes. I am not sure what the ref allele is in your other samples. Otherwise, you may consider
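To see which sites actually disagree on the ref allele, a small diagnostic like the following could help. It is a sketch only: `ref_mismatches` is a hypothetical helper name, the file names are placeholders, and it compares sites by exact CHROM and POS match without any normalization of the records.

```shell
# ref_mismatches A.tsv B.tsv
# Hypothetical helper: each input has tab-separated CHROM, POS, REF columns.
# Prints CHROM, POS, REF_in_A, REF_in_B for sites present in both files
# whose REF alleles differ.
ref_mismatches() {
    awk 'BEGIN { FS = OFS = "\t" }
         { key = $1 FS $2 }
         FNR == NR { ref[key] = $3; next }
         key in ref && ref[key] != $3 { print $1, $2, ref[key], $3 }' "$1" "$2"
}

# The three-column tables can be produced with, for example:
#   bcftools query -f '%CHROM\t%POS\t%REF\n' sampleA.vcf.gz > A.tsv
#   bcftools query -f '%CHROM\t%POS\t%REF\n' sampleB.vcf.gz > B.tsv
#   ref_mismatches A.tsv B.tsv | head
```

An empty result means every shared site has the same ref allele in both files; any printed rows are candidates for the conflict described above.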
Hello,
I have 2 pooled snATAC donors and their individual bulk RNA-seq data. To demultiplex, my strategy was to call common variant SNPs for the two individuals separately using CellSNP, followed by using their cellsnp.cells.vcf as donor-vcf to demultiplex pooled samples. However, I am unsure how to combine the two donor vcf files generated from cellsnp to provide them as --donor vcf files for vireo.
What I've tried:
cellsnp-lite $DIR_snRNA_BAM/$value_*.bam -I $value -R genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf.gz --minMAF 0.1 --minCOUNT 100 -p 32 --cellTAG None --UMItag None --genotype --gzip --minMAPQ 30 --inclFLAG 2
bcftools index Sample1.cells.vcf.gz
bcftools index Sample2.cells.vcf.gz
bcftools merge - Sample1.cells.vcf.gz Sample2.cells.vcf.gz -O combined.vcf -Oz.
Does this make sense?
Also, how can I match the demultiplexed donors to their original donor.vcf file?
And if the information for only one of the two donors is available, do you suggest using --forcelearnGT?
Thanks
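For reference, the index and merge steps attempted above could be written as follows. This is a hedged sketch rather than a confirmed recipe: `merge_donor_vcfs` is a hypothetical wrapper name, the file names are those from the post, and it assumes each per-donor VCF is BGZF-compressed and position-sorted, which `bcftools index` requires. In bcftools, `-O` selects the output type (`z` for compressed VCF) and `-o` names the output file.

```shell
# merge_donor_vcfs OUT.vcf.gz IN1.vcf.gz IN2.vcf.gz [...]
# Hypothetical wrapper: indexes every input, merges them into one
# multi-sample VCF, and indexes the merged file.
merge_donor_vcfs() {
    out_vcf=$1; shift
    for vcf in "$@"; do
        bcftools index -f -t "$vcf" || return 1
    done
    bcftools merge -Oz -o "$out_vcf" "$@" &&
    bcftools index -t "$out_vcf"
}

# Example usage:
#   merge_donor_vcfs combined.vcf.gz Sample1.cells.vcf.gz Sample2.cells.vcf.gz
#   bcftools query -l combined.vcf.gz   # lists the sample names in the merged file
```

Checking the sample list of the merged file is a quick sanity test that both donors are present before the file is used as a donor VCF.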