Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discrepancies in R2 and LD compared to bcftools prune #3

Open
Knight-JChev opened this issue Jun 26, 2024 · 3 comments
Open

Discrepancies in R2 and LD compared to bcftools prune #3

Knight-JChev opened this issue Jun 26, 2024 · 3 comments

Comments

@Knight-JChev
Copy link

Knight-JChev commented Jun 26, 2024

Dear all,

So I've tried using LDBlockShow on a QTL dataset to visualize the linkage blocks between biallelic SNPs.
I tested to observe the blocks before and after pruning to understand how the pruning took place and what it merged.

So I compared the R² and D' value obtained from bcftools +prune :
vcf_test.txt
vcf_test_annotated.txt

bcftools +prune -w 1 -a r2,LD vcf_test.vcf -O v -o vcf_test_annotated.vcf

and from LDBlockShow :
LDBlockShow -InVCF vcf_test.vcf -OutPut test_noprun_2Mb -SeleVar 2 -Region Chr01:1:2000000
ShowLDSVG -InPreFix test_noprun_2Mb -OutPut test_noprun_2Mb_Rsq -ShowNum

Looking at the results I found discrepancies in the value of R2 and D' at some (not all) positions between LDBlockShow and bcftools prune annotation. Scores are shown for one marker with the previous one after annotation in bcftools prune.

I highlighted in blue the place that seemed weird to me :

bcftools_prune_example

LDBlockShow_example

It does kind of the same at the same locations with D'.
Could you explain me what happens there ? Is that supposed to occur ? I read that the computations are the same between LDBlockShow and other tools, which seems to be the case here but I wonder why there are such differences at those places.

PS : Also LDBlockShow does not show the blocks with a black outline compared to what's in the documentation.

Kind regards,
Knight

@hewm2008
Copy link
Owner

1 Please make sure that there is no problem with the program installation, especially the plink file in bin should be placed in the same directory as LDBlockShow
2 No error is reported during operation running
3 The following command can produce normal images
4 Compare with other software. I don’t know how you compare. I think there is no problem with my program.

LDBlockShow	-InVCF	vcf_test.txt	-OutPut	test_noprun_2Mb	-SeleVar	2	-Region	Chr01:1:2000000	-NoShowLDist	100000000
 zcat   test_noprun_2Mb.site.gz |  awk '{print $0, $2}' >  Site.info
ShowLDSVG -InPreFix test_noprun_2Mb -OutPut test_noprun_2Mb_Rsq -ShowNum  -NoShowLDist   100000000  -SpeSNPName   Site.info

test_noprun_2Mb_Rsq

@Knight-JChev
Copy link
Author

Knight-JChev commented Jun 28, 2024

First of all, thanks for your answer.

  1. That is the case, LDBlockShow, plink and ShowLDSVG script files are in the same bin directory.
  2. This might be the case, I'm getting a segmentation fault error which relates in plink not being able to make a specific .blocks.det file I believe. Still the program is able to make the map, even though I suppose this is the reason why the black outline is not showing.

##Start Region Cal... :Chr01 1 2000000; In This Region TotalSNP Number is 15
find blocks...
Segmentation fault (core dumped)
mv: cannot stat '/mnt/f/QTL/Variant_calling_pipeline/results/results/vcf/pruned/test_noprun_2Mb_return.plink.blocks.det': No such file or directory
gzip: /mnt/f/QTL/Variant_calling_pipeline/results/results/vcf/pruned/test_noprun_2Mb_return.blocks: No such file or directory
Warining: SVG module in Perl is missing, trying to loading the built-in [SVG.pm]...
Loading SVG module done
Start draw... SVG info: SNPNumber :15 , SVG (width,height) = (862.5,637.5)
ALL done

  1. Indeed it worked except for the black outline and it gives me the same map as I sent in the previous message
    image

  2. I will definitely try to compare with other software but I was inclined to believe bcftools informations. The command line related to bcftools +prune (see previous message) writes R2 and D' score for a variant compared to another. See the info field with POS_R2 being the variant compared to the variant of the line (which should always be the previous variant) and R2 being the R2 score between those two variants. E.g of the line highlighted in blue : variant at position 272 661 is being compared to variant at position 272 658, the R2 score resulting of this comparison is 1; as shown in the graph made with LDBlockShow. The issue to me is the R2 between 164 353 and 164 773 : bcftools output tells me it is 0.95 whereas LDBlockShow tells me it is 0.03. When I take a look at the vcf file I'm rather inclined to believe bcftools.
    image

Warm regards,
Knight

@hewm2008
Copy link
Owner

1 should be the plink 1.9 ,you can rm the other plinks ,only keep the LDBlockshow/plinks
Segmentation fault (core dumped)
mv: cannot stat '/mnt/f/QTL/Variant_calling_pipeline/results/results/vcf/pruned/test_noprun_2Mb_return.plink.blocks.det': No such file or directory
gzip: /mnt/f/QTL/Variant_calling_pipeline/results/results/vcf/pruned/test_noprun_2Mb_return.blocks: No such file or directory

2 you can try phase the vcf first. try the phase vcf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants