Benchmark new GATK germline sub workflow for accuracy and precision #21

Closed
skchronicles opened this issue Jun 26, 2024 · 1 comment

@skchronicles
Contributor

No description provided.

@skchronicles skchronicles self-assigned this Jun 26, 2024
@skchronicles
Contributor Author

Results: Sample HG002

Benchmarking command:

# Extract sample HG002 from the
# multi-sample VCF: hap.py does
# not support multi-sample VCFs,
# and our truth set is for HG002.
module purge
module load bcftools
bcftools view \
    -s HG002 \
    snps_and_indels_recal_refinement_variants.vcf.gz \
    -o HG002.snps_indels.haplotypecaller.recal.refined.vcf.gz \
    -Oz

# Run hap.py to evaluate the 
# performance of the new sub-
# workflow & HaplotypeCaller.
module purge
module load singularity
singularity run -B "$PWD" /path/to/hap.py_latest.sif \
    /opt/hap.py/bin/hap.py \
        --threads 12 \
        -o HG002_benchmarking_results \
        -r Homo_sapiens_assembly38.fasta \
        -f truthset/HG002_GRCh38_1_22_v4.1_draft_benchmark.bed \
        truthset/HG002_GRCh38_1_22_v4.1_draft_benchmark.vcf.gz \
        HG002.snps_indels.haplotypecaller.recal.refined.vcf.gz

Benchmarking summary from hap.py:

 Type Filter  TRUTH.TOTAL  TRUTH.TP  TRUTH.FN  QUERY.TOTAL  QUERY.FP  QUERY.UNK  FP.gt  METRIC.Recall  METRIC.Precision  METRIC.Frac_NA  METRIC.F1_Score  TRUTH.TOTAL.TiTv_ratio  QUERY.TOTAL.TiTv_ratio  TRUTH.TOTAL.het_hom_ratio  QUERY.TOTAL.het_hom_ratio
INDEL    ALL       526124    521041      5083       958334      5739     408419   1302       0.990339          0.989564        0.426176         0.989951                     NaN                     NaN                   1.528212                   2.024520
INDEL   PASS       526124    520564      5560       944090      4666     395693   1285       0.989432          0.991492        0.419126         0.990461                     NaN                     NaN                   1.528212                   1.986641
  SNP    ALL      3365379   3339562     25817      3979159     37213     600838   4292       0.992329          0.988985        0.150996         0.990654                2.099711                1.952641                   1.580978                   1.745183
  SNP   PASS      3365379   3233487    131892      3521165      6736     279410    996       0.960809          0.997922        0.079352         0.979014                2.099711                2.053189                   1.580978                   1.618381
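
For anyone sanity-checking the table, the metric columns can be recomputed from the raw counts. The sketch below assumes hap.py's usual definitions as I understand them: recall is TRUTH.TP / (TRUTH.TP + TRUTH.FN), precision is computed on the query side after excluding QUERY.UNK calls (those outside the confident regions), and Frac_NA is QUERY.UNK / QUERY.TOTAL. The `happy_metrics` function is a hypothetical helper written for this comment, not part of hap.py.

```shell
# Hypothetical helper: recompute hap.py-style metrics from raw counts.
# Arguments: TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK
happy_metrics() {
    awk -v tp="$1" -v fn="$2" -v qtotal="$3" -v fp="$4" -v unk="$5" 'BEGIN {
        # Query calls that matched the truth set (assumed definition):
        # everything that is neither unknown nor a false positive.
        query_tp  = qtotal - unk - fp
        recall    = tp / (tp + fn)
        precision = query_tp / (query_tp + fp)
        frac_na   = unk / qtotal
        f1        = 2 * precision * recall / (precision + recall)
        printf "recall=%.6f precision=%.6f frac_na=%.6f f1=%.6f\n",
               recall, precision, frac_na, f1
    }'
}

# INDEL / ALL row from the summary table
happy_metrics 521041 5083 958334 5739 408419
# prints: recall=0.990339 precision=0.989564 frac_na=0.426176 f1=0.989951
```

Under these definitions, the recomputed values for the INDEL / ALL row agree with the METRIC.Recall, METRIC.Precision, METRIC.Frac_NA, and METRIC.F1_Score columns reported above.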
