4.2.6.0
Download release: gatk-4.2.6.0.zip
Docker image: https://hub.docker.com/r/broadinstitute/gatk/
Highlights of the 4.2.6.0 release:
- Important bug fixes for the joint calling tools (GenotypeGVCFs / GenomicsDB)
  - GATK 4.2.5.0 contained two joint genotyping bugs that are now fixed in GATK 4.2.6.0:
    - `GenotypeGVCFs` can throw NullPointerExceptions in some cases with many alternate alleles.
    - The expectation-maximization component of the QUAL calculation was disabled, leading to false positive, low quality alleles at some multi-allelic sites.
  - If you are running these tools in 4.2.5.0 we strongly recommend updating to 4.2.6.0
- Fixed a "Bucket is a requester pays bucket but no user project provided" error that occurred when accessing requester pays buckets in Google Cloud Storage even when the `--gcs-project-for-requester-pays` argument was specified
  - If you continue to encounter problems accessing requester pays Google Cloud Storage buckets in 4.2.6.0, please let us know by filing a GitHub issue!
- Two new tools for the Structural Variation calling pipeline: `SVAnnotate` and `PrintSVEvidence`
- Some fixes to genotype-given-alleles mode in `HaplotypeCaller` and `Mutect2`
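
For readers who script their joint-calling runs, here is a minimal Python sketch of assembling a `GenotypeGVCFs` command line that passes `--gcs-project-for-requester-pays`. The helper name `build_genotype_gvcfs_command`, the bucket paths, and the project ID are hypothetical placeholders, not part of GATK. The sketch only builds and prints the argument list; it does not run GATK.

```python
import shlex


def build_genotype_gvcfs_command(reference, variants, output, requester_pays_project=None):
    """Build an argument list for a GATK GenotypeGVCFs run (hypothetical helper)."""
    cmd = ["gatk", "GenotypeGVCFs", "-R", reference, "-V", variants, "-O", output]
    if requester_pays_project:
        # Project to bill when reading from requester pays buckets.
        cmd += ["--gcs-project-for-requester-pays", requester_pays_project]
    return cmd


# All paths and the project ID below are placeholders.
command = build_genotype_gvcfs_command(
    reference="gs://example-bucket/reference.fasta",
    variants="gs://example-requester-pays-bucket/cohort.g.vcf.gz",
    output="cohort.vcf.gz",
    requester_pays_project="example-billing-project",
)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` on a machine where `gatk` 4.2.6.0 is on the `PATH`.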
Full list of changes:
- Joint Calling (GenotypeGVCFs / GenomicsDB)
  - GATK 4.2.5.0 contained two joint genotyping bugs which are now fixed in 4.2.6.0:
    - `GenotypeGVCFs` can throw NullPointerExceptions in some cases with many alternate alleles.
      - Fixed in:
        - Fix for `NullPointerException` when GenomicsDB has more ALT alleles than specified maximum and many GQ0 hom-ref genotypes allow variants to pass the QUAL filter (#7738)
    - The expectation-maximization component of the QUAL calculation was disabled, leading to false positive, low quality alleles at some multi-allelic sites.
      - Fixed in:
        - Fix multi-allelic QUAL calculation and restore some missing ALT annotation data in `ReblockGVCFs` (#7670)
  - Mention acceptable compressed VCF file extensions in `GenomicsDBImport` error message (#7692)
- SV Calling
  - Added a new tool `SVAnnotate` (#7431)
    - `SVAnnotate` adds functional annotations for SVs called by `GATK-SV` (#7431)
  - Added a new tool `PrintSVEvidence` (#7695)
    - `PrintSVEvidence` is a tool that can merge any number of files containing one of five types of evidence of structural variation. It's also capable of subsetting regions or samples. It's used to merge evidence from a cohort in the `GATK-SV` pipeline.
  - Added start/end coordinate validation to `SVCallRecord` (#7714)
- HaplotypeCaller / Mutect2
  - Fixed an edge case in `HaplotypeCaller` where filtered alleles in the vicinity of forced-calling alleles could result in empty calls (#7740)
    - This affects users who run genotype given alleles mode in non-GVCF mode
  - Fixed a bug in `HaplotypeCaller` and `Mutect2` where force-calling alleles were lost upon trimming; allele injection now happens after trimming (#7679)
  - Added a debug `--pair-hmm-results-file` argument that dumps the exact inputs/outputs of the PairHMM to a file (#7660)
  - Some changes to `Mutect2` to support the future `Mutect3` (#7663)
    - Added training data for the Mutect3 normal artifact filter
    - Output tensors for Mutect3 as plain text rather than VCF
- RNA Tools
  - `TransferReadTags`: a new tool that transfers a read tag from an unaligned BAM to the matching aligned BAM (#7739).
    - This tool allows us to retrieve read tags that get lost when converting a SAM file to FASTQs, then back to SAM (which is necessary if e.g. running fastp to clip adapter bases before alignment).
  - `PostProcessReadsForRSEM`: a new tool that re-orders and filters reads before running RSEM, which has stringent requirements on the input SAM (https://github.com/deweylab/RSEM) (#7752).
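
The idea behind transferring a read tag can be illustrated with a small Python sketch. This is not how the GATK tool is implemented. Reads are modeled as plain dictionaries, and matching by read name is an assumption made for illustration. The function name `transfer_read_tag` and the example tag `RX` are likewise chosen only for this example.

```python
def transfer_read_tag(unaligned_reads, aligned_reads, tag):
    """Copy `tag` from each unaligned read to the aligned read with the same name.

    Reads are dicts of the form {"name": str, "tags": dict}. The inputs are not modified.
    """
    # Index the tag values of the unaligned reads by read name.
    tag_by_name = {
        read["name"]: read["tags"][tag]
        for read in unaligned_reads
        if tag in read["tags"]
    }
    result = []
    for read in aligned_reads:
        updated = {**read, "tags": dict(read["tags"])}
        if read["name"] in tag_by_name:
            updated["tags"][tag] = tag_by_name[read["name"]]
        result.append(updated)
    return result


# The unaligned read still carries the tag; the aligned read lost it
# in the SAM -> FASTQ -> SAM round trip.
unaligned = [{"name": "read1", "tags": {"RX": "ACGT"}}]
aligned = [{"name": "read1", "tags": {"NM": 0}}]
restored = transfer_read_tag(unaligned, aligned, "RX")
print(restored[0]["tags"])
```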
- Funcotator
  - Added custom `VariantClassification` severity ordering. (#7673)
    - Users can now customize the severity ratings of the various `VariantClassifications` using the new `--custom-variant-classification-order` argument
  - Added logging statements to the b37 conversion process explaining why the automatic b37 conversion does or does not take place on the user's VCFs (#7760)
- VariantRecalibrator
  - Added regularization to covariance in GMM maximization step to fix convergence issues in `VariantRecalibrator` (#7709)
    - This makes the tool more robust in cases where annotations are highly correlated
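
To see why regularizing a covariance matrix helps with highly correlated annotations, consider this minimal Python sketch. It uses diagonal loading (adding a small constant to the diagonal), which is one common form of regularization. Whether GATK uses exactly this form is an assumption, and the value of `epsilon` is illustrative. The helper names are hypothetical.

```python
def regularize_covariance(cov, epsilon=1e-3):
    """Return a copy of a square covariance matrix with epsilon added to its diagonal."""
    return [
        [value + (epsilon if i == j else 0.0) for j, value in enumerate(row)]
        for i, row in enumerate(cov)
    ]


def determinant_2x2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two perfectly correlated annotations give a singular covariance matrix,
# which cannot be inverted when evaluating a Gaussian density.
singular = [[1.0, 1.0], [1.0, 1.0]]
regularized = regularize_covariance(singular)

print(determinant_2x2(singular))         # zero: not invertible
print(determinant_2x2(regularized) > 0)  # True: invertible after regularization
```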
- Bug Fixes
  - Fixed a "Bucket is a requester pays bucket but no user project provided" error that occurred when accessing requester pays buckets in Google Cloud Storage even when `--gcs-project-for-requester-pays` was specified (#7700) (#7730)
  - Fix for the `PossibleDeNovo` annotation to work without Genotype Likelihoods (#7662)
    - `PossibleDeNovo` previously checked each trio's genotypes (including parent hom ref genotypes) for likelihoods even though it doesn't actually use the PLs. The PLs can get dropped if GVCFs are reblocked, which meant this annotation no longer worked as expected. This changes the check to look for GQs instead of PLs, as the GQs are used as part of the annotation.
  - Fixed a bug where the `--mate-too-distant-length` argument in `MateDistantReadFilter` was not configurable (#7701)
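
The `PossibleDeNovo` change can be pictured with a short Python sketch. It is a simplified illustration, not GATK's code: genotypes are modeled as dictionaries, and the function name `trio_has_field` is hypothetical. It shows how a trio whose parents lost their PLs through reblocking fails a PL-based check but passes a GQ-based one.

```python
def trio_has_field(trio, field):
    """Return True if every genotype in the trio carries a value for `field`."""
    return all(genotype.get(field) is not None for genotype in trio.values())


# Parent hom-ref genotypes have no PLs, as can happen after reblocking.
trio = {
    "child": {"GT": "0/1", "GQ": 99, "PL": [120, 0, 150]},
    "mother": {"GT": "0/0", "GQ": 60},
    "father": {"GT": "0/0", "GQ": 55},
}

print(trio_has_field(trio, "PL"))  # False: a PL-based check skips this trio
print(trio_has_field(trio, "GQ"))  # True: a GQ-based check can annotate it
```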
- GATK Engine
- Miscellaneous Changes
  - Added back the `jcenter` repository resolver to our Gradle build, fixing a "Could not find biz.k11i:xgboost-predictor:0.3.0" error when building GATK from source (#7665)
  - We now properly update the `latest` tag in the `broadinstitute/gatk-nightly` Docker Hub repo (#7703)
  - The docker build now only does a `git lfs pull` on `src/main/resources/large` (#7727)
  - Install git lfs with --force in the `Dockerfile` (#7682)
  - Fix WDL generation for `MultiVariantWalkers` by adding a companion index to the `MultiVariantWalker` input variant arg (#7689)
  - Added a Google Apps Script to automatically update GATK release stats. (#7637)
  - Updated the GATK stats script to be more universally usable (#7759)
  - Added `JointCallExomeCNVs` to `.dockstore.yml` and included a note in the WDL (#7719)
- Documentation
  - Corrected the docs for the `--heterozygosity` argument in the `GenotypeCalculationArgumentCollection` (#7661)
- Dependencies