Provide User Guide to Common Warnings #217

Closed
DarioS opened this issue Jun 2, 2019 · 4 comments

Comments

@DarioS

DarioS commented Jun 2, 2019

Could there be a guide written about the meanings of common GRIDSS log warnings and what to do about them? For example, does

WARNING	2019-06-01 20:34:27	SAMRecordUtil	SA tag of read A00121:72:HG2WFDSXX:3:1125:21215:36980 refers to missing alignments [chr9,91714567,+,60S18I72M,36,18]

mean that one needs to start from the beginning and align the FASTQ files differently?

I also think that the next two warnings look contradictory:

WARNING	2019-06-02 04:12:00	SortingCollection	There is not enough memory per file for buffering. Reading will be unbuffered.
WARNING	2019-06-02 04:13:54	AsyncBufferedIterator	Thread interrupt received whilst closing underlying iterator

If the reading is unbuffered, why is there a warning soon after from AsyncBufferedIterator?

It would be convenient if there was an option that caused log messages to be output once at the beginning of a module and once at the end (less verbose). It's hard to spot problems with many messages of the form:

INFO	2019-06-01 19:43:40	SinglePassSamProgram	Processed   845,000,000 records.  Elapsed time: 01:36:19s.  Time for last 1,000,000:    9s.  Last read position: chr10:44,296,406

cluttering the log.

@d-cameron
Member

> mean that one needs to start from the beginning and align the FASTQ files differently?

It usually means you performed GATK indel realignment. GATK indel realignment doesn't update the SA tags of split read alignments so the records for the split read alignment become internally inconsistent, hence the warning.
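For reference, each SA entry follows the SAM specification layout `rname,pos,strand,CIGAR,mapQ,NM`, with entries separated by semicolons. A minimal Python sketch of splitting the value from the first warning in this issue into named fields; the `SaEntry` and `parse_sa` names are illustrative only and are not part of GRIDSS or htsjdk:

```python
from typing import List, NamedTuple


class SaEntry(NamedTuple):
    """One alignment listed in an SA tag (hypothetical helper type)."""
    rname: str
    pos: int      # 1-based leftmost position
    strand: str   # "+" or "-"
    cigar: str
    mapq: int
    nm: int       # edit distance


def parse_sa(value: str) -> List[SaEntry]:
    """Split an SA tag value (without the 'SA:Z:' prefix) into entries."""
    entries = []
    for chunk in value.split(";"):
        if not chunk:  # the value normally ends with a trailing ';'
            continue
        rname, pos, strand, cigar, mapq, nm = chunk.split(",")
        entries.append(SaEntry(rname, int(pos), strand, cigar, int(mapq), int(nm)))
    return entries


# The alignment named in the first warning quoted in this issue.
print(parse_sa("chr9,91714567,+,60S18I72M,36,18;"))
```

The warning says that no record of the same read carries exactly this position, strand and CIGAR.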

> If the reading is unbuffered, why is there a warning soon after from AsyncBufferedIterator?

The sorting step is unbuffered. The two messages apply to different steps (the log can get even more confusing when multiple threads interleave their messages).

> It's hard to spot problems with many messages cluttering the log.

This one's a bit messier as I reuse the htsjdk ProgressLogger and it doesn't look like it supports conditionally logging just the progress indicators. I'd have to add an option, then go through every progress logger and add conditional logging code to keep these out. Even then, I wouldn't be able to stop the progress logging from the places where I call out to Picard code (e.g. CollectGridssMetrics just adds extra metrics to Picard's CollectMultipleMetrics).
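Until such an option exists, one workaround is to filter the log after the fact. A minimal Python sketch, assuming progress lines always look like the `INFO ... Processed N records.` line quoted in this issue; the `is_progress_line` and `filter_log` names are made up for illustration and are not GRIDSS or htsjdk options:

```python
import re
from typing import Iterable, Iterator

# Level, timestamp, logger class, then the "Processed N records." message.
# The layout is inferred from the log lines quoted in this issue (assumption).
PROGRESS = re.compile(
    r"^INFO\s+\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\s+\S+\s+Processed\s+[\d,]+\s+records\."
)


def is_progress_line(line: str) -> bool:
    """True for INFO lines that look like a progress update."""
    return PROGRESS.match(line) is not None


def filter_log(lines: Iterable[str]) -> Iterator[str]:
    """Yield every log line except progress updates."""
    return (line for line in lines if not is_progress_line(line))


sample = [
    "INFO\t2019-06-01 19:43:40\tSinglePassSamProgram\tProcessed   845,000,000 records.  "
    "Elapsed time: 01:36:19s.  Time for last 1,000,000:    9s.  Last read position: chr10:44,296,406\n",
    "WARNING\t2019-06-02 04:12:00\tSortingCollection\tThere is not enough memory per file "
    "for buffering. Reading will be unbuffered.\n",
]
for kept in filter_log(sample):
    print(kept, end="")  # only the WARNING line is printed
```

The same function can be applied to a file object instead of the `sample` list, since it only iterates over lines. Filtering afterwards leaves the original log intact in case the progress lines are needed later.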

Leaving this issue open so I can put these in FAQ/wiki.

@DarioS
Author

DarioS commented Jun 7, 2019

I didn't use GATK to process the files but I merely piped the output of bwa mem into samtools' view and sort. Then, the BAM file was simply input to GRIDSS. I picked one of the warnings and used grep on the BAM file in the GRIDSS working directory (not the user input BAM file) to get all of the relevant lines. I can't identify what the issue is because both of the SA tags do have the alignment described present on another line.

A00121:72:HG2WFDSXX:3:1126:21567:10222	393	chr2	32916296	0	1S40M109S	=	32916296	0	TGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGTGGGGGGGAGCTGTGGATTGATTGGTCGTTTGTATTGTATGTGAAGGAAATGAAAACATATAATAAATAAAATAAAATAAATAAAAAAATTTTTATGAAATAACTACTAAAAAAAAAAA	,,FF:,,,,,FF,,:::::,,,,,,,:,,,:,,FF,:,,,,,,F::F,,,,,,,,,,,,,,,,,,,,,,:,,,,,,,,,,,:,:,,,,,,,,,,,,,,:,,,:,FF,,,:,F,,,,::F:,,,,,,,,,,,,,:,,F,,,,,,:,,,,,,	Q2:Z:FFFFFFFFFFFFFFFFFFFF:FFFFFFFF:FFFFF,:FFFFF:FFFFFFF::,FF::F,,FF:,F,,,FFF,F,FF:F:,FF:,FF,,FFF,:,F,FF::F,,F:::,F,,,,:,F,,,,,,,,,,,,,,,,,,,,,,,:,,FFF:FFF:	R2:Z:ACGGCGACCACCGAGATCTACACCTACAGTTACACTCTTTCCCTACACGACGCTCTTCCGATCGATCGGAAGAGCACACGTCTTAACTCCAGTCACAATCCGGAATCTCGTGTGGCGGCGGCTTGGTGGGAACTGGGGGGGGGGGGGGGG	SA:Z:chr20,35113547,+,78S12M2D29M1D25M1D6M,4,15;	MD:Z:30G9	NM:i:2	AS:i:37	XS:i:36
A00121:72:HG2WFDSXX:3:1126:21567:10222	69	chr20	35113547	0	*	=	35113547	0	ACGGCGACCACCGAGATCTACACCTACAGTTACACTCTTTCCCTACACGACGCTCTTCCGATCGATCGGAAGAGCACACGTCTTAACTCCAGTCACAATCCGGAATCTCGTGTGGCGGCGGCTTGGTGGGAACTGGGGGGGGGGGGGGGG	FFFFFFFFFFFFFFFFFFFF:FFFFFFFF:FFFFF,:FFFFF:FFFFFFF::,FF::F,,FF:,F,,,FFF,F,FF:F:,FF:,FF,,FFF,:,F,FF::F,,F:::,F,,,,:,F,,,,,,,,,,,,,,,,,,,,,,,:,,FFF:FFF:	Q2:Z:,,FF:,,,,,FF,,:::::,,,,,,,:,,,:,,FF,:,,,,,,F::F,,,,,,,,,,,,,,,,,,,,,,:,,,,,,,,,,,:,:,,,,,,,,,,,,,,:,,,:,FF,,,:,F,,,,::F:,,,,,,,,,,,,,:,,F,,,,,,:,,,,,,	R2:Z:TGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGTGGGGGGGAGCTGTGGATTGATTGGTCGTTTGTATTGTATGTGAAGGAAATGAAAACATATAATAAATAAAATAAAATAAATAAAAAAATTTTTATGAAATAACTACTAAAAAAAAAAA	MC:Z:78S12M2D29M1D25M1D6M	MQ:i:4	AS:i:0	XS:i:0
A00121:72:HG2WFDSXX:3:1126:21567:10222	137	chr20	35113547	4	78S12M2D29M1D25M1D6M	=	35113547	0	TGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGTGGGGGGGAGCTGTGGATTGATTGGTCGTTTGTATTGTATGTGAAGGAAATGAAAACATATAATAAATAAAATAAAATAAATAAAAAAATTTTTATGAAATAACTACTAAAAAAAAAAA	,,FF:,,,,,FF,,:::::,,,,,,,:,,,:,,FF,:,,,,,,F::F,,,,,,,,,,,,,,,,,,,,,,:,,,,,,,,,,,:,:,,,,,,,,,,,,,,:,,,:,FF,,,:,F,,,,::F:,,,,,,,,,,,,,:,,F,,,,,,:,,,,,,	Q2:Z:FFFFFFFFFFFFFFFFFFFF:FFFFFFFF:FFFFF,:FFFFF:FFFFFFF::,FF::F,,FF:,F,,,FFF,F,FF:F:,FF:,FF,,FFF,:,F,FF::F,,F:::,F,,,,:,F,,,,,,,,,,,,,,,,,,,,,,,:,,FFF:FFF:	R2:Z:ACGGCGACCACCGAGATCTACACCTACAGTTACACTCTTTCCCTACACGACGCTCTTCCGATCGATCGGAAGAGCACACGTCTTAACTCCAGTCACAATCCGGAATCTCGTGTGGCGGCGGCTTGGTGGGAACTGGGGGGGGGGGGGGGG	SA:Z:chr2,32916296,+,1I40M109S,0,2;	XA:Z:chr4,+11503989,77S9M2I32M3I11M1I14M1D1M,16;chr11,+41717896,77S4M1D6M1I32M3I11M1I7M1D5M1D3M,16;chr3,+41378765,77S9M2I33M1D13M2I14M,17;chr8,-120189456,16S36M1I6M1D13M78S,9;chr13,-93031734,12M1I12M4I23M1D10M1I3M84S,14;	MD:Z:4A4A2^AA1A27^T1A0A0A1A1A0A6A2A6^T6	NM:i:15	AS:i:43	XS:i:39

For these, the warning message is

WARNING	2019-06-03 01:55:56	SAMRecordUtil	SA tag of read A00121:72:HG2WFDSXX:3:1126:21567:10222 refers to missing alignments [chr2,32916296,+,1I40M109S,0,2]

Do you observe anything problematic?
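One way to check this mechanically is to compare every SA entry against the position, strand and CIGAR of the other records of the same read. The sketch below does that for the three records above, reduced to the fields that matter (flag, reference, position, CIGAR, SA value). It is an illustration of the consistency check the warning describes, not GRIDSS's actual implementation, and the `Rec` and `missing_sa_targets` names are hypothetical:

```python
from typing import List, NamedTuple, Set, Tuple

Key = Tuple[str, int, str, str]  # (rname, pos, strand, cigar)


class Rec(NamedTuple):
    """Only the SAM fields needed for the check (hypothetical helper type)."""
    flag: int
    rname: str
    pos: int
    cigar: str
    sa: str  # SA tag value without the 'SA:Z:' prefix, or "" if absent


def strand(flag: int) -> str:
    """Strand implied by the SAM flag (0x10 = reverse strand)."""
    return "-" if flag & 0x10 else "+"


def missing_sa_targets(records: List[Rec]) -> List[str]:
    """Return SA entries that match no mapped record of the same read."""
    present: Set[Key] = {
        (r.rname, r.pos, strand(r.flag), r.cigar)
        for r in records
        if not r.flag & 0x4  # skip unmapped records
    }
    missing = []
    for r in records:
        for entry in filter(None, r.sa.split(";")):
            rname, pos, sa_strand, cigar, _mapq, _nm = entry.split(",")
            if (rname, int(pos), sa_strand, cigar) not in present:
                missing.append(entry)
    return missing


# The three records quoted above, abbreviated to the relevant fields.
records = [
    Rec(393, "chr2", 32916296, "1S40M109S", "chr20,35113547,+,78S12M2D29M1D25M1D6M,4,15;"),
    Rec(69, "chr20", 35113547, "*", ""),
    Rec(137, "chr20", 35113547, "78S12M2D29M1D25M1D6M", "chr2,32916296,+,1I40M109S,0,2;"),
]
print(missing_sa_targets(records))  # ['chr2,32916296,+,1I40M109S,0,2']
```

With an exact comparison, the entry that fails is the one named in the warning: the SA tag lists CIGAR 1I40M109S, whereas the chr2 record carries 1S40M109S. Whether that single-character difference is what GRIDSS reacts to is an assumption here, not something confirmed in this thread.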

It seems that modifying the frequency of progress messages isn't feasible, so their current frequency is understandable.

d-cameron added a commit that referenced this issue Jul 15, 2019
…d warnings should now all be gone when GRIDSS is run from the driver script.
@d-cameron
Member

Keeping open as common errors (eg SA tag mismatches) are not yet properly documented.

@d-cameron
Member

Closing since GRIDSS will now fix SA tag mismatches.
