.crai index improvements #137

jkbonfield · 2016-03-15T16:36:53Z

This is just a list of things that could be improved, for whenever we next revise the format. (So we don't forget any). I'm not suggesting an immediate update, but to gather ideas in one place.

Magic number with version string.
Add number of reads / bases as columns. This will make very approximate coverage plots trivial as well as improve tools like samtools idxstats so they work on both BAM and CRAM. What else in idxstats needs replicating?
A generation UUID. If coupled with an identical UUID in the SAM header then we can use this to spot cases where the CRAM file has been updated without rebuilding the index. (We want to add this same feature to .BAI and .CSI too.)
Check the utility of the container size column. I think currently it is the number of remaining bytes after decoding the container header (and perhaps the compression header?). More useful for random slicing would simply be the size of the entire container.
Consider whether gzipped text is the right format. We could provide for random access on a compressed index by self-indexing the index, but that's a far larger change.
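
To make the reads / bases suggestion a bit more concrete, here is a minimal Python sketch, not a proposal for the actual layout. It assumes the current .crai is gzipped, tab-separated text with six integer columns (reference id, alignment start, alignment span, container byte offset, slice byte offset, slice size). The two trailing columns, `n_reads` and `n_bases`, are hypothetical and do not exist in any released index version; the names `CraiEntry`, `parse_crai` and `reads_per_reference` are likewise made up for illustration.

```python
import gzip
from collections import defaultdict
from typing import NamedTuple, Optional


class CraiEntry(NamedTuple):
    ref_id: int                     # reference sequence id
    aln_start: int                  # alignment start
    aln_span: int                   # alignment span
    container_offset: int           # byte offset of the container start
    slice_offset: int               # byte offset of the slice within the container
    slice_size: int                 # slice size in bytes
    n_reads: Optional[int] = None   # HYPOTHETICAL extra column
    n_bases: Optional[int] = None   # HYPOTHETICAL extra column


def parse_crai(stream):
    """Yield CraiEntry tuples from a binary stream holding gzipped .crai text."""
    with gzip.open(stream, "rt") as text:
        for line in text:
            line = line.rstrip("\n")
            if not line:
                continue
            fields = [int(f) for f in line.split("\t")]
            if len(fields) < 6:
                raise ValueError(f"expected at least 6 columns, got {len(fields)}")
            # Columns beyond the eighth are ignored in this sketch.
            yield CraiEntry(*fields[:8])


def reads_per_reference(entries):
    """Sum the hypothetical n_reads column per reference (idxstats-like)."""
    totals = defaultdict(int)
    for entry in entries:
        if entry.n_reads is not None:
            totals[entry.ref_id] += entry.n_reads
    return dict(totals)
```

With columns like these, a per-reference read count comes straight from the index without touching the CRAM file, which is roughly what an idxstats equivalent would need. Lines with only the six existing columns still parse, with the extra fields left as `None`.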

droazen · 2016-03-24T19:03:54Z

@jkbonfield This proposal seems very relevant to the following crai-related bug report in htsjdk: samtools/htsjdk#531

jkbonfield · 2017-11-06T09:35:49Z

For completeness' sake, and so we don't forget, also consider adding the "missing" meta-information to CRAM indices. Re: pysam-developers/pysam#556.

I say "missing" because at the time of writing CRAM those extra fields in BAI were non-standard and undocumented anyway.

(NB: No planned timeline for this.)

jkbonfield mentioned this issue Apr 12, 2016

CRAM v4 idea tracker #144

Open

dpryan79 mentioned this issue Nov 6, 2017

CRAM support deeptools/deepTools#619

Closed

cmnbroad mentioned this issue Nov 6, 2017

BAMIndexMetaData incorrect when index results from .crai->.bai conversion samtools/htsjdk#531

Open

daviesrob added the cram label Nov 7, 2017
