Paired REad TEXTure Mapper. Converts SAM or pairs formatted read pairs into genome contact maps. See https://github.com/4dn-dcic/pairix/blob/master/pairs_format_specification.md for pairs format specification.
Pairs format supported by version 0.04 or later only.
PretextMap is a commandline tool for converting aligned read pairs in either the SAM/BAM/CRAM or pairs format into genomic contact maps (https://github.com/aidenlab/juicer/wiki/Pre, https://higlass.io/).
Data is read from stdin over a Unix pipe, eliminating the need for any intermediate files. Alignments can be read directly from an aligner (e.g. bwa mem ... | PretextMap), from a SAM file (PretextMap < file.sam), from a BAM/CRAM file using samtools (samtools view -h file.bam | PretextMap) or from a pairs file (PretextMap < file.pairs). PretextMap can even be inserted into the middle of existing pipelines by using tee or similar pipe-chaining tricks.
PretextMap comes with no imposed pipeline for processing data. Process your alignments however you want before feeding to PretextMap.
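As a minimal sketch of the tee approach mentioned above, the helper below copies a stream to a side command through a named pipe while passing the original stream on unchanged. The function name tee_into is hypothetical and not part of PretextMap; it only assumes standard POSIX tools (mktemp, mkfifo, tee).

```shell
# tee_into SIDE_COMMAND [ARGS...]
# Hypothetical helper: copies stdin to stdout unchanged and also feeds a
# copy of the stream to SIDE_COMMAND through a named pipe.
tee_into() {
    fifo_dir=$(mktemp -d) || return 1
    fifo="$fifo_dir/stream"
    mkfifo "$fifo" || return 1
    # The side command reads the copy. Its own stdout is sent to stderr so
    # that it cannot mix into the main stream.
    "$@" < "$fifo" >&2 &
    side_pid=$!
    tee "$fifo"
    # Wait for the side command so its output file is complete on return.
    wait "$side_pid"
    status=$?
    rm -rf "$fifo_dir"
    return "$status"
}

# Intended use (not run here, since it needs samtools and PretextMap):
#   samtools view -h file.bam | tee_into PretextMap -o map.pretext | gzip > copy.sam.gz
```

In the commented usage line, PretextMap builds map.pretext while the same SAM stream continues on to gzip. If the side command exits early, tee will stop as well, so this is a starting point rather than a robust wrapper.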
All commandline Pretext tools for Unix (Linux and Mac) are available on bioconda.
The full suite of Pretext tools can be installed with
> conda install pretext-suite
Or, just PretextMap can be installed with
> conda install pretextmap
Pipe SAM or pairs formatted read pairs to PretextMap e.g.:
samtools view -h file.bam | PretextMap
zcat file.pairs.gz | PretextMap
Important: A SAM header with contig info must be present for SAM format (-h option for samtools).
Or pipe directly from an aligner e.g. bwa mem ... | PretextMap
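The SAM header requirement above can be checked before starting a long run. The sketch below assumes that "contig info" means @SQ header lines, as defined in the SAM specification; the function name has_contig_header is hypothetical.

```shell
# has_contig_header: succeeds if the SAM text on stdin contains at least one
# @SQ header line before the first alignment record.
has_contig_header() {
    awk '
        /^@SQ\t/ { found = 1; exit }   # contig (reference sequence) header line
        !/^@/    { exit }              # first alignment record: header is over
        END      { exit !found }       # status 0 only if an @SQ line was seen
    '
}

# Intended use (not run here, since it needs samtools):
#   samtools view -H file.bam | has_contig_header || echo "no @SQ lines" >&2
```

A stream produced without the -h option for samtools would fail this check, since it would start directly with alignment records.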
- -o specifies an output file (required)
- --sortby sorts contigs by length, name or nosort (default: length)
- --sortorder ascend or descend (default: descend, no effect if sortby = nosort)
- --mapq sets a minimum mapping quality filter (default: 10)
example:
> samtools view -h file.bam | PretextMap -o map.pretext --sortby length --sortorder descend --mapq 10
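Since a misspelled option value is easy to miss in a long pipeline, a wrapper script could check the sort options against the values listed above before launching anything. This is a sketch only: check_sort_options is a hypothetical helper, and how PretextMap itself reacts to an unknown value is not covered here.

```shell
# check_sort_options SORTBY SORTORDER
# Hypothetical pre-flight check using the values documented for
# --sortby (length, name, nosort) and --sortorder (ascend, descend).
check_sort_options() {
    case "$1" in
        length|name|nosort) ;;
        *) echo "invalid --sortby value: $1" >&2; return 1 ;;
    esac
    case "$2" in
        ascend|descend) ;;
        *) echo "invalid --sortorder value: $2" >&2; return 1 ;;
    esac
}

# Intended use (not run here, since it needs samtools and PretextMap):
#   check_sort_options length descend &&
#       samtools view -h file.bam | PretextMap -o map.pretext --sortby length --sortorder descend
```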
- --filterInclude: a comma separated list of sequence names, only these sequences will be included
- --filterExclude: a comma separated list of sequence names, these sequences will be excluded
example:
> samtools view -h file.bam seq_1 seq_2 | PretextMap -o map.pretext --filterInclude "seq_1, seq_2"
Filtering will increase the map resolution, since you're mapping less sequence into a fixed number of bins.
Note: additionally filtering with samtools view, as in the above example (... seq_1 seq_2), is not necessary, but is recommended purely for speed (provided your BAM file is sorted and indexed).
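The resolution effect of filtering can be made concrete with a little arithmetic. The sketch below uses a hypothetical bin count of 1000 and made-up sequence lengths; none of these numbers come from PretextMap itself.

```shell
# bp_per_bin TOTAL_BP N_BINS
# Prints how many base pairs each bin covers, rounded up.
bp_per_bin() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

N_BINS=1000             # hypothetical bin count, for illustration only
WHOLE_GENOME=500000000  # hypothetical total length of all sequences (bp)
TWO_SEQS=40000000       # hypothetical combined length of seq_1 and seq_2 (bp)

echo "unfiltered: $(bp_per_bin "$WHOLE_GENOME" "$N_BINS") bp per bin"
echo "filtered:   $(bp_per_bin "$TWO_SEQS" "$N_BINS") bp per bin"
```

With these made-up numbers each bin shrinks from 500000 bp to 40000 bp, which is the sense in which mapping less sequence into the same number of bins gives a finer map.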
- --highRes: high resolution output, only supported by PretextView >=0.2.5
Contact maps are saved in a compressed texture format (hence the name). Maps can be read by PretextView (https://github.com/wtsi-hpag/PretextView). Expect pretext map files to take around 30 to 50 MB of disk space each.
Running PretextMap requires 3 GB of RAM and 2 CPU cores.
PretextMap uses the following third-party libraries:
Building from source requires:
- clang >= 11.0.0
- meson >= 0.57.1
git submodule update --init --recursive
env CXX=clang meson setup --buildtype=release --unity on --prefix=<installation prefix> builddir
cd builddir
meson compile
meson test
meson install