-m and -M options don't make a difference #82

alj1983 · 2018-03-22T15:54:04Z

I have mapped reads with the following commands:

bowtie --phred33-quals -q -n 2 -l 10 --best --strata -y -S -k 1 -m 1 --al aligned.fastq -un unaligned.fastq Reference Query.fq > Mapping.sam
bowtie --phred33-quals -q -n 2 -l 10 --best --strata -y -S -k 1 -M 1 --al aligned.fastq -un unaligned.fastq Reference Query.fq > Mapping.sam

The first command should report only uniquely mapping reads, while the second should also report reads that map to multiple locations (reporting only one of those locations).

However, when I count the reads in the aligned.fastq files, I get exactly the same result for both mappings. What is wrong here?
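For anyone reproducing the comparison, here is a minimal sketch of the read count. Both commands above write to the same aligned.fastq and Mapping.sam, so each run needs its own output names before the counts can be compared. The helper name and the file names in the comments are hypothetical, and the count assumes four lines per FASTQ record.

```shell
# Count reads in a FASTQ file, assuming exactly four lines per record
# (no wrapped sequence or quality lines).
count_fastq_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Hypothetical file names: one --al output per run, so the second run
# does not overwrite the first before the counts are compared.
# count_fastq_reads aligned_m1.fastq
# count_fastq_reads aligned_M1.fastq
```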

ch4rr0 · 2019-05-29T21:58:38Z

We pushed a fix for this issue which should correct the behavior of -M and also make bowtie report the number of reads that were sampled (-M) or suppressed (-m). For example:

$ ./bowtie indexes/e_coli reads/e_coli_1000.fq -M1 --best -S > /dev/null
# reads processed: 1022
# reads with at least one reported alignment: 699 (68.40%)
# reads that failed to align: 301 (29.45%)
# reads with alignments sampled due to -M: 22 (2.15%)
Reported 699 alignments

$ ./bowtie indexes/e_coli reads/e_coli_1000.fq -m1 --best -S > /dev/null
# reads processed: 1000
# reads with at least one reported alignment: 677 (67.70%)
# reads that failed to align: 301 (30.10%)
# reads with alignments suppressed due to -m: 22 (2.20%)
Reported 677 alignments

If you are able to build bowtie from source, please let me know if this commit fixes this issue for you as well.

ch4rr0 · 2019-07-06T01:32:45Z

This change has been included in our latest version.

karaulanov · 2019-08-28T13:09:56Z

It seems that the new Bowtie release (1.2.3) incorrectly reports the number of "reads processed" in '-M 1' mode by counting multi-mapping reads twice (in your example, 1000 original reads are reported as 1022 reads processed), so all percentage estimates change as well. Apart from that reporting bug (or feature), I don't see any apparent differences in the actual alignments produced by Bowtie 1.2.3 versus Bowtie 1.2.2 in '-M 1' mode.
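The effect on the percentages can be checked with a little arithmetic. The sketch below recomputes the "at least one reported alignment" figure with both denominators, using only the counts from the -M1 summary quoted in this thread; `pct` is a hypothetical helper, not part of bowtie.

```shell
# Print count/total as a percentage with two decimals.
pct() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.2f%%\n", 100 * n / d }'
}

pct 699 1022   # denominator inflated by the 22 double-counted reads: 68.40%
pct 699 1000   # denominator equal to the number of input reads: 69.90%
```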

ch4rr0 · 2019-08-28T18:05:53Z

Thank you for picking up on this. It helped me realize that I never committed my complete changes for this issue. With all the changes in place here are the new outputs from the above commands:

$ ./bowtie-align-s indexes/e_coli reads/e_coli_1000.fq -M1 --best --sam-nohead  -S | wc -l
# reads processed: 1000
# reads with at least one reported alignment: 699 (69.90%)
# reads that failed to align: 301 (30.10%)
# reads with alignments sampled due to -M: 22 (2.20%)
Reported 699 alignments
    1000

$ ./bowtie-align-s indexes/e_coli reads/e_coli_1000.fq -m1 --best --sam-nohead  -S | wc -l
# reads processed: 978
# reads with at least one reported alignment: 677 (69.22%)
# reads that failed to align: 301 (30.78%)
# reads with alignments suppressed due to -m: 22 (2.25%)
Reported 677 alignments
     978

I think this should be in line with your expectations.

[EDIT: updated output with the results of including --sam-nohead and -S flags]

ch4rr0 · 2019-08-28T18:13:01Z

There's still one bug that needs fixing: the reads processed for -m1 should still be 1000. I will look into a fix for that issue.

ch4rr0 · 2019-08-28T20:28:08Z

OK, I fixed the issue and changed the summary slightly so that, in my opinion, it makes more sense.

$ ./bowtie-align-s indexes/e_coli reads/e_coli_1000.fq -m1 --best --sam-nohead  -S --threads 3 | wc -l
# reads processed: 1000
# reads with at least one alignment: 699 (69.90%)
# reads that failed to align: 301 (30.10%)
# reads with alignments suppressed due to -m: 22 (2.20%)
Reported 677 alignments
     978

$ ./bowtie-align-s indexes/e_coli reads/e_coli_1000.fq -M1 --best --sam-nohead  -S --threads 3 | wc -l
# reads processed: 1000
# reads with at least one alignment: 699 (69.90%)
# reads that failed to align: 301 (30.10%)
# reads with alignments sampled due to -M: 22 (2.20%)
Reported 699 alignments
    1000

Let me know your thoughts on this.
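One way to read the two `wc -l` results above: with --sam-nohead, the SAM output has one record per processed read, minus the reads suppressed by -m. This is inferred from the numbers in this comment rather than from the bowtie documentation, and `expected_sam_records` is a hypothetical helper for the check.

```shell
# Expected number of SAM records (header excluded), assuming every
# processed read gets a record except those suppressed by -m.
expected_sam_records() {
    processed=$1
    suppressed=$2
    echo $(( processed - suppressed ))
}

expected_sam_records 1000 22   # -m1 run: 978, matching wc -l
expected_sam_records 1000 0    # -M1 run: sampled reads are still reported, 1000
```

The -m1 figure also equals 677 reported alignments plus 301 unaligned reads.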

karaulanov · 2019-08-30T11:32:51Z

Thanks a lot for the quick feedback. The modified reporting looks good to me, except that the suppressed alignments are lost from the SAM file, which may cause problems in some cases. Ideally all input reads should be reported in the SAM file by default.

On a different note, and probably not worth opening a separate issue: I noticed that while older Bowtie versions delete all non-ACGTN bases before sequence alignment, since version 1.2.2 such bases are all converted to "A" during alignment and appear as "A" in the SAM files. This becomes relevant when, for example, one uses Bowtie "off label" to align miRNA datasets from miRBase (containing Us instead of Ts) to newly assembled genomes, resulting in spurious and misleading alignments of modified sequences. It would be good to have some documentation on this behavior, and perhaps to implement different rules, e.g. converting Us to Ts and other atypical bases to Ns (instead of As), along with warning messages to make people aware of the modifications.
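Until something like that exists, the conversion can be done as a preprocessing step. The sketch below applies the rules suggested above (U to T, any other non-ACGTN base to N) to a FASTA file before alignment. It is a workaround under stated assumptions, not Bowtie behavior: `normalize_fasta` and the file names are hypothetical, and the function uppercases sequences, so any lowercase soft-masking is lost.

```shell
# Rewrite a FASTA file so that sequence lines contain only A, C, G, T, N.
# Header lines (starting with ">") are passed through unchanged.
normalize_fasta() {
    awk '/^>/ { print; next }
         {
             seq = toupper($0)            # note: drops lowercase soft-masking
             gsub(/U/, "T", seq)          # RNA to DNA alphabet
             gsub(/[^ACGTN]/, "N", seq)   # any other atypical base becomes N
             print seq
         }' "$1"
}

# Hypothetical usage:
# normalize_fasta mirbase_mature.fa > mirbase_mature.dna.fa
```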

ch4rr0 added a commit that referenced this issue May 29, 2019

Fix issue preventing bowtie from reporting alignments when -M is spec…

53e562d

…ified #82

ch4rr0 added a commit that referenced this issue Aug 28, 2019

Complete changeset for issue #82

af0227f
