fix indel inference in ancestral reconstructions #131

metasoarous · 2017-01-27T01:56:49Z

DNAML assumes that any insertion in the tip alignment was already present at the root. Since the naive sequence is technically a tip rather than the root, our lineage alignments (where we manually place the naive at the root) lead to the inference that an insertion happened shortly after the naive sequence (the root) and then disappeared in most of the other tips (all but the tip with the actual insertion). This threw @lauranoges for a loop, and is likely to do the same to others.

I think we can fix this pretty simply: for each gap in the naive sequence, we can see which tip sequences don't share that gap (and there must be some, or the gap column would have been filtered out by this point), and place gaps in all internal node sequences except those non-tip sequences descending from the MRCA of the tips with the insertion. This would be easy enough to code up and would generally solve the problem. Thoughts @matsen?
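A minimal sketch of the masking step proposed above, not the pipeline's actual code. It assumes the tree is available as a child-to-parent mapping with the naive sequence as the root, that all sequences (tips and reconstructed internal nodes) are aligned to the same length, and that the MRCA itself keeps the insertion. The names `mask_insertions`, `parent`, and the toy node labels are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def ancestors(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Path from `node` up to the root, starting with `node` itself."""
    path = []
    cur: Optional[str] = node
    while cur is not None:
        path.append(cur)
        cur = parent[cur]
    return path


def mrca(nodes: Iterable[str], parent: Dict[str, Optional[str]]) -> str:
    """Most recent common ancestor of a non-empty collection of nodes."""
    nodes = list(nodes)
    common = set(ancestors(nodes[0], parent))
    for n in nodes[1:]:
        common &= set(ancestors(n, parent))
    # Walking up from any node, the first shared ancestor is the MRCA.
    return next(n for n in ancestors(nodes[0], parent) if n in common)


def mask_insertions(seqs, parent, naive, tips, gap="-"):
    """Gap out naive-gap columns in internal nodes outside the insertion clade."""
    masked = dict(seqs)
    internal = [n for n in seqs if n not in tips and n != naive]
    for col, ch in enumerate(seqs[naive]):
        if ch != gap:
            continue
        # Tips that do NOT share the naive gap carry the insertion.
        carriers = [t for t in tips if seqs[t][col] != gap]
        if not carriers:
            continue
        top = mrca(carriers, parent)
        for node in internal:
            # Assumption: the MRCA and its descendants keep the insertion.
            if top not in ancestors(node, parent):
                s = masked[node]
                masked[node] = s[:col] + gap + s[col + 1:]
    return masked


# Toy lineage: naive -> n1 -> (t1, n2 -> (t2, t3)); only t2 has the insertion.
parent = {"naive": None, "n1": "naive", "t1": "n1",
          "n2": "n1", "t2": "n2", "t3": "n2"}
seqs = {"naive": "AC-GT", "t1": "AC-GT", "t2": "ACTGT", "t3": "AC-GT",
        "n1": "ACTGT", "n2": "ACTGT"}  # internal nodes as a root-insertion model would give them
fixed = mask_insertions(seqs, parent, "naive", {"t1", "t2", "t3"})
```

In the toy case the insertion is confined to `t2`, so both internal nodes are gapped at that column. If `t3` also carried the insertion, `n2` would keep it and only `n1` would be gapped.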

matsen · 2017-01-28T01:33:43Z

Not totally agreed. I agree that it's an issue, but the problem is hard.

I think the easiest way to get a result is to use gap coding, which might work well. Here is a paper which appears to do something related but more sophisticated.
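To make "gap coding" concrete, here is a simplified presence/absence sketch. The comment does not say which coding scheme is meant, so this is only one plausible reading: each distinct gap interval in the alignment becomes a binary character, scored `1` for sequences with exactly that gap and `0` otherwise. The resulting matrix could then be reconstructed on the tree like any other discrete character. The function names are hypothetical.

```python
from typing import Dict, List, Tuple


def gap_runs(seq: str, gap: str = "-") -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of consecutive gap characters."""
    runs = []
    start = None
    for i, ch in enumerate(seq):
        if ch == gap and start is None:
            start = i
        elif ch != gap and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(seq)))
    return runs


def code_gaps(alignment: Dict[str, str]):
    """Code each distinct gap interval as one presence/absence character."""
    per_seq = {name: set(gap_runs(s)) for name, s in alignment.items()}
    characters = sorted(set().union(*per_seq.values()))
    matrix = {
        name: "".join("1" if c in runs else "0" for c in characters)
        for name, runs in per_seq.items()
    }
    return characters, matrix


alignment = {
    "naive": "AC---GT",
    "t1":    "AC---GT",
    "t2":    "ACTTAGT",
    "t3":    "ACT--GT",
}
characters, matrix = code_gaps(alignment)
```

Here `characters` holds the two distinct gap intervals and `matrix` gives each sequence a two-character binary string. More careful schemes also mark a character as inapplicable when a longer gap spans it; that refinement is left out here.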

matsen · 2017-03-26T22:14:53Z

I think we should move to PRANK.

PRANK has smart ways of dealing with fast and slow rates, and treats indels in a phylogenetic fashion (https://paperpile.com/shared/dSsCir)
PRANK can codon align (should be better than protein align and backtrans align)
PRANK can do ancestral sequence reconstruction where indels are treated "properly" (heuristically, but a lot better than treating gaps as missing data)

@metasoarous sorry, but I'm going to hand this off to you to try this out!

metasoarous · 2017-03-31T22:12:12Z

Sorry!? Hah! Begone demon Phylip!

I'm on it.

matsen · 2017-03-31T22:16:37Z

Er, sorry, but PRANK isn't going to infer the trees for you...

Now that I'm thinking about it, @cswarth had some funny experiences with the PRANK ancestral sequence reconstruction. Is that right, Chris?

cswarth · 2017-03-31T22:31:11Z

We used PRANK to infer ancestral sequences and trees for PREAST:

https://github.com/matsengrp/PREAST/blob/master/bin/infer.sh#L157

I don't recall the specifics of how it came up with a tree.

metasoarous · 2017-03-31T22:46:32Z

Thanks @cswarth

@matsen Looks like it can infer and spit out its own guide trees via the -showtree option, but perhaps they're not really to be trusted. Then again, in our situation, maybe they're just as trustworthy as anything else we're looking at.

In any case, if we supply our own trees, we can use PRANK just for the ancestral reconstruction and for cleaning up the alignment (obviously we'd already need an alignment to produce the input tree), and this would free us up to choose something saner for the tree construction, yes?
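A rough sketch of what that invocation might look like, built as an argument list so it can be inspected before running. The flags used (`-d`, `-t`, `-o`, `-showanc`, `-keep`) are given as I understand them from PRANK's help text and should be checked against `prank -help` for the installed version. The helper name and file names are hypothetical.

```python
from typing import List


def prank_ancestral_cmd(alignment_fasta: str, tree_newick: str, out_prefix: str,
                        keep_alignment: bool = False,
                        prank_bin: str = "prank") -> List[str]:
    """Build a PRANK command line for ancestral reconstruction on a fixed tree."""
    cmd = [
        prank_bin,
        f"-d={alignment_fasta}",  # input sequences
        f"-t={tree_newick}",      # our own tree, instead of PRANK's guide tree
        f"-o={out_prefix}",       # prefix for output files
        "-showanc",               # also write ancestral sequences
    ]
    if keep_alignment:
        # Assumption: -keep tells PRANK to leave the input alignment as is.
        cmd.append("-keep")
    return cmd


cmd = prank_ancestral_cmd("lineage.fasta", "lineage.nwk", "lineage_prank")
```

The list can be passed to `subprocess.run(cmd, check=True)`. Leaving `keep_alignment` off lets PRANK realign, which matches the "cleaning up the alignment" use described above.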

metasoarous · 2017-04-25T23:38:34Z

@matsen So how should we do this? If we want to use the ancestral sequences, we need to already have the final tree, so that the ancestral seqs correspond to that topology. But then what do we use for the alignment going into that tree? Do you think it's fine to just subset the big muscle alignment, and feed that into dnaml? Or is it worth taking those sequences and aligning them with a preliminary run of PRANK first, to get a better final tree? (And then do a second round of PRANK after to get the ancestral sequences?)

matsen · 2017-04-25T23:48:39Z

We could do either strategy. IIUC the alignment problem isn't especially hard, right? The challenge here is to get ancestral sequences on the tree in the presence of indels.

metasoarous · 2017-05-03T20:55:41Z

Ug... well, here's some sour apples. PRANK complains with the likes of "Problem with the guidetee: brackets (79,79) and commas (210) don't match)" when we pass in a parsimony tree. Presumably this is because we have multifurcations :-/ I guess I can translate the multifurcations into a series of bifurcations, and then infer back? I guess we'll see how our ML/parsimony battle pans out. Maybe we won't need to worry about this. For now I'll just try to get things working with ML.
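A minimal sketch of the multifurcation-to-bifurcation translation mentioned above, using a small hypothetical `Node` class rather than any particular tree library. Each node with more than two children is resolved by joining children under new zero-length branches, so the added nodes can be recognised and collapsed again afterwards. In a fully bifurcating rooted Newick string the number of commas equals the number of bracket pairs, which is presumably the property PRANK is checking (an assumption based on the error message).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str = ""
    length: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def resolve(node: Node) -> Node:
    """Return a copy of the tree in which every internal node has at most two children."""
    kids = [resolve(c) for c in node.children]
    while len(kids) > 2:
        # Join the first two children under a new zero-length branch.
        joined = Node(length=0.0, children=kids[:2])
        kids = [joined] + kids[2:]
    return Node(node.name, node.length, kids)


def to_newick(node: Node) -> str:
    """Serialise a subtree to Newick (without the trailing semicolon)."""
    s = ""
    if node.children:
        s = "(" + ",".join(to_newick(c) for c in node.children) + ")"
    s += node.name
    if node.length is not None:
        s += f":{node.length}"
    return s


# A four-way polytomy at the root, as a parsimony tree might contain.
tree = Node(children=[Node("naive", 0.1), Node("a", 0.2),
                      Node("b", 0.3), Node("c", 0.4)])
before = to_newick(tree) + ";"
after = to_newick(resolve(tree)) + ";"
```

The order in which siblings are paired is arbitrary here. If the naive sequence needs to stay as an outgroup, the pairing would have to be chosen with that in mind.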

matsen · 2017-05-03T23:38:42Z

👍 for continuing with ML.

metasoarous · 2017-07-26T18:44:54Z

As discussed elsewhere, PRANK's ancestral state reconstruction appears to be a joke (harhar...). As @krdav and I thoroughly demonstrated to ourselves, the internal node sequences are all mismatched from the tree. @krdav has opened an issue for this here: ariloytynoja/prank-msa#16.

For now, I'm going to put this issue on ice, in case they fix things or we find another way around the problem. I will, however, take it off the MB release milestone.

PS Thanks again for all your work on this @krdav!

matsen self-assigned this Feb 14, 2017

metasoarous mentioned this issue Mar 9, 2017

alternatives to dnapars #149

Closed

matsen assigned metasoarous and unassigned matsen Mar 26, 2017

metasoarous mentioned this issue Apr 26, 2017

replace default tree building program (dnaml) with raxml-ng #170

Closed

matsen modified the milestone: MB release May 1, 2017

metasoarous removed this from the MB release milestone Jul 26, 2017

metasoarous mentioned this issue Jul 26, 2017

Tree branch support #130

Closed

metasoarous removed their assignment Oct 21, 2018

metasoarous mentioned this issue Apr 12, 2019

Ancestral sequence reconstruction #7

Closed

matsen closed this as completed May 29, 2019

lauradoepker mentioned this issue May 29, 2019

Indel handling in QA255.157-Vk #277

Closed
