alternatives to dnapars #149

Closed
matsen opened this issue Mar 9, 2017 · 19 comments

Comments

@matsen
Contributor

matsen commented Mar 9, 2017

If we become frustrated with dnapars, here are two alternatives:

The cool thing about POY is that it can use indels as informative about the tree, and doesn't require a sequence alignment step. I couldn't get it to work in 30 mins.

PAUP* does ancestral sequence reconstruction according to the manual. It's an old standby.

@metasoarous
Member

If we become frustrated?

From POY docs:

Phylogenetic Application written in OCaml and C

Awesome.

It would definitely be nice to get the indel inference features of POY for #131. Though we'd still have to do a preliminary alignment and FastTree for the tree pruning step.

As much as I've come to fear/loathe dna{ml|pars}, I don't see a compelling reason to switch to PAUP at the moment. For now, we've more or less resolved the name length issues. But if we have further difficulties and still can't get POY to work, it's something to consider.

I'm going to Icebox this for now, but may pull it out if time comes up for looking more at #131.

@metasoarous metasoarous mentioned this issue Mar 10, 2017
@metasoarous
Member

Mmm... that didn't take long...

Getting bitten again by dnaml/phylip issues (and mosquitoes), so looking lustily towards poy at the moment. Spent 20 minutes or so trying to get it to compile but no dice; issues with malloc.h header. @matsen were you able to compile it at least, or is this where you were hung up as well? Should we bug the folks at SciComp or see if a wizard like @bcclaywell could compile it?

@matsen
Contributor Author

matsen commented Mar 12, 2017

I know that you're all excited because it's written in OCaml, but did you notice that the last commit was in mid 2015? In fact real development stopped much earlier than that, when Andres Varon left for Jane Street Capital (I know this crew reasonably well). Last I heard from the PI, a few summers ago, they were starting from scratch in Haskell.

So perhaps I shouldn't have suggested it in the first place.

I think that we can do just fine with PAUP. It's quite actively developed, if by an old-schooler, and we can handle indels via gap coding with the GapMode flag. Generating NEXUS files is pretty wacky, but at least, for the most part, we only have to write them. We can consume them using DendroPy.
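
For reference, writing the NEXUS input could look roughly like the sketch below. It only uses the standard library and emits a plain DATA block from an already-aligned set of sequences. The function name `to_nexus` and the sequence names are made up for illustration, and it assumes taxon names are simple tokens (no spaces or punctuation that NEXUS would require quoting). The PAUP side is deliberately left out: a PAUP block that sets the gap mode would still need to be appended, and its exact syntax should be checked against the PAUP manual.

```python
def to_nexus(seqs):
    """Render aligned sequences (dict of name -> sequence) as a NEXUS DATA block.

    Assumes names are simple tokens that need no NEXUS quoting (hypothetical
    simplification for this sketch).
    """
    if not seqs:
        raise ValueError("no sequences given")
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    nchar = lengths.pop()
    # Pad names to a common width so the matrix columns line up.
    width = max(len(name) for name in seqs)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(seqs)} NCHAR={nchar};",
        # '-' is declared as the gap symbol so gaps stay distinct from missing data.
        "  FORMAT DATATYPE=DNA MISSING=? GAP=-;",
        "  MATRIX",
    ]
    for name, seq in seqs.items():
        lines.append(f"    {name.ljust(width)}  {seq}")
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy alignment; names and sequences are invented for the example.
    print(to_nexus({"naive": "ACG-T", "leaf1": "ACGAT"}))
```

A file written this way should be readable back with DendroPy via `dendropy.DnaCharacterMatrix.get(path=..., schema="nexus")`, though that round trip is not exercised here.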

In any case we should keep in mind that parsimony is some sort of stopgap until we can figure out something better.

@metasoarous
Copy link
Member

metasoarous commented Mar 12, 2017

Don't get me wrong. I do think it's cool that it's written in OCaml. And I think it's even cooler that they're rewriting it in Haskell (even if I think rewrites are an antipattern). But I was more excited by the indel approach. However, based on the commit history and the installation difficulties, I agree that it's not worth it, especially considering PAUP has GapMode as a stopgap. (Now if it had meant I'd be able to write OCaml/Haskell, I might be putting up a fight... But honestly I don't really care that much. For the record.)

Interesting overlap with the Jane Street crew! I've heard of them.

Anyway... NEXUS may not be my favorite format, but it sounds like heaven compared to the 💩🎂 we've endured from dnaml/dnapars. So yeah... I'll take a look. But, what do you think the timeline is on "something better"? I can still hack around what we have if the promised land isn't too far off, and it will be less work to do so than to make the switch. The real cost is in continuing to hack around the 💩🚿. Thoughts there?

@matsen
Contributor Author

matsen commented Mar 12, 2017

The promised land involves intractable likelihoods and is a research project that we haven't even started yet. So yeah, absolute minimum is a year. Two, probably.

@metasoarous
Member

OK; That's rather what I figured. Let's do this!

@metasoarous metasoarous self-assigned this Mar 12, 2017
@matsen
Contributor Author

matsen commented Mar 12, 2017

Does PAUP do likelihood ancestral sequence reconstruction as well?

@metasoarous
Member

It looks like it...

[screenshot not preserved]

@matsen
Contributor Author

matsen commented Mar 12, 2017

Great!

@metasoarous
Member

So sad... While PAUP does seem to do ancestral reconstructions, there doesn't seem to be a really good way to export them. This is all I can get it to spit out:

[screenshot of PAUP output not preserved]

There's also this phylobabble thread from 2 years ago posting the same problem (not encouraging).

Also note that all of the trees produced seem to have ?---------A as the root character inference. Which is... wacky.

I could maybe write some code to reconstruct an ancestral state alignment from these ascii trees, but it wouldn't be pretty, and is probably more trouble than it's worth.

Unless things start looking up here somehow, it seems we're stuck hacking around phylip for the moment :-/ Sad!

@matsen
Contributor Author

matsen commented Mar 14, 2017

Whoa! Crazy.

I've emailed the author.

@metasoarous
Member

Any news @matsen? I'm going to put this on ice for now till we hear more or figure out an alternative.

@matsen
Contributor Author

matsen commented Mar 21, 2017

No news. I sent the author a note asking him to reply on phylobabble, with no success.

@metasoarous
Member

As mentioned in #170, we're considering another tool for ancestral reconstruction (PRANK) that would take a fixed tree and use it to infer sequence ancestry. If that works well, we will have the freedom to look at using PAUP again, without having to worry about the ancestral reconstruction. This may simplify #130.

@matsen
Contributor Author

matsen commented May 22, 2017

Just a reminder that if we go this route we'll want to validate it in the same way as Christian and Will are doing now.

@metasoarous
Member

As has been stated elsewhere, this is being put on ice until we can figure out what is going on with PRANK (see updates on #131). The work for RAxML is more or less done, and still in the code via a switch. But we shouldn't activate it until the PRANK bugs are resolved.

@metasoarous metasoarous removed this from the MB release milestone Jul 27, 2017
@metasoarous
Member

As mentioned in #170, we may want to look at http://www.iqtree.org/. See usage link in issues.

@metasoarous
Member

As @krdav updated us in #170, IQ-Tree seems to be doing a good job, and is a lot easier and faster to work with than dnaml, so when the time is right, I imagine we'll make the switch.

@metasoarous
Member

Update... looks like IQ-Tree isn't quite as hot for ancestral sequence reconstruction as we initially thought, upon further investigation from @krdav. As a result, we're bailing and putting this issue back on ice.
