Leaf names are abbreviated, I think #138

Closed
lauradoepker opened this issue Feb 2, 2017 · 6 comments

@lauradoepker

Hi @metasoarous,
I think the leaf names are abbreviated on the trees, maybe so that the mf %s can be there too? Anyway, when I try to look up the original sequence for a given leaf name, that exact name doesn't exist, but longer sequence names that contain the leaf name do. Did I explain the problem well enough?

Just curious if you're aware of this. Not a current problem for us, but should be considered for future platforms.

@metasoarous
Member

It shouldn't be doing this as far as I know. Can you point to some specific examples? If the sequence names are truncated, the corresponding links would likely be broken; is that something you're seeing?

@metasoarous
Member

This might be because of Phylip trimming sequence names? There is trimming going on elsewhere (for the seeds, right?), but that ends up getting reversed. It's possible we just need to add reversal or some other more general coding scheme to get around the limitation. Thoughts @wsdewitt?
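For illustration, one "more general coding scheme" could look like the sketch below. This is only a sketch under assumptions, not what the pipeline actually does: the helper names `encode_names` and `decode_newick` and the `s000000000` placeholder format are made up here. The idea is to swap every sequence name for a unique placeholder that fits in the 10 characters strict PHYLIP allows, run the tree program on the placeholders, and then map them back in the resulting Newick string.

```python
import re

PHYLIP_NAME_WIDTH = 10  # strict PHYLIP reserves 10 characters for a name


def encode_names(names):
    """Map each original name to a unique placeholder of at most 10 characters.

    The placeholder format (an "s" followed by 9 digits) is a made-up
    convention for this sketch.
    """
    forward = {name: f"s{i:09d}" for i, name in enumerate(names)}
    assert all(len(short) <= PHYLIP_NAME_WIDTH for short in forward.values())
    reverse = {short: name for name, short in forward.items()}
    return forward, reverse


def decode_newick(newick, reverse):
    """Swap placeholders in a Newick string back to the original names.

    Placeholders that are not in the table are left untouched.
    """
    return re.sub(
        r"\bs\d{9}\b",
        lambda match: reverse.get(match.group(0), match.group(0)),
        newick,
    )
```

Because every name gets its own placeholder, nothing depends on which sequence is the seed. One caveat: original names containing Newick-special characters such as `:`, `(`, or `,` would need quoting when they are put back into the tree.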

@metasoarous mentioned this issue Feb 15, 2017
@metasoarous
Member

@lauranoges I'm guessing you're back from your trip? Would you mind taking a look at this again to see if you can find any specific examples of this?

@lauradoepker
Author

I just looked through stoat:5002 and I can no longer find any examples, so let's close this issue. Sorry to cause a fuss! @metasoarous

@metasoarous
Member

@lauranoges No worries! It may have been an older version of the code messing things up. Better to raise the issue and check than let the bugs sneak under the rug :-)

@metasoarous
Member

@lauranoges It looks like when seed sequences cluster together, the stuff Will wrote to fix up the sequence names only fixes the "main" seed sequence, so the other seed sequence name does come out the other end trimmed. This is thwarting some of my work on #99 and prompted a push on #149, but unfortunately neither of the alternatives we looked at in #149 seems to be panning out well. Not sure yet whether I'm going to patch over this downstream of dnaml/dnapars for #99 specifically or fix it once and for all at the source, but in any case, this issue should be reopened.
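To make the "patch downstream" option concrete, here is a rough sketch of restoring trimmed names by prefix lookup. It is not the existing fix-up code: the function name `restore_trimmed` and the example names are hypothetical, and it assumes the tree program simply keeps the first 10 characters of each name. Each trimmed leaf name is matched against all original names rather than just the main seed, and a leaf whose prefix matches more than one original is reported as unresolved instead of guessed.

```python
def restore_trimmed(leaf_names, original_names, width=10):
    """Map each trimmed leaf name to its original name, or None if ambiguous.

    Assumes names were cut to their first `width` characters.
    """
    by_prefix = {}
    for name in original_names:
        by_prefix.setdefault(name[:width], []).append(name)

    restored = {}
    for leaf in leaf_names:
        candidates = by_prefix.get(leaf, [])
        # Only accept an unambiguous match; collisions and misses stay None.
        restored[leaf] = candidates[0] if len(candidates) == 1 else None
    return restored


# Made-up names for illustration only.
originals = ["seed_alpha_long_name", "sequence_0001_a", "sequence_0001_b"]
result = restore_trimmed(["seed_alpha", "sequence_0"], originals)
```

Here `seed_alpha` resolves to its full name, while `sequence_0` stays unresolved because two originals share that prefix. That collision case is why fixing the names at the source is the more robust of the two options.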

3 participants