
No add species with complete deletions #14

Open
YellowStar96 opened this issue Nov 17, 2022 · 4 comments

Comments

@YellowStar96

YellowStar96 commented Nov 17, 2022

Dear Hiller,
Thanks for sharing this useful code and these resources!
I'm running an analysis of non-coding elements. When I used "Maf2SpanningSeq_PRANK.perl" to reconstruct the ancestral alignment, I found that 2 species with complete deletions in a .maf file (which represents one non-coding element) were not recognized, and the reconstructed FASTA did not contain these 2 species. The following is the empty information.
image

My command lines are as follows:

  • mafAddIRows Chr22.sort.maf Hsap.2bit Chr22.addI.maf
  • mafIndex ./Chr22.addI.maf ./Chr22.addI.maf.bb -chromSizes=Chrom.sizes
  • mafExtract Chr22.addI.maf.bb test.maf -regionList=test.bed -leaveEdgeMeta
  • ~/Software/ForwardGenomics/scripts/Maf2SpanningSeq_PRANK.perl test.maf test -runPrank -treeFile tree.nh -keepTemporaryFiles -BDBFile testBDB -twoBitPath <path to 2bit folder>

I was wondering why the species with complete deletions were not included in the PRANK analysis, and how to solve that.

Thank you for your time and help,

Best Regards,
Xin Huang

@MichaelHiller

Hi Xin,

Please check whether the 2 'missing' species that have a complete deletion are included in the maf file, with an e line for each maf block. The script reads the species from the maf file, not from the tree.

If the species are contained in the maf file, please send them to me to debug.
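One way to run this check is a short script that tallies, per species, how many s, i and e lines appear in the extracted maf. The sketch below is not part of ForwardGenomics. It assumes the usual MAF layout, where each block starts with an `a` line and the second field of an s/i/e line is a source name of the form `species.chrom`. The function names `species_line_types` and `missing_species` are made up for illustration.

```python
from collections import defaultdict


def species_line_types(maf_lines):
    """Return (number_of_blocks, {species: {line_type: count}}).

    Only 'a', 's', 'i' and 'e' lines are inspected; everything else
    (headers, 'q' lines, blank lines) is ignored.
    """
    n_blocks = 0
    counts = defaultdict(lambda: defaultdict(int))
    for line in maf_lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "a":
            n_blocks += 1
        elif fields[0] in ("s", "i", "e") and len(fields) > 1:
            # Assumption: source names look like "species.chrom".
            species = fields[1].split(".", 1)[0]
            counts[species][fields[0]] += 1
    return n_blocks, {sp: dict(c) for sp, c in counts.items()}


def missing_species(maf_lines, expected):
    """Species from `expected` that have no s, i or e line at all."""
    _, counts = species_line_types(maf_lines)
    return sorted(set(expected) - set(counts))
```

Reading test.maf with `open("test.maf")` and passing the file handle to `species_line_types` gives the block count and the per-species tallies. A species that is in the tree but absent from the result has no line of any kind in the maf, so the script cannot know about it.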

Michael

@YellowStar96
Author

Hi Hiller,
Thanks for your prompt reply.
I would like to know how to add the "e line" for each maf block. Are there any scripts or tools that I can use? My .maf file currently contains only "s lines" and "i lines" (added by mafAddIRows).

Best regards,
Xin Huang

@MichaelHiller

mafAddIRows adds i and e lines.
But maybe there was no chain/net that spans your element in these two genomes.
This can happen if you have one scaffold that stops before the element and another scaffold that starts downstream and there is nothing that aligns to the element.
In this case, I would consider the element as missing from the genome assembly. It could be a rearrangement and your element is truly lost from the genome OR (and this is more likely unless you have a highly complete genome) there is simply a region missing from your assembly.

If that is the case, mafAddIRows would not add anything for the species.

If you believe the element is truly lost in both species, you could manually add the two species to the prank output or the maf.
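For the prank-output route, a minimal sketch of the manual step is to append an all-gap row for each missing species to the aligned FASTA. This assumes the file is a plain aligned FASTA in which every sequence has the same length, and that the species names you pass in match the headers used elsewhere in your pipeline. Whether downstream steps accept all-gap rows is not confirmed in this thread. The helper names are hypothetical.

```python
def read_fasta(lines):
    """Parse FASTA lines into a list of (name, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def add_gap_rows(records, missing):
    """Append an all-gap row for every species in `missing` not yet present."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("expected a non-empty alignment with equal-length rows")
    width = lengths.pop()
    present = {name for name, _ in records}
    extra = [(sp, "-" * width) for sp in missing if sp not in present]
    return records + extra


def format_fasta(records, line_width=60):
    """Render records as FASTA text, wrapping sequences at `line_width`."""
    out = []
    for name, seq in records:
        out.append(">" + name)
        out.extend(seq[i:i + line_width] for i in range(0, len(seq), line_width))
    return "\n".join(out) + "\n"
```

Only do this when you are confident the element is truly deleted. As noted above, a missing assembly region looks the same in the maf, and an all-gap row would then encode a loss that never happened.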

@YellowStar96
Author

Dear Hiller,
Thank you very much for your patient reply again. I will check my .maf file again.

Best regards,
Xin Huang
