I have rerun the annotations pipeline on the newest annotations.
Recent Ensembl releases introduce a change: the GTF still uses unversioned "ENSPXXX" identifiers, whereas the .pep. and .cdna. files now carry transcript/protein identifiers with a version suffix, e.g. "ENSPXXX.1". Because these tables are combined at multiple steps, the identifiers no longer match and we need to find a solution:
Options:
Remove ".1" suffix when loading these databases (I have implemented this)
Create Additional Columns
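For illustration, the first option could look roughly like the sketch below. This is not the code from the pull request; `strip_version` and `parse_fasta_ids` are hypothetical names, and the sketch assumes the version suffix is always a dot followed by digits at the end of the identifier, and that the identifier is the first whitespace-delimited token of the FASTA header.

```python
import re

# Assumption: a version suffix is "." followed by one or more digits
# at the very end of the identifier (e.g. "ENSP00000354687.2").
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_version(identifier):
    """Return the identifier without a trailing ".<digits>" suffix.

    Identifiers without such a suffix are returned unchanged.
    """
    return _VERSION_SUFFIX.sub("", identifier)


def parse_fasta_ids(lines):
    """Yield unversioned identifiers from the header lines of a FASTA file.

    Hypothetical helper: takes any iterable of lines and looks only at
    lines starting with ">"; sequence lines are skipped.
    """
    for line in lines:
        if not line.startswith(">"):
            continue
        fields = line[1:].split()
        if fields:  # ignore empty headers
            yield strip_version(fields[0])
```

Stripping at load time keeps the join keys identical to the GTF identifiers, at the cost of discarding the version information; the second option (additional columns) would preserve it.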
While looking at this I found another error that used to fail the pipeline. Pipeline_annotations relies on the peptide2cdna methods, which in turn rely on an ancient library written by Andreas: alignment_light. The library still exists but has been massively rewritten. Given that the cdna/peptide functionality has not been used, I have taken all of these functions and methods out, as they would need to be rewritten from scratch if they are needed at all (what was their purpose?). The cdna fasta is now made from Ensembl cdna data only.
Hi @jscaber, thanks, taking these out is fine. They were mostly useful for gene-prediction tasks and comparative genomics, which are not so much our focus now.
See Pull Request #318.
See also the Pull Request in CGATPipelines: CGATOxford/CGATPipelines#312