Manual gene annotation documentation

Brief explanation

Ensembl genes are annotated onto the genome by a combination of automatic annotation using a pipeline and manual annotation by skilled annotators. Current documentation describing the process of manual gene annotation is very limited and needs significant expansion.

Manual gene annotation is a very involved process, which uses data from a variety of sources, along with the annotators’ expert knowledge of gene structures, to determine the position of genes. Annotators follow their own internal guidelines, which may form the basis of the documentation. They use specialised software to carry out the task, which is not available to the public.

This project aims to produce documentation that allows Ensembl users to understand where their genes came from. The source of genes and the reasons for any changes in our gene models are some of the most popular topics on our email helpdesk. The documentation is not intended to help people annotate genes in Ensembl themselves.

Expected results

  • Documentation pages describing the process of manual gene annotation.
  • Images illustrating the process of manual gene annotation.

Commitment

Full-time

Mentors

Erin Haskell is an Outreach Officer for Ensembl. Her role involves supporting people who use Ensembl, both through face-to-face and online training, as well as through the helpdesk and social media channels. Her role in this project will be to offer guidance on the structure, content, and appropriate audience of the documentation.

Jonathan Mudge and Jane Loveland are Annotation Project Leaders in the Ensembl-Havana group. They are involved in managing the manual gene annotation team, and in refining and extrapolating annotation concepts throughout the Ensembl website. Their role in this project will be to ensure that the biological content of the documents is correct and in accordance with the team's scientific policies.

Sample task

Anyone interested in the project should complete this short task and send in a report on it.

The current documentation that we have on manual gene annotation can be found in two places: on the Ensembl website, and on the GENCODE website under the FAQs and Documentation links. For these pages, please consider the following points:

  1. Do you see any weaknesses or strengths in these documents in their current forms? What can you identify as missing in this documentation?
  2. If you have identified any weaknesses, which would be the most important to address and why? Provide a brief plan (bullet points are fine) of how you would go about addressing them.
  3. Looking at other genome browsers, how does their gene annotation documentation compare to that of Ensembl/GENCODE?