Ensembl aims to release a new website in 2020 for exploring genomic data. With it, we’d like to produce a regularly updated manual that is available alongside the website, describing the data available and how to access it. This project aims to identify the technologies needed for such a manual, produce a logical structure for the manual, and create a style guide.
Ensembl holds vast amounts of genomic data of many types, with different methods for accessing it. The documentation should cover what the data is, where it comes from and how we process it, plus how to use the different tools to work with it. This needs to be organised into a logical structure to make it easily accessible for our users.
We release new software and data several times a year, so this manual would need to be dynamic. Writers need to be alerted when changes occur that need to be reflected in the documentation. Previous suggestions for this include using a ghost browser to take screenshots, then producing alerts when image analysis software indicates differences between old and new screenshots.
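The screenshot-comparison idea can be sketched roughly as follows. This is a minimal illustration, not an existing Ensembl tool: it assumes screenshots have already been captured and decoded into grids of RGB pixels by some other step, and the function names, the `threshold` value and the page names are all hypothetical.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Image = List[List[Pixel]]      # rows of pixels; decoding is assumed to happen elsewhere


def changed_fraction(old: Image, new: Image, tolerance: int = 0) -> float:
    """Return the fraction of pixels differing by more than `tolerance` in any channel."""
    # Screenshots with different dimensions are treated as fully changed.
    if len(old) != len(new) or any(len(a) != len(b) for a, b in zip(old, new)):
        return 1.0
    total = sum(len(row) for row in old)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for row_old, row_new in zip(old, new)
        for p_old, p_new in zip(row_old, row_new)
        if any(abs(a - b) > tolerance for a, b in zip(p_old, p_new))
    )
    return changed / total


def pages_needing_review(
    old_shots: Dict[str, Image],
    new_shots: Dict[str, Image],
    threshold: float = 0.01,  # hypothetical cut-off: flag pages with >1% changed pixels
) -> List[str]:
    """Return page names whose screenshots changed, or that were added or removed."""
    flagged = []
    for page in sorted(set(old_shots) | set(new_shots)):
        if page not in old_shots or page not in new_shots:
            flagged.append(page)  # page appeared or disappeared between releases
        elif changed_fraction(old_shots[page], new_shots[page]) > threshold:
            flagged.append(page)
    return flagged
```

The list returned by `pages_needing_review` could then feed whatever alerting channel the writers prefer. A real implementation would probably need a more forgiving comparison, since small rendering differences between releases may not require a documentation change.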
The documentation will be written by many individuals within and outside the Ensembl project, with varying technical ability. The structure will guide the writers to ensure they are writing relevant and complete documentation, while the style guide will ensure consistency of voice throughout the documentation.
- Identification of suitable technologies for storing, presenting and updating the documentation.
- A general structure listing the sections of the manual noting their approximate content.
- A style guide for the documentation, including how to use highlighting for different types of elements, diagram labelling and spelling.
- Sample documentation on topics that will be unchanged in the new website.
Full-time
- Emily Perry is in charge of the Ensembl Outreach team. Her team deal with help emails, training, online help and documentation: anything that involves interacting with people who use Ensembl and helping them along. Her role in this project will be to help define users' documentation needs and to explain the content where necessary.
- Andrea Winterbottom is the Ensembl web designer, in charge of designing the new website. Her role in this project will be in discussing the structure of the new website and its documentation needs, as well as thinking about templates and styles for documentation.
- Andy Yates is the Genomics Technology Infrastructure Group Leader, who oversees work involved in the Ensembl website and its presentation to the public. He will ensure any documentation meets our needs and expectations.
We have a huge raft of documentation on Ensembl data, plus contextual help that describes the content of Ensembl web-pages and can be accessed by clicking on help links (?) from the appropriate web-page. All this is rather nebulous and has been added piecemeal over time. We would like to create a user manual that brings all this together into a single coherent structure. It would need to be organised into an online tree of documentation that could be accessed through Ensembl, with the contextual help still available from the appropriate web-pages, and it should also be suitable for publication somewhere like Wellcome Open Research.
To do this, we would need a tree of the correct documentation to be produced, identifying how this would be organised both in web format and in journal format. This would involve auditing the current documentation to work out what we already have and what is missing. Each piece of documentation can be assessed by the level it is pitched at and by its purpose: whether it provides context explaining what something is, a tutorial explaining how to use it, or a reference guide listing what is there.
To produce this documentation, we would need to determine technologies that will allow us to write documentation, preferably in an easy-to-write format such as Markdown, so that it can go onto both web-pages and journal format without needing to maintain two copies of the text. We would also need to produce labelled diagrams, so would need to find suitable software for drawing and labelling diagrams.
Ensembl produces releases every 2-3 months, so it would be necessary for the documentation to be easy to maintain and update. A mechanism for highlighting changes where pages need updating would be useful.
The documentation will be written and maintained by a number of different people, so it will be necessary to define the style. Examples of things that might be included in a style guide are:
- Spelling of ambiguous words, such as orthologue/ortholog, RNA-seq/RNA seq/RNAseq.
- Usage of synonymous or nearly synonymous words, such as variant, SNP, mutation, polymorphism, SNV.
- Formatting of particular kinds of features in the text, such as italics on species and gene names, capitalisation of headers, use of format or quote marks to indicate words to click on or type in.
- Paper references.
- For images, how do we label them?
It will be necessary to identify all the aspects of this, including commonly used words, and consult with the team on what the preferred style is.
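Once preferred spellings are agreed, they could be checked automatically. The sketch below is one possible approach, not an agreed tool: the `PREFERRED` table is purely illustrative (the team has not chosen between orthologue/ortholog or the RNA-seq variants), and `find_style_issues` is a hypothetical helper name.

```python
import re
from typing import Dict, List, Tuple

# Illustrative only: the preferred spellings here are assumptions,
# pending the team's decision on the style guide.
PREFERRED: Dict[str, List[str]] = {
    "orthologue": ["ortholog"],
    "RNA-seq": ["RNA seq", "RNAseq"],
}


def find_style_issues(
    text: str, preferred: Dict[str, List[str]] = PREFERRED
) -> List[Tuple[int, str, str]]:
    """Return (line number, word found, preferred spelling) for each discouraged spelling."""
    issues = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for good, variants in preferred.items():
            for bad in variants:
                # Whole-word, case-insensitive match; plurals are not handled here.
                pattern = r"\b" + re.escape(bad) + r"\b"
                for match in re.finditer(pattern, line, flags=re.IGNORECASE):
                    issues.append((line_no, match.group(0), good))
    return issues


if __name__ == "__main__":
    sample = "The ortholog was found by RNAseq.\nRNA-seq and orthologue are fine."
    for line_no, found, good in find_style_issues(sample):
        print(f"line {line_no}: '{found}' -> prefer '{good}'")
```

A table like this only covers spelling variants. Choices between near-synonyms such as variant, SNP and SNV depend on meaning, so they would still need a writer's judgement.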
Here is the Guardian and Observer style guide for reference. They have a lot of words to define because they cover many more topics; we will need far fewer.
Anyone interested in the project should complete this short task and send a report in with it.
This page of documentation is contextual help that belongs with the Region in Detail view. For this page please lay out:
- What sort of documentation is this? Who is it for and how might they use it?
- If this page of documentation was a section of an Ensembl manual, what super-header would you imagine that it would come under? What other things would you expect to come in that section?
- What elements in this page would require a style guide? Make a list of all the items where a style has been chosen and describe that style succinctly.
- Critique this page. Is it clear? What is missing?