Adding Sphinx-based manual.
holtgrewe committed Jan 14, 2015
1 parent ba9a2c1 commit ef03eef
Showing 10 changed files with 332 additions and 17 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -19,3 +19,5 @@ cov-int*
*.vcf
*.vcf.idx
data
# built manual
manual/_build
45 changes: 45 additions & 0 deletions README
@@ -0,0 +1,45 @@
Building the Manual
===================

The manual is written using Sphinx [1] and hosted on ReadTheDocs [2]. We use
the ReadTheDocs theme [3] for pretty HTML output.

Prerequisites
-------------

You need to install the sphinx command in order to build the manual. The
most convenient way to do this is using the pip command in a virtualenv
environment. This section describes how to set this up.

First, install virtualenv:

# sudo apt-get install virtualenv

Then, create a virtualenv environment:

# virtualenv ~/local/virtualenv

Enable this environment:

# . ~/local/virtualenv/bin/activate

Now, you can use pip for installing the packages described in
manual/requirements.txt into ~/local/virtualenv without affecting your global
Python installation:

# pip install -r manual/requirements.txt

Create HTML Pages
-----------------

Now, create the HTML pages. The results can be found in manual/_build/html.

# cd manual
# make html

Links
-----

[1] http://sphinx-doc.org/
[2] https://readthedocs.org/
[3] https://github.com/snide/sphinx_rtd_theme
32 changes: 32 additions & 0 deletions manual/annotate_pos.rst
@@ -0,0 +1,32 @@
.. _annotate_pos:

Annotating Positions
====================

Sometimes, it is useful to annotate a single position only, for example for quick checks or for debugging purposes.
You can do this using the ``annotate-pos`` command of Jannovar.

You have to pass the path to an annotation database file and one or more chromosomal change specifiers.
Jannovar will then return the effect and the HGVS annotation for each chromosomal change.

.. code-block:: console

    # java -jar jannovar-cli-0.10.jar annotate-pos data/hg19_ucsc.ser 'chr1:12345C>A'
    [...]
    #change effect hgvs_annotation
    chr1:12345C>A INTRONIC DDX11L1:uc010nxq.1:c.38+118C>A

The format for the chromosomal change is as follows:

.. code-block:: console

    {CHROMOSOME}:{POSITION}{REF}>{ALT}

CHROMOSOME
    name of the chromosome or contig
POSITION
    position of the first changed base on the chromosome; in the case of insertions, the first base after the insertion; the first base on the chromosome has position ``1``
REF
    the reference bases
ALT
    the alternative bases
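
The specifier format above can be illustrated with a small parser. This is only a sketch and not Jannovar's own parsing code: the regular expression, the ``ChromosomalChange`` type, and the ``parse_change`` function are hypothetical names, and the sketch assumes that ``REF`` and ``ALT`` consist of the letters ``ACGTN`` only and that the chromosome name contains no colon.

```python
import re
from typing import NamedTuple


class ChromosomalChange(NamedTuple):
    """Hypothetical container for the four parts of a change specifier."""
    chromosome: str
    position: int  # 1-based, as described in the manual
    ref: str
    alt: str


# Assumption: REF and ALT use only the letters ACGTN and may be empty.
_CHANGE_RE = re.compile(r"^([^:]+):(\d+)([ACGTN]*)>([ACGTN]*)$")


def parse_change(spec: str) -> ChromosomalChange:
    """Split '{CHROMOSOME}:{POSITION}{REF}>{ALT}' into its components."""
    match = _CHANGE_RE.match(spec)
    if match is None:
        raise ValueError(f"not a valid change specifier: {spec!r}")
    chromosome, position, ref, alt = match.groups()
    if int(position) < 1:
        raise ValueError("positions are 1-based")
    return ChromosomalChange(chromosome, int(position), ref, alt)


print(parse_change("chr1:12345C>A"))
```

Such a check can be useful for validating specifiers in a script before passing them to ``annotate-pos``.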
25 changes: 25 additions & 0 deletions manual/annotate_vcf.rst
@@ -0,0 +1,25 @@
.. _annotate_vcf:

Annotating VCF Files
====================

The main purpose of Jannovar is the annotation of all variants in a VCF file.
That is, for each variant, Jannovar predicts the effect on all transcripts that can be affected by the change.
Depending on the configuration, either only the most pathogenic effect or all effects are written out.

This is done using the ``annotate`` command.
You pass the path to an annotation database and one or more paths to VCF files that are to be annotated.
For each file, the resulting annotated file is written to the current directory; its name is derived by replacing the file name suffix ``.vcf`` with ``.jv.vcf``.

For example, for annotating the ``pfeiffer.vcf`` file in the ``examples`` directory:

.. code-block:: console

    # java -jar jannovar-cli/target/jannovar-cli-0.10.jar annotate data/hg19_ucsc.ser examples/pfeiffer.vcf
    [...]
    # ls
    [...]
    pfeiffer.jv.vcf

.. note:: TODO: describe Jannovar format
.. note:: TODO: describe show-all option
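
The invocation and the naming rule described above can be sketched in a few lines of Python, for example when scripting the annotation of several files. The helper names ``annotated_name`` and ``annotate_command`` are hypothetical, and the sketch assumes plain ``.vcf`` inputs; how Jannovar names the output for other suffixes is not covered here.

```python
from pathlib import Path
from typing import List


def annotated_name(vcf_path: str) -> str:
    """Return the expected output file name for one input VCF file.

    The output is written to the current directory, so only the file
    name is kept and its suffix .vcf is replaced with .jv.vcf.
    """
    name = Path(vcf_path).name
    if not name.endswith(".vcf"):
        raise ValueError(f"expected a .vcf file: {vcf_path!r}")
    return name[: -len(".vcf")] + ".jv.vcf"


def annotate_command(jar: str, database: str, vcf_paths: List[str]) -> List[str]:
    """Build the argument list for the annotate command."""
    if not vcf_paths:
        raise ValueError("at least one VCF file is required")
    return ["java", "-jar", jar, "annotate", database, *vcf_paths]


print(annotated_name("examples/pfeiffer.vcf"))
```

The list returned by ``annotate_command`` can be passed to ``subprocess.run`` to start the annotation.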
6 changes: 4 additions & 2 deletions manual/conf.py
@@ -15,6 +15,8 @@
import sys
import os

+import sphinx_rtd_theme
+
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
@@ -101,15 +103,15 @@

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
-html_theme = 'default'
+html_theme = 'sphinx_rtd_theme'

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#html_theme_options = {}

# Add any paths that contain custom themes here, relative to this directory.
-#html_theme_path = []
+html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]

# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
50 changes: 50 additions & 0 deletions manual/download.rst
@@ -0,0 +1,50 @@
.. _download:

Downloading Transcript Databases
================================

The first step after installing Jannovar is to obtain a **transcript database**.
This database stores information about the transcripts, such as the location of a transcript and its exons, its CDS start and end position, and the transcript sequence.
There are three major sources of annotation databases for the main model organisms: (1) the UCSC genome browser, (2) the Ensembl project, and (3) the RefSeq database at NCBI.
Each database is linked to a certain release of a reference genome.

Displaying Available Databases
------------------------------

.. note:: TODO: link to writing your own INI file

Jannovar has built-in support for the human and mouse genomes in releases ``hg18``, ``hg19``, ``hg38``, ``mm9``, and ``mm10``.
For each release, the database can originate from the sources ``ucsc``, ``ensembl``, and ``refseq``.
Further, the database can be limited to the curated transcripts only when using RefSeq: ``refseq_curated``.

The genome release names and the source names are joined into database descriptors such as ``hg19/ucsc`` and ``hg38/refseq``.
You can view the built-in database names using the ``db-list`` Jannovar command:

.. code-block:: console

    # java -jar jannovar-cli/target/jannovar-cli-0.10.jar db-list
    [...]
    hg18/refseq_curated
    hg19/ucsc
    [...]

Database Download
-----------------

A database can be downloaded using the ``download`` command.
You can pass a list of database descriptors to this command.
For each, Jannovar will download the database files over the network to the directory ``data/${source}``.
This directory is created if necessary.
If a file to be downloaded already exists, Jannovar will not overwrite it.

.. note:: If downloading files fails and building the database fails later on as a result, delete the directory ``data/${source}`` and retry the download.

Finally, Jannovar will build a file with the extension ``.ser`` in the directory ``data``, e.g. ``data/hg19_ucsc.ser``.

Let us now download the RefSeq and UCSC annotations for human release *hg19*:

.. code-block:: console

    # java -jar jannovar-cli/target/jannovar-cli-0.10.jar download hg19/refseq hg19/ucsc
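
The relation between a database descriptor and the resulting ``.ser`` file can be sketched as follows. This is an illustration based on the examples in this manual (``hg19/ucsc`` yields ``data/hg19_ucsc.ser``), not Jannovar's actual implementation; the function name ``database_file`` and its ``data_dir`` parameter are hypothetical.

```python
from pathlib import PurePosixPath


def database_file(descriptor: str, data_dir: str = "data") -> str:
    """Map a descriptor such as 'hg19/ucsc' to the expected .ser path.

    Assumption: release and source are joined with an underscore,
    as in the manual's example data/hg19_ucsc.ser.
    """
    release, sep, source = descriptor.partition("/")
    if not sep or not release or not source or "/" in source:
        raise ValueError(f"expected '<release>/<source>': {descriptor!r}")
    return str(PurePosixPath(data_dir) / f"{release}_{source}.ser")


for descriptor in ("hg19/refseq", "hg19/ucsc"):
    print(descriptor, "->", database_file(descriptor))
```

A script can use such a mapping to check whether a database has already been built before calling ``download`` again.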
28 changes: 13 additions & 15 deletions manual/index.rst
@@ -1,22 +1,20 @@
-.. Jannovar documentation master file, created by
-   sphinx-quickstart on Wed Jan 14 15:41:05 2015.
-   You can adapt this file completely to your liking, but it should at least
-   contain the root `toctree` directive.
+Jannovar Manual
+===============

-Welcome to Jannovar's documentation!
-====================================
+Jannovar is a Java-based program and library for the functional annotation of VCF files.

-Contents:
+.. note:: TODO: describe writing your own INI file
+.. note:: TODO: port over stuff from old Tutorial
+.. note:: TODO: describe proxy settings
+.. note:: TODO: describe java memory settings

 .. toctree::
    :maxdepth: 2

-
-
-Indices and tables
-==================
-
-* :ref:`genindex`
-* :ref:`modindex`
-* :ref:`search`
+   quickstart
+   install
+   download
+   annotate_vcf
+   annotate_pos
+   license

80 changes: 80 additions & 0 deletions manual/install.rst
@@ -0,0 +1,80 @@
.. _install:

Installation
============

There are two ways of installing Jannovar.
The recommended way for most users is to download a prebuilt binary, which is described in the :ref:`quickstart` section.
This section describes how to build Jannovar from scratch.

Prerequisites
-------------

For building Jannovar, you will need

#. `Java JDK 6 or higher <http://www.oracle.com/technetwork/java/javase/downloads/index.html>`_ for compiling Jannovar,
#. `Maven 3 <http://maven.apache.org/>`_ for building Jannovar, and
#. `Git <http://git-scm.com/>`_ for getting the sources.
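
Before starting, you may want to check that the required tools are available. The following is a minimal sketch, assuming the usual executable names ``javac``, ``mvn``, and ``git``; the function name ``missing_tools`` is hypothetical, and the check only looks for the executables on the ``PATH`` without verifying their versions.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str] = ("javac", "mvn", "git")) -> List[str]:
    """Return the names of the given executables that are not on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Please install: " + ", ".join(missing))
else:
    print("All build prerequisites were found.")
```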

Git Checkout
------------

In this tutorial, we will download the Jannovar sources and build them in ``~/Development/jannovar``.

.. code-block:: console

    ~ # mkdir -p ~/Development
    ~ # cd ~/Development
    Development # git clone https://github.com/charite/jannovar.git jannovar
    Development # cd jannovar

Maven Proxy Settings
--------------------

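If you are behind a proxy, Maven will have problems downloading dependencies.
If you have already run into such problems, make sure to also delete ``~/.m2/repository``.
Then, execute the following commands to create ``~/.m2/settings.xml``, adjusting the example proxy host and port to your setup.
The file is only written if it does not exist yet.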

.. code-block:: console

    jannovar # mkdir -p ~/.m2
    jannovar # test -f ~/.m2/settings.xml || cat >~/.m2/settings.xml <<END
    <settings>
      <proxies>
        <proxy>
          <active>true</active>
          <protocol>http</protocol>
          <host>proxy.example.com</host>
          <port>8080</port>
          <nonProxyHosts>*.example.com</nonProxyHosts>
        </proxy>
      </proxies>
    </settings>
    END

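
If you prefer to generate the settings file from a script, the same structure can be produced with Python's standard library. This is only a sketch: the function names ``proxy_settings`` and ``write_if_missing`` are hypothetical, and the host, port, and non-proxy hosts are the placeholder values from the example above.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def proxy_settings(host: str, port: int, non_proxy_hosts: str,
                   protocol: str = "http") -> str:
    """Build a Maven settings document with a single active proxy."""
    settings = ET.Element("settings")
    proxy = ET.SubElement(ET.SubElement(settings, "proxies"), "proxy")
    for tag, text in [
        ("active", "true"),
        ("protocol", protocol),
        ("host", host),
        ("port", str(port)),
        ("nonProxyHosts", non_proxy_hosts),
    ]:
        ET.SubElement(proxy, tag).text = text
    return ET.tostring(settings, encoding="unicode")


def write_if_missing(path: Path, content: str) -> bool:
    """Write the file only if it does not exist yet, like 'test -f ... ||'."""
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)
    return True


print(proxy_settings("proxy.example.com", 8080, "*.example.com"))
```

Calling ``write_if_missing(Path.home() / ".m2" / "settings.xml", ...)`` with the generated text mirrors the shell commands above and leaves an existing file untouched.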
Building
--------

You can build Jannovar using ``mvn package``.
This will automatically download all dependencies, build Jannovar, and run all tests.

.. code-block:: console

    jannovar # mvn package

If some tests fail, you can use the ``-DskipTests=true`` parameter to skip running them (the tests are still compiled).

.. code-block:: console

    jannovar # mvn install -DskipTests=true

Creating Eclipse Projects
-------------------------

Maven can be used to generate Eclipse projects that can be imported by the Eclipse IDE.
This can be done by calling the ``mvn eclipse:eclipse`` command after calling ``mvn install``:

.. code-block:: console

    jannovar # mvn install
    jannovar # mvn eclipse:eclipse
32 changes: 32 additions & 0 deletions manual/license.rst
@@ -0,0 +1,32 @@
.. _license:

Jannovar License
================

Jannovar is licensed under the BSD License:

.. code-block:: text

    Copyright (c) 2013, Charite Universitaetsmedizin Berlin
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions are met:

    Redistributions of source code must retain the above copyright notice, this
    list of conditions and the following disclaimer.

    Redistributions in binary form must reproduce the above copyright notice, this
    list of conditions and the following disclaimer in the documentation and/or
    other materials provided with the distribution.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
    AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
    IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
    DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
    FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
    DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
    SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
    OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
49 changes: 49 additions & 0 deletions manual/quickstart.rst
@@ -0,0 +1,49 @@
.. _quickstart:

Quickstart
==========

This short How-To guides you from downloading the Jannovar program to annotating a VCF file in four steps.

#. Download the current stable release from our `GitHub project <https://github.com/charite/jannovar>`_ by clicking `here <https://github.com/charite/jannovar/releases/download/v0.11.0/jannovar-0.11.0.zip>`_.
#. Extract the ZIP archive.

   * You should find a file called ``jannovar-cli-0.11.jar`` in the ZIP archive.
   * You should also find the file ``pfeiffer.vcf`` in the folder ``examples``.

#. Download the `RefSeq <http://www.ncbi.nlm.nih.gov/refseq/>`_ transcript database for the release *hg19/GRCh37*.

   .. code-block:: console

       # java -jar jannovar-cli-0.11.jar download hg19/refseq

   This will create the file ``data/hg19_refseq.ser``, which is a self-contained transcript database that can be used for functional annotation.

#. Annotate the file ``pfeiffer.vcf`` from the ``examples`` directory.

   .. code-block:: console

       # java -jar jannovar-cli-0.11.jar annotate data/hg19_refseq.ser examples/pfeiffer.vcf

   Jannovar will now load the transcript database from ``data/hg19_refseq.ser`` and then read the file ``examples/pfeiffer.vcf``.
   Each variant in this file will be annotated with an ``EFFECT`` and an ``HGVS`` field in the VCF ``INFO`` column.
   The ``EFFECT`` field contains an effect, e.g., ``SYNONYMOUS``, and the ``HGVS`` field contains an HGVS representation of the variant.
   The result will be written out to ``pfeiffer.jv.vcf``.

   The following excerpt shows the first three variants of the ``pfeiffer.vcf`` file with their effect and HGVS annotation.

   .. code-block:: text

       1 866511 rs60722469 C CCCCT 258.62 PASS EFFECT=INTRONIC;HGVS=SAMD11:NM_152486.2:c.305+42_305+43insCCCT GT:AD:DP:GQ:PL 1/1:6,5:11:14.79:300,15,0
       1 879317 rs7523549 C T 150.77 PASS EFFECT=MISSENSE;HGVS=SAMD11:XM_005244727.1:exon9:c.799C>T:p.Arg267Cys GT:AD:DP:GQ:PL 0/1:14,7:21:99:181,0,367
       1 879482 . G C 484.52 PASS EFFECT=MISSENSE;HGVS=SAMD11:XM_005244727.1:exon9:c.964G>C:p.Asp322His GT:AD:DP:GQ:PL 0/1:28,20:48:99:515,0,794

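
To illustrate how the ``EFFECT`` and ``HGVS`` fields can be read back from an annotated file, here is a minimal sketch that parses the ``INFO`` column of one VCF data line. The function name ``parse_info`` is hypothetical, and the sketch assumes tab-separated columns in the standard VCF order; for real work, a dedicated VCF library is the safer choice.

```python
from typing import Dict

INFO_COLUMN = 7  # CHROM POS ID REF ALT QUAL FILTER INFO ...


def parse_info(vcf_line: str) -> Dict[str, str]:
    """Return the key/value pairs of the INFO column of one VCF data line."""
    columns = vcf_line.rstrip("\n").split("\t")
    if len(columns) <= INFO_COLUMN:
        raise ValueError("expected at least eight tab-separated columns")
    info = {}
    for entry in columns[INFO_COLUMN].split(";"):
        # Flags without a value are stored with an empty string.
        key, _, value = entry.partition("=")
        info[key] = value
    return info


# Second variant of the excerpt above, joined with tabs.
line = "\t".join([
    "1", "879317", "rs7523549", "C", "T", "150.77", "PASS",
    "EFFECT=MISSENSE;HGVS=SAMD11:XM_005244727.1:exon9:c.799C>T:p.Arg267Cys",
    "GT:AD:DP:GQ:PL", "0/1:14,7:21:99:181,0,367",
])
info = parse_info(line)
print(info["EFFECT"], info["HGVS"])
```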
Next Steps
----------

Of course, you can follow the other manual chapters and get more extensive information on Jannovar.
In addition, here are some external links that can help you in your understanding:

Current VCF Specification
    can be found in the **hts-specs** project on GitHub `here <https://github.com/samtools/hts-specs>`__.
HGVS Mutation Nomenclature
    is maintained by the `Human Genome Variation Society <http://www.hgvs.org/>`_, and the nomenclature can be found `here <http://www.hgvs.org/mutnomen/>`__.
