IMG Omics Annotation WDL

Setup

Requirements

Cromwell (and associated Java dependencies)
Input binaries, databases, etc. specified in the inputs.json file

Docker image

This pipeline uses a docker image. You can see the dockerfile and latest tag(version) here:
https://hub.docker.com/repository/docker/bfoster1/img-omics

Input

Input options are listed in the inputs.json file and include locations of input databases, locations of necessary executables, and Boolean variables to toggle run options on and off.

Currently, the inputs have some executables' paths written in using the absolute system paths on Cori. Additionally, it is assumed that there is a bin directory at the same level as the wdl files which contains numerous scripts for different parts of the annotations. On Cori at NERSC, this directory is /global/dna/projectdirs/microbial/omics/gbp/img-annotation-pipeline/bin and can be copied here into this directory to be used in a relative way.

The workflow is written to take in a metagenome or isolate FASTA which is split into some number of smaller files. The directory structure looks like:

/path/to/splits/<split number>/file.fna

The splits should go from 1 to N, and N should be specified in inputs.json as the variable metagenome_annotation.num_splits. The workflow assumes that all FASTA splits have the same file name. So, for example, if we have an example FASTA example.fna split into 3, the directory structure would look like:

/path/splits/1/example.fna
/path/splits/2/example.fna
/path/splits/3/example.fna

where "example" should be set as the value of the variable annotation.imgap_project_id and /path/splits/ should be set as the value of the variable annotation.imgap_input_dir.

Instructions

Edit inputs.json to pick whichever parts of the annotation pipeline(s) should be run. Make sure that the locations of the input databases and executables are correctly set as well.

The annotation workflow is structured:

|annotation
|- setup
|- structural annotation
|-- pre-qc
|-- trnascan
|-- rfam
|-- crt
|-- prodigal
|-- genemark
|-- gff_merge
|-- fasta_merge
|-- gff_and_fasta_stats
|-- post-qc
|- functional annotation
|-- ko_ec
|-- smart
|-- cog
|-- tigrfam
|-- superfamily
|-- pfam
|-- cath_funfam
|-- signalp
|-- tmhmm
|-- product_name_assign

Run the workflow with the command:

java -jar <Cromwell> annotation.wdl -i inputs.json

in this directory.

Name		Name	Last commit message	Last commit date
Latest commit History 114 Commits
OldJsons		OldJsons
Scripts		Scripts
omics		omics
.gitignore		.gitignore
README.md		README.md
annotation.wdl		annotation.wdl
cmds		cmds
cromwell-cori.conf		cromwell-cori.conf
cromwell-cori.slurm.conf		cromwell-cori.slurm.conf
cromwell-lbl.slurm.conf		cromwell-lbl.slurm.conf
crt.json		crt.json
crt.wdl		crt.wdl
functional-annotation.wdl		functional-annotation.wdl
genemark.wdl		genemark.wdl
inputs.shifter.brian		inputs.shifter.brian
marcel.144.lrcs.clean.json		marcel.144.lrcs.clean.json
marcel.2.lrcs.clean.json		marcel.2.lrcs.clean.json
marcel.4.lrcs.clean.json		marcel.4.lrcs.clean.json
prodigal.wdl		prodigal.wdl
rfam.wdl		rfam.wdl
shifter.conf		shifter.conf
singularity.conf		singularity.conf
singularity_exec.sh		singularity_exec.sh
split.1.lrcs.refdata.json		split.1.lrcs.refdata.json
split.1.shard.json		split.1.shard.json
split.2.compare.json		split.2.compare.json
split.50.compare.json		split.50.compare.json
splits.tar		splits.tar
structural-annotation.wdl		structural-annotation.wdl
test.json		test.json
trnascan.wdl		trnascan.wdl

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IMG Omics Annotation WDL

Setup

Requirements

Docker image

Input

Instructions

About

Releases

Packages

Contributors 3

Languages

kellyrowland/img-omics-wdl

Folders and files

Latest commit

History

Repository files navigation

IMG Omics Annotation WDL

Setup

Requirements

Docker image

Input

Instructions

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages