micronota is an open-source, BSD-licensed package to annotate microbial genomes and metagenomes.
As Python 3 matures and the majority of Python packages support it, the scientific Python community favors dropping Python 2 compatibility. micronota therefore supports only Python 3, which keeps its dependencies few and avoids maintaining Python 2 legacy code.
micronota can annotate multiple features, including coding genes, prophages, CRISPRs, tRNAs, rRNAs and other ncRNAs. It has a customizable framework to integrate additional tools and databases. Generally, annotation falls into two categories: structural and functional. Structural annotation identifies the genetic elements on the sequence; functional annotation assigns functions to those elements.
To install the latest release of micronota:
conda install micronota
Or you can install through pip:
pip install micronota
To install the latest development version:
pip install git+https://github.com/biocore/micronota.git
To download the TIGRFAM files and format them so that micronota can read them:
micronota database prepare tigrfam --cache_dir ~/database
To check the micronota setup, you can run:
micronota info
It will print out the system info, databases available, external tools, and other configuration info.
There are three types of config files that users can set, for workflow, logging, and parameters, respectively. All three can be specified on the command line (--cfg, --log, and --param) to override the default settings. You can also put frequently used settings into global config files, which belong in the micronota config directory. On Linux it is usually ~/.config/micronota; on Mac, it is usually ~/Library/Application Support/micronota. You can also find the directory by running the following command:
python -c "import click; print(click.get_app_dir('micronota'))"
If the directory does not exist, simply create it with the mkdir command.
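As a minimal sketch of the lookup described above, the following uses only the standard library to build the two locations mentioned and create the directory if it is missing. The function names are hypothetical and not part of micronota; `click.get_app_dir` remains the authoritative lookup, since it also handles cases this sketch ignores (for example Windows or a custom `XDG_CONFIG_HOME`).

```python
import sys
from pathlib import Path


def micronota_config_dir(platform=sys.platform, home=None):
    """Guess the micronota config dir from the locations listed in this README.

    Hypothetical helper: only the Linux and Mac defaults are covered.
    """
    home = Path(home) if home is not None else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "micronota"
    return home / ".config" / "micronota"


def ensure_config_dir(path):
    """Create the directory (and any missing parents) if it does not exist."""
    path.mkdir(parents=True, exist_ok=True)
    return path


if __name__ == "__main__":
    print(micronota_config_dir())
```

Calling `ensure_config_dir(micronota_config_dir())` is the equivalent of running mkdir by hand.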
It is always good to confirm your settings by printing out the setup:
micronota info
# OR, if you provide the config files on the command line:
micronota --cfg misc.cfg --param param.cfg info
This is what the default workflow config looks like. You can copy and modify it to create your own config file, which you either put in the config dir (with the file name misc.cfg) or provide on the command line with --cfg. micronota reads only one workflow config file: the first one it finds in the order command line > global > default.
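The "first one found wins" rule can be sketched as follows. This is an illustration of the lookup order only, not micronota's actual implementation; the function name and its arguments are hypothetical.

```python
from pathlib import Path


def pick_workflow_config(cli_path, global_dir, default_path):
    """Return the first existing config: command line > global > default.

    Hypothetical helper; `cli_path` may be None when --cfg is not given.
    """
    candidates = []
    if cli_path is not None:
        candidates.append(Path(cli_path))
    # The global workflow config is named misc.cfg inside the config dir.
    candidates.append(Path(global_dir) / "misc.cfg")
    candidates.append(Path(default_path))
    for path in candidates:
        if path.is_file():
            return path
    return None
```

Because only one file is used, settings are not merged across levels: a global misc.cfg replaces the default workflow config entirely.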
Here is a modified example that you can use on the command line or in the global config dir:
[general]
# use another dir as database dir
db_dir = better/db

[feature]
# run prodigal first
prodigal = 1
# don't run infernal
infernal = 0

# next to annotate CDS
[cds]
# run diamond together with uniref50
diamond = uniref50
The format of the config file is widely used across OS platforms and is described here. 0 / 1, no / yes, false / true, and off / on can all be used to turn each tool off or on. If a tool needs a database file to run, specify the database instead of the on/off indicator.
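To illustrate how such values can be interpreted, here is a minimal sketch using Python's standard `configparser`, whose `getboolean` accepts exactly the indicator pairs listed above. The helper `tool_setting` is hypothetical and only shows one way to distinguish an on/off indicator from a database name; micronota's own parsing may differ.

```python
import configparser

# The example workflow config from above, without comments.
EXAMPLE = """\
[general]
db_dir = better/db

[feature]
prodigal = 1
infernal = 0

[cds]
diamond = uniref50
"""


def tool_setting(parser, section, tool):
    """Return True/False for an on/off indicator, else the database name."""
    try:
        return parser.getboolean(section, tool)
    except ValueError:
        # Not a recognized indicator, so treat the value as a database name.
        return parser.get(section, tool)


parser = configparser.ConfigParser()
parser.read_string(EXAMPLE)
for section in ("feature", "cds"):
    for tool in parser[section]:
        print(section, tool, tool_setting(parser, section, tool))
```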
This is what the default logging config looks like. It configures the logging utility to print out useful info. You can change the logging config the same way as the workflow config. If you plan to define a global logging config, name the file log.cfg and put it in the config dir.
For example, if you want to reduce the verbosity of logging, you can change level to ERROR in your global logging config file:
[loggers]
keys=root

[handlers]
keys=consoleHandler

[formatters]
keys=simpleFormatter

[logger_root]
level=ERROR
handlers=consoleHandler

[handler_consoleHandler]
class=StreamHandler
formatter=simpleFormatter
args=(sys.stdout,)

[formatter_simpleFormatter]
format=%(asctime)s %(name)s %(levelname)s %(message)s
datefmt=%Y-%m-%d %H:%M:%S
The parameter config is used to tune the parameters of each external tool. This is what the default parameter config looks like. You can specify the parameters for each individual tool. For example, if you want to run Prodigal with genetic translation table 1 instead of the default translation table, you can create a file param.cfg:
[prodigal]
# set translation table to 1
-t = 1
Unlike the other two config files, all param config files are read by micronota, in the order default, global, then command line, with each later file overriding the ones before it.
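The layering can be sketched with the standard `configparser`, which overrides earlier values when more sources are read into the same parser. This is an illustration only: the default and global values below, and the `[hypothetical_tool]` section, are invented for the example, and micronota's real defaults may differ.

```python
import configparser


def merge_param_configs(*layers):
    """Read config texts in order; later layers override earlier ones."""
    parser = configparser.ConfigParser()
    # Keep option names such as "-t" exactly as written (no lowercasing).
    parser.optionxform = str
    for text in layers:
        parser.read_string(text)
    return parser


# Hypothetical default and global layers, for illustration only.
default_layer = "[prodigal]\n-t = 11\n\n[hypothetical_tool]\n--flag = value\n"
global_layer = "[prodigal]\n-t = 4\n"
# The param.cfg from the example above, given on the command line.
cli_layer = "[prodigal]\n-t = 1\n"

merged = merge_param_configs(default_layer, global_layer, cli_layer)
print(merged["prodigal"]["-t"])
print(merged["hypothetical_tool"]["--flag"])
```

Settings that no later file touches, like the hypothetical section here, survive from the earlier layers, which is the difference from the workflow and logging configs where only one file is read.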
Features | Supported | Tools |
---|---|---|
coding gene | yes | Prodigal |
tRNA | ongoing | Aragorn |
ncRNA | yes | Infernal |
CRISPR | ongoing | MinCED |
ribosomal binding sites | ongoing | RBSFinder |
prophage | ongoing | PHAST |
replication origin | todo | Ori-Finder 1 (bacteria) & Ori-Finder 2 (archaea) |
microsatellites | todo | nhmmer? |
signal peptide | ongoing | SignalP |
transmembrane proteins | ongoing | TMHMM |
Databases | Supported |
---|---|
TIGRFAM | yes |
UniRef | yes |
Rfam | ongoing |
To get help with micronota, use the micronota tag on Biostars. The developers monitor this tag regularly.
If you’re interested in getting involved in micronota development, see CONTRIBUTING.md.
See the list of micronota’s contributors.
micronota is available under the new BSD license. See COPYING.txt for micronota’s license, and the licenses directory for the licenses of third-party software and databases that are (either partially or entirely) distributed with micronota.