Predictive understanding of ecosystem response to change has become a pressing societal need in the Anthropocene, and it requires integration across disciplines, spatial scales, and timeframes. Developing a framework for understanding how different biological systems interact over time is a major challenge in biology. The National Science Foundation-funded EMergent Ecosystem Responses to ChanGE (EMERGE) Biology Integration Institute aims to develop such a framework by integrating research, training, and high-resolution field and laboratory measurements across 15 scientific subdisciplines (including ecology, physiology, genetics, biogeochemistry, remote sensing, and modeling) and 14 institutions, in order to understand ecosystem-climate feedbacks in Stordalen Mire, a thawing permafrost peatland in arctic Sweden. Rapid warming in the Arctic is driving permafrost thaw, making formerly frozen soil carbon newly available for cycling and release to the atmosphere; this release is a potentially large but poorly constrained accelerant of climate change. This material is based upon work supported by the National Science Foundation under Grant Number 2022070.
The table below lists tools that EMERGE members have developed to better understand and integrate these datasets.
Tool | Description | Developers | Citation |
---|---|---|---|
CoverM | Metagenomic coverage calculator / BAM file generator | Ben Woodcroft (CMR) | Aroney, S. T. N., Newell, R. J. P., Nissen, J., Camargo, A. P., Tyson, G. W., & Woodcroft, B. J. (2024). CoverM: Read alignment statistics for metagenomics (Version 0.7.0) [Computer software]. https://doi.org/10.5281/zenodo.10531253 |
RecurM | Homology-independent discovery of mobile genetic elements | Alex Chklovski (CMR) | |
Lorikeet | Microbial strain resolver, coverage calculator, variant caller, selective pressure calculator | Rhys Newell (CMR) | Newell, R. J. P., McMaster, E. S., Craig, P., Boden, M., Tyson, G. W., & Woodcroft, B. J. (2023). Lorikeet: strain-resolved metagenome analysis using local reassembly (Version 0.8.2) [Computer software]. https://doi.org/10.5281/zenodo.10275469 |
Rosella | Metagenomic binning and bin refinement tool | Rhys Newell (CMR) | Newell, R. J. P., Tyson, G. W., & Woodcroft, B. J. (2023). Rosella: Metagenomic binning using UMAP and HDBSCAN (Version 0.5.3) [Computer software]. https://doi.org/10.5281/zenodo.10460259 |
Aviary (incorporates SlamM) | Microbial genome recovery pipeline with novel methods for long/short read assembly | Rhys Newell (CMR) | Newell, R. J. P., Aroney, S. T. N., Zaugg, J., Sternes, P., Tyson, G. W., & Woodcroft, B. J. Aviary: Hybrid assembly and genome recovery from metagenomes. Zenodo. https://doi.org/10.5281/zenodo.10158086 |
Galah | Genome dereplication | Ben Woodcroft (CMR) | Aroney, S. T. N., Camargo, A. P., Tyson, G. W., & Woodcroft, B. J. (2024). Galah: More scalable dereplication for metagenome assembled genomes (Version 0.4.2) [Computer software]. https://doi.org/10.5281/zenodo.10526086 |
Kingfisher | Public sequence and metadata gatherer | Ben Woodcroft (CMR) | Woodcroft, B. J., Cunningham, M., Gans, J. D., Bolduc, B. B., & Hodgkins, S. B. (2024). Kingfisher: A utility for procurement of public sequencing data (Version 0.4.1) [Computer software]. https://doi.org/10.5281/zenodo.10525085 |
SingleM | De novo OTUs from shotgun metagenomes | Ben Woodcroft (CMR) | Ben J. Woodcroft, Samuel T. N. Aroney, Rossen Zhao, Mitchell Cunningham, Joshua A. M. Mitchell, Linda Blackall, Gene W. Tyson. SingleM and Sandpiper: Robust microbial taxonomic profiles from metagenomic data. bioRxiv 2024.01.30.578060; doi: https://doi.org/10.1101/2024.01.30.578060 |
GraftM | Meta-omic tool that identifies and classifies marker and functional genes | Ben Woodcroft (CMR) | Boyd, J.A., Woodcroft, B.J. and Tyson, G.W., 2018. GraftM: a tool for scalable, phylogenetically informed classification of genes within metagenomes. Nucleic Acids Research, 46(10), pp.e59-e59. https://doi.org/10.1093/nar/gky174 |
DRAM | Annotates MAGs and summarizes metabolic potential | Mikayla Borton (CSU), Mike Shaffer (CSU), Kelly Wrighton | Shaffer, M., Borton, M.A., McGivern, B.B., Zayed, A.A., La Rosa, S.L., Solden, L.M., Liu, P., Narrowe, A.B., Rodríguez-Ramos, J., Bolduc, B. and Gazitúa, M.C., 2020. DRAM for distilling microbial metabolism to automate the curation of microbiome function. Nucleic Acids Research, 48(16), pp.8883-8900. https://doi.org/10.1093/nar/gkaa621 <br> Shaffer, M., Borton, M.A., Bolduc, B., Faria, J.P., Flynn, R.M., Ghadermazi, P., Edirisinghe, J.N., Wood-Charlson, E.M., Miller, C.S., Chan, S.H.J. and Sullivan, M.B., 2023. kb_DRAM: annotation and metabolic profiling of genomes with DRAM in KBase. Bioinformatics, 39(4), p.btad110. https://doi.org/10.1093/bioinformatics/btad110 |
Phylogenetic Null Modeling | Partitions variation in phylogenetic data and attributes it to assembly processes | Stacey Doherty (UNH alum), Jessica Ernakovich (based on Stegen et al., 2013) | |
vConTACT2 | Classifies and clusters viral sequences into approx. genus groups | Ben Bolduc (OSU), Sullivan Lab | Bin Jang, H., Bolduc, B., Zablocki, O., Kuhn, J.H., Roux, S., Adriaenssens, E.M., Brister, J.R., Kropinski, A.M., Krupovic, M., Lavigne, R. and Turner, D., 2019. Taxonomic assignment of uncultivated prokaryotic virus genomes is enabled by gene-sharing networks. Nature biotechnology, 37(6), pp.632-639. https://doi.org/10.1038/s41587-019-0100-8 |
vConTACT3 | Classifies and clusters viral sequences into approx. genus groups (major rewrite of vConTACT2) | Ben Bolduc (OSU), Sullivan Lab | |
VirSorter1 | Identifies viral sequences in microbial and viral sequence data | Simon Roux (JGI), Sullivan Lab | Roux, S., Enault, F., Hurwitz, B.L. and Sullivan, M.B., 2015. VirSorter: mining viral signal from microbial genomic data. PeerJ, 3, p.e985. https://doi.org/10.7717/peerj.985 |
VirSorter2 | As VirSorter1, but uses ML and expands viral types detected | Jiarong Guo (OSU), Sullivan Lab | Guo, J., Bolduc, B., Zayed, A.A., Varsani, A., Dominguez-Huerta, G., Delmont, T.O., Pratama, A.A., Gazitúa, M.C., Vik, D., Sullivan, M.B. and Roux, S., 2021. VirSorter2: a multi-classifier, expert-guided approach to detect diverse DNA and RNA viruses. Microbiome, 9, pp.1-13. https://doi.org/10.1186/s40168-020-00990-y |
MetaPop | Calculates macro- and micro-diversity metrics | Ann Gregory (OSU alum), Sullivan Lab | Gregory, A.C., Gerhardt, K., Zhong, Z.P., Bolduc, B., Temperton, B., Konstantinidis, K.T. and Sullivan, M.B., 2022. MetaPop: a pipeline for macro-and microdiversity analyses and visualization of microbial and viral metagenome-derived populations. Microbiome, 10(1), p.49. https://doi.org/10.1186/s40168-022-01231-0 |
ecosys | Site- and landscape-scale process-based terrestrial ecosystem model | William Riley (LBL), Zhen Li (LBL), Robert Grant (Alberta) | Li, Z., Riley, W., Marschmann, G., Karaoz, U., Shirley, I., Wu, Q., Bouskill, N., Chang, K., Crill, P., Grant, R., King, E., Saleska, S., Sullivan, M., Tang, J., Varner, R., Woodcroft, B., Wrighton, K., and Brodie, E., 2024. A framework for integrating genomics, microbial traits, and ecosystem biogeochemistry. https://doi.org/10.21203/rs.3.rs-4966902/v1 |
CheckM2 | Microbial genome completeness and contamination estimation | Alex Chklovski (CMR) | Chklovski, A., Parks, D.H., Woodcroft, B.J. and Tyson, G.W., 2023. CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nature Methods, 20(8), pp.1203-1212. https://doi.org/10.1038/s41592-023-01940-w |