Drugo is a database composed primarily of drug molecules, designed to serve as a benchmark dataset for predicting sites of metabolism. The project's current goal is to provide a database that records molecular structures as SMILES strings along with structurally assigned literature references. Construction currently prioritizes the Cytochrome P450 (CYP) 3A4 substrate database.
If you use the Drugo database in academic work, please cite the following original paper:
@article{JChemInfModel2013,
author = {Zaretzki, Jed and Matlock, Matthew and Swamidass, S. Joshua},
title = {{XenoSite}: Accurately Predicting {CYP}-Mediated Sites of Metabolism with Neural Networks},
journal = {Journal of Chemical Information and Modeling},
volume = {53},
number = {12},
pages = {3373--3383},
year = {2013},
doi = {10.1021/ci400518g}
}