Geun Ho Gu edited this page Dec 8, 2016 · 19 revisions

Welcome to the VlachosGroupAdditivity wiki!

This package loads a group additivity scheme of your choice and estimates the thermodynamic properties of molecules. Molecules are described by SMILES strings, and RDKit must be installed to use the package.

A simple example using Benson's group additivity scheme:

In:
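from VGA.GroupAdd.Library import GroupLibrary
import VGA.ThermoChem  # required for the 'thermochem' estimator used below
# Load the Benson scheme and break ethylene oxide into groups
lib = GroupLibrary.Load('benson')
descriptors = lib.GetDescriptors('C1CO1')
print(descriptors)
# Build the thermochemistry object, then evaluate the enthalpy at 298.15 K
thermochem = lib.Estimate(descriptors,'thermochem')
print(thermochem.eval_ND_H(298.15))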

Out:
defaultdict(<type 'int'>, {'O(C)2': 1, 'C(C)(H)2(O)': 2, 'C1CO1': 1})
-19.9132141547
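
The descriptor counts printed above are what a group additivity estimate sums over: each group or correction contributes its tabulated value once per occurrence. The sketch below illustrates only that sum. The `contributions` values are placeholders chosen for easy arithmetic, not Benson data, and `estimate` is a hypothetical helper, not part of VGA.

```python
from collections import defaultdict

# Descriptor counts in the same shape as the output above.
descriptors = defaultdict(int, {'O(C)2': 1, 'C(C)(H)2(O)': 2, 'C1CO1': 1})

# Placeholder per-group contributions (NOT real Benson values).
contributions = {'O(C)2': 1.0, 'C(C)(H)2(O)': 2.0, 'C1CO1': 3.0}


def estimate(descriptors, contributions):
    """Return the count-weighted sum of group contributions."""
    return sum(count * contributions[group]
               for group, count in descriptors.items())


total = estimate(descriptors, contributions)
print(total)  # 1*1.0 + 2*2.0 + 1*3.0 = 8.0
```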

Using the package

Using the package takes three steps:

  • Load the scheme
  • Get the groups for a given SMILES string
  • Estimate the thermodynamic properties

Using the package requires importing not only the library but also VGA.ThermoChem:

from VGA.GroupAdd.Library import GroupLibrary
import VGA.ThermoChem

Importing VGA.ThermoChem adds a ThermoChem module to the group additivity library module, which uses it to estimate thermodynamic properties. The code will not work unless VGA.ThermoChem is imported (see below).
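
One common way to get this "import to enable" behavior is a registry that the add-on module fills in when it is imported. The sketch below shows the general pattern only. `Library`, `register_property_set`, and `ThermoChemSketch` are hypothetical names, and the source does not show how VGA actually implements the registration.

```python
class Library(object):
    # Maps a property-set name to the class that estimates it.
    property_sets = {}

    @classmethod
    def register_property_set(cls, name, estimator):
        cls.property_sets[name] = estimator

    def estimate(self, descriptors, name):
        if name not in self.property_sets:
            raise KeyError('no estimator registered for %r' % name)
        return self.property_sets[name](descriptors)


class ThermoChemSketch(object):
    """Stand-in for an estimator class; it only stores the descriptors."""

    def __init__(self, descriptors):
        self.descriptors = descriptors


lib = Library()

# Before registration, asking for 'thermochem' fails.
try:
    lib.estimate({}, 'thermochem')
    failed_before_registration = False
except KeyError:
    failed_before_registration = True

# In a package, this call would sit at module level in the add-on module,
# so that importing the module runs it.
Library.register_property_set('thermochem', ThermoChemSketch)
result = lib.estimate({'C(C)(H)3': 2}, 'thermochem')
```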

Next, we load up the scheme:

lib = GroupLibrary.Load('benson')

Group additivity schemes are stored in the ./VlachosGroupAdditivity/data folder. Loading 'benson' as above reads all the scheme information in the ./VlachosGroupAdditivity/data/benson folder. Currently, only the classic Benson gas-phase group additivity scheme and the Salciccioli et al. surface group additivity scheme are included. More schemes will be added. See the Scheme section below if you are interested in creating your own scheme.

Then we get the descriptors and estimate the thermodynamic properties:

descriptors = lib.GetDescriptors('C1CO1')
thermochem = lib.Estimate(descriptors,'thermochem')
print(thermochem.eval_ND_H(298.15))

GetDescriptors breaks a given molecular graph down into groups and corrections. The library object can then estimate thermochemical properties. When VGA.ThermoChem is imported, the ThermoChem object is saved within the group additivity library module. Passing 'thermochem' as the second argument asks the software to use the ThermoChem object for property estimation. The resulting object can then be used to evaluate thermochemical properties at a given temperature. Besides eval_ND_H, it provides eval_ND_Cp and eval_ND_S to estimate heat capacity and entropy.
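
The `ND` prefix presumably stands for non-dimensional, meaning that eval_ND_H would return H/(RT). That reading is an assumption based on the method name and is not confirmed by the source. Under it, the value from the example converts to a molar enthalpy as sketched below. `dimensional_enthalpy` is a hypothetical helper, not part of VGA.

```python
R = 8.314462618  # molar gas constant in J/(mol K)


def dimensional_enthalpy(nd_h, temperature):
    """Convert H/(RT) to kJ/mol, assuming nd_h is non-dimensional."""
    return nd_h * R * temperature / 1000.0


# Value printed by the example above, evaluated at 298.15 K.
h_kj_per_mol = dimensional_enthalpy(-19.9132141547, 298.15)
print(round(h_kj_per_mol, 1))  # about -49.4 kJ/mol under this assumption
```

By the same assumption, eval_ND_Cp and eval_ND_S would return Cp/R and S/R, so multiplying by R alone would recover dimensional values.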

Scheme

This section explains the structure of the schemes stored in the ./VlachosGroupAdditivity/data folder and how they are loaded. When VGA.GroupAdd.Library.Load is called, it first looks for a folder with a matching name in the scheme folder. If none is found, it treats the argument as a path to a folder that contains the scheme information. It then reads the library.yaml and scheme.yaml files in that folder.
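
The lookup order described above can be sketched as follows. This is an illustration of the described behavior under stated assumptions, not VGA's code: `resolve_scheme_dir` and `scheme_files` are hypothetical helpers, and the error type is a guess.

```python
import os


def resolve_scheme_dir(name, data_dir):
    """Prefer a bundled scheme folder named `name`; else treat `name` as a path."""
    bundled = os.path.join(data_dir, name)
    if os.path.isdir(bundled):
        return bundled
    if os.path.isdir(name):
        return name
    raise IOError('no scheme folder found for %r' % name)


def scheme_files(scheme_dir):
    """Paths of the two files read from a scheme folder."""
    return [os.path.join(scheme_dir, filename)
            for filename in ('library.yaml', 'scheme.yaml')]
```

With this sketch, `resolve_scheme_dir('benson', './VlachosGroupAdditivity/data')` would return the bundled folder when it exists, while a path to your own scheme folder would be returned unchanged.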

library.yaml is a data file in YAML format that contains the group thermochemistry information and often has an "include" list. VGA.GroupAdd.Library.Load uses the yaml_io module, which detects this include list, goes through the other YAML files named in it, and builds the scheme object.

library.yaml has two components:

  • scheme: where the corresponding scheme is located
  • contents: a list of groups and their thermochemistry values. See the existing schemes for the format.
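
Following an "include" list amounts to a recursive walk over documents. The sketch below uses in-memory dictionaries in place of parsed YAML files so that it runs without a YAML parser. `load_with_includes`, the included file names, and the cycle guard are all assumptions for illustration; the real yaml_io module may behave differently.

```python
def load_with_includes(name, read_document, seen=None):
    """Collect `name` and every document reachable through "include" lists."""
    seen = set() if seen is None else seen
    if name in seen:  # skip documents already loaded (also stops cycles)
        return []
    seen.add(name)
    document = read_document(name)
    loaded = [document]
    for included in document.get('include', []):
        loaded.extend(load_with_includes(included, read_document, seen))
    return loaded


# Stand-ins for parsed YAML files (hypothetical names and contents).
documents = {
    'library.yaml': {'include': ['alkanes.yaml', 'ethers.yaml']},
    'alkanes.yaml': {'contents': ['C(C)(H)3']},
    'ethers.yaml': {'contents': ['O(C)2'], 'include': ['alkanes.yaml']},
}

loaded = load_with_includes('library.yaml', documents.__getitem__)
print(len(loaded))  # 3: each document is loaded once
```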

scheme.yaml contains four sections:

  • patterns: a list of pattern descriptions, written in the RING language, that classify atoms the way the scheme does.
  • pretreatment_rules: reaction queries, also written in the RING language, that the code is meant to apply to the molecular graph before finding groups. This functionality is not applied yet.
  • remaps: a list of group remappings. For example, the original group additivity scheme treats the C(C[d])(H)3 group the same as C(C)(H)3.
  • other_descriptors: a list of any other corrections that do not follow the group scheme.
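
As an illustration of what a remap does to the descriptor counts, the sketch below folds a remapped group into its target. The dictionary representation of `remaps` and the helper `apply_remaps` are assumptions; the actual YAML layout is not shown here.

```python
from collections import Counter


def apply_remaps(descriptors, remaps):
    """Fold remapped groups into their target group, preserving counts."""
    merged = Counter()
    for group, count in descriptors.items():
        merged[remaps.get(group, group)] += count
    return merged


# The remap named in the list above.
remaps = {'C(C[d])(H)3': 'C(C)(H)3'}

# Hypothetical raw counts before remapping.
found = {'C(C[d])(H)3': 1, 'C(C)(H)3': 2}
remapped = apply_remaps(found, remaps)
print(dict(remapped))  # {'C(C)(H)3': 3}
```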

Group assignment mechanism

The mechanism goes through all the patterns listed in scheme.yaml and assigns a sub-group to each atom in the molecule. For example, the sub-group 'C' is an sp3-hybridized carbon, so the corresponding pattern in scheme.yaml encodes this, and the code looks for any sp3-hybridized carbon.

For each atom in the molecule, the code assigns both a central sub-group and a peripheral sub-group. The difference is easiest to see with an example. Typically, the central sub-group of a group is a heavy (non-hydrogen) atom, e.g. carbon or oxygen, whereas hydrogen is never a central sub-group. Thus the code assigns a hydrogen atom the peripheral sub-group 'H' but no central sub-group.

Then the code goes through all the atoms that have a central sub-group, looks at the peripheral sub-groups of the connected atoms, and makes up a group. In the ethane example figure, the red circle marks the chosen central sub-group (CSG), and the red labels on the connected atoms mark the peripheral sub-groups (PSG) that complete the group.
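
The assembly step can be sketched on a plain bond list, without RDKit. Everything here is an assumption made for illustration: `build_groups` is a hypothetical helper, and the naming convention (peripheral sub-groups in alphabetical order, with a count suffix only when the count exceeds one) is inferred from names such as 'C(C)(H)2(O)' in the example output.

```python
from collections import Counter


def build_groups(central, peripheral, bonds):
    """Compose one group name per atom that has a central sub-group."""
    neighbors = {atom: [] for atom in peripheral}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    groups = Counter()
    for atom, csg in central.items():
        if csg is None:  # e.g. hydrogen: peripheral sub-group only
            continue
        counts = Counter(peripheral[n] for n in neighbors[atom])
        name = csg + ''.join(
            '(%s)%s' % (psg, n if n > 1 else '')
            for psg, n in sorted(counts.items()))
        groups[name] += 1
    return groups


# Ethane: atoms 0 and 1 are carbons, atoms 2 to 7 are hydrogens.
central = {0: 'C', 1: 'C', 2: None, 3: None, 4: None, 5: None, 6: None, 7: None}
peripheral = {0: 'C', 1: 'C', 2: 'H', 3: 'H', 4: 'H', 5: 'H', 6: 'H', 7: 'H'}
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)]

ethane_groups = build_groups(central, peripheral, bonds)
print(dict(ethane_groups))  # {'C(C)(H)3': 2}
```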

If an atom in a molecule matches several patterns, the code raises an error and reports which atom had multiple matching patterns. It also raises an error if an atom does not match any pattern.
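
The "exactly one match" rule can be sketched as below. The real patterns are written in the RING language; here they are replaced by plain Python predicates over a dictionary describing an atom. `assign_sub_group` and the `ValueError` are hypothetical choices, not VGA's API.

```python
def assign_sub_group(atom_index, atom, patterns):
    """Return the single matching sub-group, or raise if zero or several match."""
    matches = [name for name, predicate in patterns if predicate(atom)]
    if len(matches) != 1:
        reason = 'no pattern' if not matches else 'several patterns %s' % matches
        raise ValueError('atom %d matches %s' % (atom_index, reason))
    return matches[0]


# Hypothetical patterns expressed as predicates.
patterns = [
    ('C', lambda a: a['symbol'] == 'C' and a['hybridization'] == 'sp3'),
    ('O', lambda a: a['symbol'] == 'O'),
]

carbon = {'symbol': 'C', 'hybridization': 'sp3'}
print(assign_sub_group(0, carbon, patterns))  # C
```

Reporting the atom index in the error message mirrors the behavior described above, where the error says which atom caused the problem.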

Installing the latest version of RDKit on Ubuntu

(The apt repository provides an old 2013 version.)

Install the dependencies:

sudo apt-get install build-essential python-numpy cmake python-dev sqlite3 libsqlite3-dev libboost-dev libboost-system-dev libboost-thread-dev libboost-serialization-dev libboost-python-dev libboost-regex-dev

Install the Boost development libraries (for C++):

sudo apt-get install libboost-all-dev

Building

Download the latest release, then build it:

cd ~/<path>/rdkitreleasexxxxxxx/
mkdir build && cd ./build
cmake ..
make
make install

Then add the following to your .bashrc:

export RDBASE="/home/<path>/rdkitreleasexxxxxxx"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/<path>/rdkitreleasexxxxxxx/lib"
export PYTHONPATH="$PYTHONPATH:/home/<path>/rdkitreleasexxxxxxx"
export PATH="$PATH:/home/<path>/rdkitreleasexxxxxxx"
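
After opening a new shell so that the exports above take effect, a quick import check confirms whether Python can find the build. This is a minimal sketch: `rdkit_status` is a hypothetical helper, and the lookup of `__version__` falls back to 'unknown' in case a given release does not define it.

```python
import importlib


def rdkit_status():
    """Return the RDKit version string, or None if it cannot be imported."""
    try:
        rdkit = importlib.import_module('rdkit')
    except ImportError:
        return None
    return getattr(rdkit, '__version__', 'unknown')


status = rdkit_status()
print('RDKit not importable' if status is None else 'RDKit ' + status)
```

If the check reports that RDKit is not importable, the PYTHONPATH and LD_LIBRARY_PATH entries are the first things to verify.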