Skip to content

Commit

Permalink
add local and component-wise prediction rigidities
Browse files Browse the repository at this point in the history
* implemented the LPR and (L)CPR into the metrics module and wrote up the
documentation and tests.
  • Loading branch information
SanggyuChong committed Aug 21, 2023
1 parent 4fa31a6 commit de30917
Show file tree
Hide file tree
Showing 5 changed files with 311 additions and 21 deletions.
4 changes: 2 additions & 2 deletions docs/src/getting-started.rst
Original file line number Diff line number Diff line change
Expand Up @@ -25,8 +25,8 @@ Notebook Examples

.. _getting_started-reconstruction:

Reconstruction Measures
-----------------------
Metrics
-------

.. automodule:: skmatter.metrics
:noindex:
Expand Down
18 changes: 16 additions & 2 deletions docs/src/references/metrics.rst
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Reconstruction Measures
=======================
Metrics
=======

.. automodule:: skmatter.metrics

Expand All @@ -26,3 +26,17 @@ Local Reconstruction Error

.. autofunction:: skmatter.metrics.pointwise_local_reconstruction_error
.. autofunction:: skmatter.metrics.local_reconstruction_error

.. _LPR-api:

Local Prediction Rigidity
-------------------------

.. autofunction:: skmatter.metrics.local_prediction_rigidity

.. _CPR-api:

Component-wise Prediction Rigidity
----------------------------------

.. autofunction:: skmatter.metrics.componentwise_prediction_rigidity
57 changes: 40 additions & 17 deletions src/skmatter/metrics/__init__.py
Original file line number Diff line number Diff line change
@@ -1,26 +1,42 @@
"""
This module contains a set of easily-interpretable error measures of the relative
information capacity of feature space `F` with respect to feature space `F'`. The
methods returns a value between 0 and 1, where 0 means that `F` and `F'` are completey
distinct in terms of linearly-decodable information, and where 1 means that `F'` is
contained in `F`. All methods are implemented as the root mean-square error for the
regression of the feature matrix `X_F'` (or sometimes called `Y` in the doc) from `X_F`
(or sometimes called `X` in the doc) for transformations with different constraints
(linear, orthogonal, locally-linear). By default a custom 2-fold cross-validation
:py:class:`skosmo.linear_model.RidgeRegression2FoldCV` is used to ensure the
generalization of the transformation and efficiency of the computation, since we deal
with a multi-target regression problem. Methods were applied to compare different forms
of featurizations through different hyperparameters and induced metrics and kernels
[Goscinski2021]_ .
This module contains a set of metrics that can be used for an enhanced
understanding of your machine learning model.
First are the easily-interpretable error measures of the relative information
capacity of feature space `F` with respect to feature space `F'`. The methods
returns a value between 0 and 1, where 0 means that `F` and `F'` are completey
distinct in terms of linearly-decodable information, and where 1 means that `F'`
is contained in `F`. All methods are implemented as the root mean-square error
for the regression of the feature matrix `X_F'` (or sometimes called `Y` in the
doc) from `X_F` (or sometimes called `X` in the doc) for transformations with
different constraints (linear, orthogonal, locally-linear). By default a custom
2-fold cross-validation :py:class:`skosmo.linear_model.RidgeRegression2FoldCV`
is used to ensure the generalization of the transformation and efficiency of the
computation, since we deal with a multi-target regression problem. Methods were
applied to compare different forms of featurizations through different
hyperparameters and induced metrics and kernels [Goscinski2021]_ .
These reconstruction measures are available:
* :ref:`GRE-api` (GRE) computes the amount of linearly-decodable information
recovered through a global linear reconstruction.
* :ref:`GRD-api` (GRD) computes the amount of distortion contained in a global linear
reconstruction.
* :ref:`LRE-api` (LRE) computes the amount of decodable information recovered through
a local linear reconstruction for the k-nearest neighborhood of each sample.
* :ref:`GRD-api` (GRD) computes the amount of distortion contained in a global
linear reconstruction.
* :ref:`LRE-api` (LRE) computes the amount of decodable information recovered
through a local linear reconstruction for the k-nearest neighborhood of each
sample.
Next, we offer a set of prediction rigidity metrics, which can be used to
quantify the robustness of the local or component-wise predictions that the
machine learning model has been trained to make, based on the training dataset
composition.
These prediction rigidities are available:
* :ref:`LPR-api` (LPR) computes the local prediction rigidity of a linear or
kernel model.
* :ref:`CPR-api` (CPR) computes the component-wise prediction rigidity of a
linear or kernel model.
"""

from ._reconstruction_measures import (
Expand All @@ -34,6 +50,11 @@
pointwise_local_reconstruction_error,
)

from ._prediction_rigidities import (
local_prediction_rigidity,
componentwise_prediction_rigidity,
)

__all__ = [
"pointwise_global_reconstruction_error",
"global_reconstruction_error",
Expand All @@ -43,4 +64,6 @@
"local_reconstruction_error",
"check_global_reconstruction_measures_input",
"check_local_reconstruction_measures_input",
"local_prediction_rigidity",
"componentwise_prediction_rigidity",
]
208 changes: 208 additions & 0 deletions src/skmatter/metrics/_prediction_rigidities.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,208 @@
import numpy as np


def local_prediction_rigidity(X_train, X_test, alpha):
r"""Computes the local prediction rigidity (LPR) of a linear or kernel model
trained on a training dataset provided as input, on the local environments
in the test set provided as a separate input. LPR is defined as follows:
.. math::
LPR_{i} = \frac{1}{X_i (X^{T} X + \lambda I)^{-1} X_i^{T}}
The function assumes that the model training is undertaken in a manner where
the global prediction targets are averaged over the number of atoms
appearing in each training structure, and the average feature vector of each
structure is hence used in the regression. This ensures that (1)
regularization strength across structures with different number of atoms is
kept constant per structure during model training, and (2) range of
resulting LPR values are loosely kept between 0 and 1 for the ease of
interpretation. This requires the user to provide the regularizer value that
results from such training procedure. To guarantee valid comparison in the
LPR across different models, feature vectors are scaled by a global factor
based on standard deviation across atomic envs.
If the model is a kernel model, K_train and K_test can be provided in lieu
of X_train and X_test, alnog with the appropriate regularizer for the
trained model.
Parameters
----------
X_train : list of ndarray of shape (n_atoms, n_features)
Training dataset where each training set structure is stored as a
separate ndarray.
X_test : list of ndarray of shape (n_atoms, n_features)
Test dataset where each training set structure is stored as a separate
ndarray.
alpha : float
Regularizer value that the linear/kernel model has been optimized to.
Returns
-------
LPR : list of array of shape (n_atoms)
Local prediction rigidity (LPR) of the test set structures. LPR is
separately stored for each test structure, and hence list length =
n_test_strucs.
rank_diff : int
integer value of the difference between cov matrix dimension and rank
"""

# initialize a StandardFlexibleScaler and fit to train set atom envs
X_atom = np.vstack(X_train)
sfactor = np.sqrt(np.mean(X_atom**2, axis=0).sum())

# prep to build covariance matrix XX, take average feat vecs per struc
X_struc = []
for X_i in X_train:
X_struc.append(np.mean(X_i / sfactor, axis=0))
X_struc = np.vstack(X_struc)

# build XX and obtain Xinv for LPR calculation
XX = X_struc.T @ X_struc
Xprime = XX + alpha * np.eye(XX.shape[0])
rank_diff = X_struc.shape[1] - np.linalg.matrix_rank(Xprime)
Xinv = np.linalg.pinv(Xprime)

# track test set atom indices for output
lens = []
for X in X_test:
lens.append(len(X))
test_idxs = np.cumsum([0] + lens)

# prep and compute LPR
num_test = len(X_test)
X_test = np.vstack(X_test)
atom_count = X_test.shape[0]
LPR_np = np.zeros(X_test.shape[0])

for ai in range(atom_count):
Xi = X_test[ai].reshape(1, -1) / sfactor
LPR_np[ai] = 1 / (Xi @ Xinv @ Xi.T)

# separately store LPR by test struc
LPR = []
for i in range(num_test):
LPR.append(LPR_np[test_idxs[i] : test_idxs[i + 1]])

return LPR, rank_diff


def componentwise_prediction_rigidity(X_train, X_test, alpha, comp_dims):
r"""Computes the component-wise prediction rigidity (CPR) and the local CPR
(LCPR) of a linear or kernel model trained on a training dataset provided as
input, on the local environments in the test set provided as a separate
input. CPR and LCPR are defined as follows:
.. math::
CPR_{A,c} = \frac{1}{X_{A,c} (X^{T} X + \lambda I)^{-1} X_{A,c}^{T}}
.. math::
LCPR_{i,c} = \frac{1}{X_{i,c} (X^{T} X + \lambda I)^{-1} X_{i,c}^{T}}
The function assumes that the feature vectors for the local environments and
structures are built by concatenating the descriptors of different
prediction components together. It also assumes, like the case of LPR, that
model training is undertaken in a manner where the global prediction targets
are averaged over the number of atoms appearing in each training structure,
and the average feature vector of each structure is hence used in the
regression. Likewise, to guarantee valid comparison in the (L)CPR across
different models, feature vectors are scaled by a global factor based on
standard deviation across atomic envs.
If the model is a kernel model, K_train and K_test can be provided in lieu
of X_train and X_test, alnog with the appropriate regularizer for the
trained model. However, in computing the kernels, one must strictly keep the
different components separate, and compute separate kernel blocks for
different prediction components.
Parameters
----------
X_train : list of ndarray of shape (n_atoms, n_features)
Training dataset where each training set structure is stored as a
separate ndarray.
X_test : list of ndarray of shape (n_atoms, n_features)
Test dataset where each training set structure is stored as a separate
ndarray.
alpha : float
Regularizer value that the linear/kernel model has been optimized to.
comp_dims : array of int values
Dimensions of the feature vectors pertaining to each prediction
component.
Returns
-------
CPR : ndarray of shape (n_test_strucs, n_comps)
Component-wise prediction rigidity computed for each prediction
component, pertaining to the entire test structure.
LCPR : list of ndarrays of shape (n_atoms, n_comps)
Local component-wise prediction rigidity of the test set structures.
Values are separately stored for each test structure, and hence list
length = n_test_strucs
rank_diff : int
value of the difference between cov matrix dimension and rank
"""

# initialize a StandardFlexibleScaler and fit to train set atom envs
X_atom = np.vstack(X_train)
sfactor = np.sqrt(np.mean(X_atom**2, axis=0).sum())

# prep to build covariance matrix XX, take average feat vecs per struc
X_struc = []
for X_i in X_train:
X_struc.append(np.mean(X_i / sfactor, axis=0))
X_struc = np.vstack(X_struc)

# build XX and obtain Xinv for LPR calculation
XX = X_struc.T @ X_struc
Xprime = XX + alpha * np.eye(XX.shape[0])
rank_diff = X_struc.shape[1] - np.linalg.matrix_rank(Xprime)
Xinv = np.linalg.pinv(Xprime)

# track test set atom indices for output
lens = []
for X in X_test:
lens.append(len(X))
test_idxs = np.cumsum([0] + lens)

# get struc average feat vecs for test set
X_struc_test = []
for X_i in X_test:
X_struc_test.append(np.mean(X_i / sfactor, axis=0))
X_struc_test = np.vstack(X_struc_test)

# prep and compute CPR and LCPR
num_test = len(X_test)
num_comp = len(comp_dims)
comp_idxs = np.cumsum([0] + comp_dims.tolist())
X_test = np.vstack(X_test)
atom_count = X_test.shape[0]
CPR = np.zeros((num_test, num_comp))
LCPR_np = np.zeros((atom_count, num_comp))

for ci in range(num_comp):
tot_comp_idx = np.arange(comp_dims.sum())
mask = (
(tot_comp_idx >= comp_idxs[ci]) & (tot_comp_idx < comp_idxs[ci + 1])
).astype(float)

for ai in range(atom_count):
Xic = np.multiply(X_test[ai].reshape(1, -1) / sfactor, mask)
LCPR_np[ai, ci] = 1 / (Xic @ Xinv @ Xic.T)

for si in range(len(X_struc_test)):
XAc = np.multiply(X_struc_test[si].reshape(1, -1), mask)
CPR[si, ci] = 1 / (XAc @ Xinv @ XAc.T)

# separately store LCPR by test struc
LCPR = []
for i in range(num_test):
LCPR.append(LCPR_np[test_idxs[i] : test_idxs[i + 1]])

return CPR, LCPR, rank_diff
45 changes: 45 additions & 0 deletions tests/test_metrics.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,14 +4,59 @@
from sklearn.datasets import load_iris
from sklearn.utils import check_random_state, extmath

from skmatter.datasets import load_degenerate_CH4_manifold
from skmatter.metrics import (
componentwise_prediction_rigidity,
global_reconstruction_distortion,
global_reconstruction_error,
local_prediction_rigidity,
local_reconstruction_error,
pointwise_local_reconstruction_error,
)


class PredictionRigidityTests(unittest.TestCase):
@classmethod
def setUpClass(cls):
soap_features = load_degenerate_CH4_manifold().data["SOAP_power_spectrum"]
soap_features = soap_features[:11]
# each structure in CH4 has 5 environmental feature, because there are 5 atoms
# per structure and each atom is one environment
cls.features = [
soap_features[i * 5 : (i + 1) * 5] for i in range(len(soap_features) // 5)
]
# add a single environment structure to check value
cls.features = cls.features + [soap_features[-1:]]
cls.alpha = 1e-8
bi_features = load_degenerate_CH4_manifold().data["SOAP_bispectrum"]
bi_features = bi_features[:11]
comp_features = np.column_stack([soap_features, bi_features])
cls.comp_features = [
comp_features[i * 5 : (i + 1) * 5] for i in range(len(comp_features) // 5)
]
cls.comp_dims = np.array([soap_features.shape[1], bi_features.shape[1]])

def test_local_prediction_rigidity(self):
LPR, rank_diff = local_prediction_rigidity(
self.features, self.features, self.alpha
)
self.assertTrue(
LPR[-1] >= 1,
f"LPR of the single environment structure is incorrectly lower than 1:"
f"LPR = {LPR[-1]}",
)
self.assertTrue(
rank_diff == 0,
f"LPR Covariance matrix rank is not full, with a difference of:"
f"{rank_diff}",
)

def test_componentwise_prediction_rigidity(self):
_CPR, _LCPR, _rank_diff = componentwise_prediction_rigidity(
self.comp_features, self.comp_features, self.alpha, self.comp_dims
)


class ReconstructionMeasuresTests(unittest.TestCase):
@classmethod
def setUpClass(cls):
Expand Down

0 comments on commit de30917

Please sign in to comment.