The manifold hypothesis states that data points in a high-dimensional space actually lie on or near a lower-dimensional manifold of some dimension $n$.
Our project provides a method to estimate such $n$. We plan to create a new VAE-based evaluation metric with an intuitive interpretation, as well as other metrics based on specific datasets.
In order to identify the active subspace of a function $f$, proceed as follows:
- Choose $m$, the number of Monte Carlo estimations. The larger $m$ is, the more accurate the result.
- Draw samples $\{x_i\}_{i=1}^m$ from $X$ (according to some prior probability density function).
- For each $x_i$, compute $\nabla f(x_i)$.
- Compute the SVD of the matrix: $G := \frac{1}{\sqrt{m}} \left[ \nabla f(x_1) \; \cdots \; \nabla f(x_m) \right] \approx U \Sigma V^*$.
- Estimate the rank $r$ of $G \approx U_r \Sigma_r V_r^*$. The rank $r$ of the matrix $G$ is the dimensionality of the active subspace.
- Low-dimensional vectors are estimated as $x_{\mathrm{AS}} = U_r^* x$.
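
The steps above can be sketched in Python with NumPy. This is a minimal sketch, not the project's implementation. It assumes the gradient matrix is formed as $G = \frac{1}{\sqrt{m}}[\nabla f(x_1) \cdots \nabla f(x_m)]$ and that the rank is estimated by counting singular values above a relative threshold. The function name `estimate_active_subspace`, the parameter `rank_tol`, the standard normal prior, and the toy function $f$ are all hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_active_subspace(grad_f, samples, rank_tol=1e-8):
    """Estimate an active subspace from gradient samples (illustrative sketch).

    grad_f   -- callable returning the gradient of f at a point, shape (d,)
    samples  -- array of shape (m, d) holding the points x_1, ..., x_m
    rank_tol -- assumed relative threshold for counting nonzero singular values
    """
    samples = np.asarray(samples, dtype=float)
    m = samples.shape[0]
    # Columns of G are the gradients, scaled by 1 / sqrt(m).
    G = np.column_stack([grad_f(x) for x in samples]) / np.sqrt(m)
    # Thin SVD: G = U diag(sigma) V^*.
    U, sigma, _ = np.linalg.svd(G, full_matrices=False)
    # Numerical rank: singular values above rank_tol times the largest one.
    r = int(np.sum(sigma > rank_tol * sigma[0]))
    return U[:, :r], sigma, r


# Toy example (assumption): f(x) = sin(a.x) + (b.x)^2 varies only along a and b,
# so its gradients span a two-dimensional subspace of R^5.
a = np.array([1.0, 2.0, 0.0, 0.0, 0.0])
b = np.array([0.0, 0.0, 1.0, 0.0, 0.0])


def grad_f(x):
    return np.cos(a @ x) * a + 2.0 * (b @ x) * b


rng = np.random.default_rng(0)
m, d = 200, 5
X = rng.standard_normal((m, d))  # samples drawn from an assumed standard normal prior

U_r, sigma, r = estimate_active_subspace(grad_f, X)
x_as = X @ U_r  # row i is U_r^T x_i, the low-dimensional representation of x_i
```

A fixed threshold is only one way to pick $r$. In practice the gap between consecutive singular values is often inspected instead, since noisy gradients rarely give an exactly low-rank $G$.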
For further details, see the book "Active Subspaces: Emerging Ideas in Dimension Reduction for Parameter Studies" (2015) by Paul Constantine.