Skip to content
This repository has been archived by the owner on Mar 2, 2023. It is now read-only.

Latest commit

 

History

History
88 lines (53 loc) · 3.43 KB

README.org

File metadata and controls

88 lines (53 loc) · 3.43 KB

Diffusion Maps and Geometric Harmonics for Python

Overview

The diffusion-maps library for Python provides a fast and accurate implementation of diffusion maps[fn:1] and geometric harmonics[fn:2]. Its speed stems from the use of sparse linear algebra and (optionally) graphics processing units to accelerate computations. The included code routinely solves eigenvalue problems 3 x faster than SciPy using GPUs on matrices with over 200 million non-zero entries.

The package includes a command-line utility for the quick calculation of diffusion maps on data sets.

Some of the features of the diffusion-maps module include:

  • Fast evaluation of distance matrices using nearest neighbors.
  • Fast and accurate computation of eigenvalue/eigenvector pairs using sparse linear algebra.
  • Optional GPU-accelerated sparse linear algebra routines.
  • Optional interface to the ARPACK-NG library.
  • Simple and easily modifiable code.

[fn:1] Coifman, R. R., & Lafon, S. (2006). Diffusion maps. Applied and Computational Harmonic Analysis, 21(1), 5–30. http://doi.org/10.1016/j.acha.2006.04.006

[fn:2] Coifman, R. R., & Lafon, S. (2006). Geometric harmonics: A novel tool for multiscale out-of-sample extension of empirical functions. Applied and Computational Harmonic Analysis, 21(1), 31–52. http://doi.org/10.1016/j.acha.2005.07.005

./geometric-harmonics.png

Prerequisites

The library is implemented in Python 3.5+ and uses NumPy and SciPy. It is recommended to install PyCUDA to enable the GPU-accelerated eigenvalue solver.

The diffusion-maps command can display the resulting diffusion maps using Matplotlib if it is available.

Installation

Use python setup.py install to install on your system or python setup.py install --user for a user-specific installation.

Command-line utility

The diffusion-maps command reads data sets stored in NPY, CSV, or MATLAB’s MAT format. The simplest way to use it is to invoke it as follows:

diffusion-maps DATA-SET.NPY EPSILON-VALUE

There exist parameters to save and visualize different types of results, to specify how many eigenvalue/eigenvector pairs to compute, etc. See the help page displayed by:

diffusion-maps --help

Additional documentation

Sphinx-based API documentation is available in the doc/ folder. Run

make -C doc html

to build the documentation.

License

This code is released under the MIT license. See LICENSE for details.

Citation

If you use this code in publications, please cite it as:

Acknowledgments

The diffusion-maps library has originally been written by Juan M. Bello-Rivas.

Others have further contributed to diffusion-maps by reporting problems, suggesting various improvements, or submitting actual code. Here is a list of these people. Help me keep it complete and exempt of errors.

  • Felix Dietrich,
  • Mahdi Kooshkbaghi,
  • Daniel Lehmberg,
  • Philipp Schuegraf