Introduce our work on the Relaxed Indicator Matrix (RIM) Manifold #65

Open · wants to merge 2 commits into master
Conversation

Yuan-Jinghui

Dear Nicolas Boumal,

Hello! It’s a pleasure to leave a message here. I would like to introduce our work on the Relaxed Indicator Matrix (RIM) Manifold and contribute our open-source code to Manopt.

In clustering and classification, indicator matrices are fundamental, but optimizing over them is NP-hard. To address this, the machine learning community has proposed various relaxation methods:

(1) Relaxation on the Stiefel manifold: F \in \{ X \mid X^T X = I \}, which leads to the well-known spectral clustering algorithm [1].
(2) Relaxation on the single stochastic manifold: F \in \{ X \mid X 1_c = 1_n, X > 0 \}, from which the classical fuzzy k-means algorithm is derived [2].
(3) Recent progress on the doubly stochastic manifold: F \in \{ X \mid X 1_c = 1_n, X^T 1_n = r, X > 0 \} [3].
However, each approach has limitations:

Issue with (1): Slow computation and the need for post-processing (e.g., running k-means after spectral clustering).
Issue with (2): Unconstrained column sums may lead to empty clusters or imbalanced classifications.
Issue with (3): Requires strong prior knowledge to specify the column-sum distribution r (non-robust) and requires solving high-dimensional equations (slow).
To address these issues, we propose the Relaxed Indicator Matrix (RIM) manifold: F \in \{ X \mid X 1_c = 1_n, l < X^T 1_n < u, X > 0 \}, which requires only rough estimates of the column sums and generalizes both the single and doubly stochastic manifolds. For optimization, the RIM manifold enables fast computation, orders of magnitude faster than doubly stochastic methods in our experiments with millions of variables, by equipping the set with the Euclidean metric and using Dykstra's algorithm to compute geodesics efficiently; a sketch of the underlying projection step follows below.
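To make the projection step concrete, here is a minimal NumPy sketch (illustrative only: the function name, stopping rule, and tolerances are our own choices here, and this is not the Manopt code contributed in this PR). It projects a matrix onto the closure of the RIM constraint set by cycling Dykstra's algorithm through three simple sets whose Euclidean projections have closed forms: rows summing to one, column sums clipped into [l, u], and nonnegativity.

```python
import numpy as np

def dykstra_projection(X0, l, u, max_iter=500, tol=1e-10):
    """Project X0 onto { X | X 1_c = 1_n, l <= X^T 1_n <= u, X >= 0 }
    (the closure of the RIM constraint set) via Dykstra's algorithm.

    l, u : length-c vectors of lower/upper bounds on the column sums.
    """
    n, c = X0.shape
    X = X0.copy()
    # One correction term per constraint set; these are what make
    # Dykstra converge to the true projection when a set is not
    # affine (here, the nonnegativity cone).
    P = [np.zeros_like(X0) for _ in range(3)]

    def proj_row_sums(Y):
        # Affine set { X | X 1_c = 1_n }: shift each row uniformly.
        return Y - (Y.sum(axis=1, keepdims=True) - 1.0) / c

    def proj_col_sums(Y):
        # Slab { X | l <= X^T 1_n <= u }: clip each column sum into
        # [l_j, u_j] and spread the correction evenly over the column.
        s = Y.sum(axis=0)
        return Y + (np.clip(s, l, u) - s) / n

    def proj_nonneg(Y):
        # Cone { X | X >= 0 }.
        return np.maximum(Y, 0.0)

    for _ in range(max_iter):
        X_prev = X.copy()
        for k, proj in enumerate((proj_row_sums, proj_col_sums, proj_nonneg)):
            Y = proj(X + P[k])
            P[k] = X + P[k] - Y
            X = Y
        if np.linalg.norm(X - X_prev) < tol:
            break
    return X

# Small demo with rough cluster-size estimates (hypothetical values).
rng = np.random.default_rng(0)
n, c = 1000, 10
l = np.full(c, 0.5 * n / c)   # loose lower bound on cluster sizes
u = np.full(c, 2.0 * n / c)   # loose upper bound
X = dykstra_projection(rng.random((n, c)), l, u)
print("max row-sum error:", np.abs(X.sum(axis=1) - 1.0).max())
print("column sums in [l, u]?",
      bool((X.sum(axis=0) >= l - 1e-6).all() and (X.sum(axis=0) <= u + 1e-6).all()))
print("min entry:", X.min())
```

Since the nonnegativity set is not affine, the correction terms P[k] matter: plain alternating projections would reach a feasible point, but not necessarily the nearest one.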

We have open-sourced the code, and we will submit a preprint to arXiv soon (already sent to your email). We hope this work can benefit the open-source community, in both machine learning and optimization.

Best regards,
Jinghui
[1] Spectral embedded clustering: A framework for in-sample and out-of-sample spectral clustering
[2] Adaptive Fuzzy C-Means with Graph Embedding
[3] Graph Cuts with Arbitrary Size Constraints Through Optimal Transport
