
Implement more generic distributions on Manifolds #57

Open · 3 of 11 tasks
sethaxen opened this issue Dec 8, 2019 · 16 comments

@sethaxen (Member) commented Dec 8, 2019

This issue narrows the discussion in #3 to implementing distributions on Manifolds. It also continues some discussion from https://discourse.julialang.org/t/rfc-taking-directional-orientational-statistics-seriously/31951 and on the Slack channel.

There are a few generic distributions on manifolds that we'll want, with examples from directional statistics noted where available and not obvious (a rough sketch of possible types follows the list):

  • Dirac: the Dirac measure, which concentrates all probability mass at a single fixed point.
  • Mixture: alias of Distributions.MixtureModel typed on our supports and variates
  • Uniform: similar to Distributions.Uniform but defined on manifolds.
  • Product: product distribution analogous to Distributions.Product on ProductManifold
  • Power: product distribution on sub-manifolds of the power-manifold
    • We probably need something more general as well, for cases where one wants something like Product but on a PowerManifold. This is useful for cases like the matrix von Mises-Fisher distribution on the Stiefel manifold, which can be composed as a product of von Mises-Fisher distributions with different means and concentrations on the spherical submanifolds.
  • Projected: distribution on ambient space projected to submanifold
    • e.g. Projected Normal
  • Restricted, Conditioned, or Intersection: distribution on a manifold resulting from restricting a distribution on the ambient space to the manifold
    • e.g. Fisher-Bingham, which has von Mises-Fisher and many other directional distributions as special cases
  • Retracted: distribution on tangent space at point retracted to manifold
    • e.g. tangential normal (aka Log-normal, but that's confusing)
  • Riemannian??: Similar to Retracted but with adjustment for curvature (name chosen because "Riemannian normal" is the only well-studied example of this I know).
    • e.g. "spherical normal"
  • IsotropicDiffusion or Brownian: distribution resulting from Brownian motion for a fixed time, a.k.a. the solution to the heat equation on the manifold
    • e.g. the spherical Brownian motion distribution
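
Purely for illustration, here is a sketch of what a few of these types could look like as Julia structs (every name and field below is a hypothetical proposal, not existing API):

    # Hypothetical sketch of some of the proposed generic types; none of
    # this is existing Manifolds.jl or Distributions.jl API.
    abstract type ManifoldDistribution{M} end

    struct Dirac{M,P} <: ManifoldDistribution{M}
        manifold::M
        point::P      # all probability mass sits at this point
    end

    struct Projected{M,D} <: ManifoldDistribution{M}
        manifold::M   # the submanifold to project onto
        ambient::D    # a Distributions.jl distribution on the ambient space
    end

    struct Retracted{M,P,D} <: ManifoldDistribution{M}
        manifold::M
        point::P      # base point of the tangent space
        tangent::D    # distribution on the tangent space at `point`
    end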

There are plenty of Lie-group-specific ones that we can address when we've finished implementing Lie groups, i.e. distributions resulting from quotients, actions, etc.

I've intentionally abbreviated the names above. While descriptive names like the ones we currently have (e.g. ProductPointDistribution) provide clarity, they're a major eyesore if one wants to use the distributions, e.g. in PPLs. We could have a Distributions submodule that contains these types so we can drop Distribution from the name, and we could use types to drop Point, etc., but I've not put much thought into this.

It's worth noting that implementing these generic distributions will handle nearly every case covered in the above Discourse post on directional statistics and will also make it a lot easier to construct, for research purposes, new distributions that to my knowledge haven't been studied (e.g. a Riemannian Cauchy distribution).

@sethaxen (Member, Author) commented Dec 8, 2019

We can also add utilities to aid in implementing custom distributions. As an example, consider Restricted{Sphere,Distributions.MvNormalCanon}. In general, all we can do with this distribution is write down its log density up to a constant, which is enough to sample from it with MCMC, so long as its parameters (mean vector and precision matrix) are fixed. We can't fit its parameters because we don't know its normalization constant.

However, VonMisesFisher is a special case of this distribution where MvNormalCanon is MvNormalCanon(μ, κ * I) and norm(μ) ≈ 1. In this case, someone has already worked out not only the log density but also efficient sampling algorithms and maximum likelihood estimators. For cases like this, it would be useful to have a type AbstractWrapped that wraps a manifold distribution but forwards the entire interface to the wrapped distribution. Because the user can enforce constraints during construction, they can then implement custom methods where available but still get the sensible fallbacks. So they would implement something like VonMisesFisher{N} <: AbstractWrapped{Sphere{N},Restricted{Sphere{N},Distributions.MvNormalCanon}} and then overload away.
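
Here's a minimal sketch of how such a wrapper could look, assuming the hypothetical Restricted and AbstractWrapped types from above (everything here is a placeholder, including the field name wrapped):

    using Manifolds, Random
    import Distributions

    # Forwarding wrapper: by default, every method delegates to the
    # wrapped distribution stored in the `wrapped` field of subtypes.
    abstract type AbstractWrapped{M,D} end

    Distributions.logpdf(d::AbstractWrapped, x) = Distributions.logpdf(d.wrapped, x)
    Base.rand(rng::AbstractRNG, d::AbstractWrapped) = rand(rng, d.wrapped)

    # A special case then overrides only what it can do better:
    struct VonMisesFisher{N,W} <: AbstractWrapped{Sphere{N},W}
        wrapped::W   # e.g. a Restricted(Sphere(N), MvNormalCanon(...))
    end
    # custom rand / logpdf / fit_mle overloads would follow here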

Of course, we could also consider a traits-based approach.

@mateuszbaran (Member)

A great topic for discussion, although I'll probably have more questions than answers here. How do you sample from a general Restricted distribution?

@sethaxen (Member, Author) commented Dec 8, 2019

In general you can sample exactly through rejection sampling. However, if the submanifold is a thin shell like a sphere, this will essentially never produce a valid sample. It does work for bounded volumes like a ball, though, provided the density there is sufficiently high. But if the distribution in the ambient space has a log density, and you have an MCMC technique for sampling points on the manifold, then you can get a correlated sample. We could have a utility function that designates whether rejection sampling is a suitable default.
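
For concreteness, a naive rejection sampler is only a few lines (a sketch; insupport_set is a hypothetical membership predicate, and this is only sensible when the set has positive measure, like a ball):

    using Distributions, Random

    # Draw from `ambient` restricted to a set, by resampling until a draw
    # lands inside the set. Hopeless for measure-zero sets like a sphere.
    function rejection_sample(rng::AbstractRNG, ambient::Distribution,
                              insupport_set; maxiter = 10^6)
        for _ in 1:maxiter
            x = rand(rng, ambient)
            insupport_set(x) && return x   # accept
        end
        error("no sample accepted; the set may have (near-)zero measure")
    end

For the unit ball one could then call, e.g., rejection_sample(rng, MvNormal(zeros(3), I), x -> norm(x) ≤ 1) (with LinearAlgebra loaded for I and norm).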

@sethaxen (Member, Author) commented Dec 8, 2019

My intuition says that, more generally, if a constraint that defines a submanifold can only be written as an equality (rather than an inequality), then rejection sampling is not applicable.

@mateuszbaran (Member)

Right, but almost all manifolds we have are of measure zero in the ambient space. It wouldn't be a particularly useful distribution type here.

@sethaxen (Member, Author) commented Dec 8, 2019

> Right, but almost all manifolds we have are of measure zero in the ambient space. It wouldn't be a particularly useful distribution type here.

Yes, that's true for our current manifolds, but not necessarily for every manifold we or others will implement. 🙂 And that's only relevant for drawing exact samples. Because we can write down a log density up to proportionality, we can still sample from them with MCMC and therefore use them as priors in Bayesian inference using just the defaults. That's powerful enough to justify including them.

And the most widely used distributions on our existing manifolds are restrictions, e.g. von Mises-Fisher and Bingham on Sphere, and matrix von Mises-Fisher on Rotations.
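
For instance, the unnormalized log density of a restriction is just the ambient log density evaluated at points on the manifold (a sketch, assuming the hypothetical Restricted type with an ambient field):

    using Distributions

    # Log density up to an unknown additive constant (the log normalization
    # over the submanifold), which is all that MCMC needs.
    unnormalized_logpdf(d::Restricted, p) = logpdf(d.ambient, p)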

Each of these generic distributions will have trade-offs. We can sample exactly from Projected but can't write down its log density. We can write down the log density for Riemannian but can't sample from it exactly. We can't write down a log density for Brownian, but we can draw samples via simulation, so long as we know how to diffuse on the manifold (I've not worked out all of the math, but for the cases I'm aware of, you can approximately diffuse without drift using an n-fold convolution of Retracted, with the quality of the approximation increasing with n; see the sketch below).
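
The n-fold construction would look roughly like the following (a sketch built from real ManifoldsBase functions — get_vector, exp, manifold_dimension — but the scheme itself is my assumption, not worked-out math):

    using Manifolds, Random

    # Approximate Brownian motion at time t by n small isotropic Gaussian
    # steps, each drawn in the current tangent space and mapped back to
    # the manifold with the exponential map (or a retraction).
    function approx_brownian(rng::AbstractRNG, M, p, t; n = 100)
        q = copy(p)
        d = manifold_dimension(M)
        for _ in 1:n
            coeffs = sqrt(t / n) .* randn(rng, d)   # variance t/n per step
            X = get_vector(M, q, coeffs, DefaultOrthonormalBasis())
            q = exp(M, q, X)
        end
        return q
    end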

As long as the limitations are clearly documented, I don't see a downside to including them.

@kellertuer (Member)

I have a short question on this topic: What's the easiest way to generate a random point on a manifold? Should we define a “default distribution” for certain manifolds?

Today I noticed that while rand() just returns a random number, rand(M) does not. Is it easy to get something like that? That would be really great (as long as a manifold has something like a default distribution).

@sethaxen (Member, Author) commented Feb 6, 2020

I guess it depends what you mean by "random".

For a compact embedded manifold, the easiest way will be to sample with a multivariate normal in the ambient space and project to the manifold.

For an abstract manifold, it's trickier. If you already have a point, you can sample from a multivariate normal in the tangent space.
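
Both recipes are a few lines with the existing project, get_vector, and exp functions (a sketch; as a nice side note, for Sphere(2) projecting a standard normal in fact gives the uniform distribution):

    using Manifolds

    M = Sphere(2)

    # 1. Compact embedded manifold: sample in the ambient space, project.
    p = project(M, randn(representation_size(M)...))

    # 2. Given a point: sample Gaussian coefficients in an orthonormal
    #    basis of the tangent space, then map back via the exponential map.
    coeffs = randn(manifold_dimension(M))
    X = get_vector(M, p, coeffs, DefaultOrthonormalBasis())
    q = exp(M, p, X)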

But neither of these is probably what one would naively expect when one uses rand(M). If I had to guess, one would assume rand(M) samples from the uniform measure on the manifold, which is only possible for a compact manifold (e.g. one cannot uniformly sample in Euclidean space). But "uniform" is vague. For a general Riemannian manifold, the usual "uniform" measure is the area/volume/Hausdorff measure, but for Lie groups, "uniform" probably best corresponds to the Haar measure. There was a discussion about this on Slack, and I think we agreed that at least in some cases the volume measure and Haar measure are the same for compact manifolds, so in those cases a default may make sense (e.g. for Rotations with the ambient metric, the normalized Haar measure is the same as the volume measure).

All that to say, if the point is just to give the user some point on the manifold, I think a function like arbitrary_point makes more sense. Otherwise, it's safest to require the user to explicitly specify which distribution/measure they want to sample from, as we do now. Defaults are really tricky here, but for manifolds like Sphere, Circle, Rotations, and probably a few others, we can certainly provide one.

@kellertuer (Member) commented Feb 6, 2020

Yes, I think I am mainly asking about the ones you mentioned; I ran into this five minutes after starting to port Manopt to use Manifolds. And we can happily raise an error for rand(M) where no such default exists, but otherwise mimic the original rand (or also randn, if applicable).

For the real-valued case, rand actually samples from the unit cube, but I do not yet have a good idea of what that might mean on an arbitrary non-compact manifold.

Edit: just for completeness, when starting an algorithm I was often using some random initialisation (not to be confused with our allocation; really a valid point on M), see for example

I am not saying my old approach should be transferred directly, but it would be neat (if possible for a manifold) to have some default random point generator.

@kellertuer (Member) commented Feb 8, 2020

First proposal for a default, since I just stumbled upon that:

    rand(M::Sphere) = rand(uniform_distribution(M, zeros(representation_size(M))))

and I am not claiming that this is easy or straightforward, for sure. It would just be nice (for optimization algorithms, for example) to have some point on M to start with (either uniform or from some Gaussian).

@sethaxen (Member, Author) commented Apr 1, 2020

> For a general Riemannian manifold, the usual "uniform" measure is the area/volume/Hausdorff measure, but for Lie groups, "uniform" probably best corresponds to the Haar measure. There was a discussion about this on Slack, and I think we agreed that at least in some cases the volume measure and Haar measure are the same for compact manifolds, so in those cases a default may make sense (e.g. for Rotations with the ambient metric, the normalized Haar measure is the same as the volume measure).

My more recent thinking on this is that, for now at least, each manifold should have some default uniform measure/distribution, chosen to correspond to the most common notion of uniformity for that manifold, and normalized to be a probability measure whenever possible. This is important for providing logpdfs. Probability density functions are formally the Radon-Nikodym derivatives of a (probability) measure with respect to a base measure, so each logpdf implementation depends on three things:

  1. the corresponding distribution
  2. the reference uniform distribution (base measure)
  3. the specific parameterization of points.

We can dispatch on all three of these only if a manifold distribution is somehow typed on its uniform distribution, but then we're departing from how Distributions usually works and moving more towards Measures.jl. In Distributions, every support implicitly assumes a base measure (see JuliaStats/Distributions.jl#224 (comment)), so for consistency, we could just continue that behavior here until something fancier like Measures.jl is complete or there's high demand for different base measures.

For every Uniform distribution, then, the logpdf would always return 0, because the density (the Radon-Nikodym derivative with respect to itself) is 1 everywhere.

@sethaxen (Member, Author) commented Apr 1, 2020

And one more note, in order to have logpdf implementations, we'll definitely want Uniform distributions to be their own type. e.g. if the uniform distribution can be implemented as a Projected{MvNormal}, that's great, but in general projected distributions won't admit known logpdfs, and we can't dispatch on whether the covariance is an identity matrix to know it's the uniform distribution. Hence the explicit Uniform distribution.

edit: And @kellertuer, then we can get the function you want with the default rand!(rng, M, x) = rand!(rng, Uniform(M), x), which would raise an error if Uniform for that manifold cannot be sampled from (e.g. the Lebesgue measure on ℝⁿ)
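
Putting those two points together, a sketch (all of it proposal, none of it existing API) could be:

    using Manifolds, Random
    import Distributions

    # Explicit uniform distribution on a manifold, serving as its own
    # base measure.
    struct Uniform{M}
        manifold::M
    end

    # The density w.r.t. the uniform base measure is 1 everywhere.
    Distributions.logpdf(::Uniform, p) = 0.0

    # Default random point: delegate to the uniform distribution. For a
    # manifold without a sampleable Uniform (e.g. ℝⁿ, whose Lebesgue
    # measure cannot be normalized), the inner call would raise an error.
    Random.rand!(rng::AbstractRNG, M::Manifold, x) = rand!(rng, Uniform(M), x)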

@sethaxen (Member, Author) commented Apr 3, 2020

After further thought, I think it makes more sense to explicitly have Hausdorff and Haar{<:ActionDirection} distributions, with uniform_distribution returning the preferred default for a given manifold. Their logpdfs would be defined w.r.t. themselves, i.e. they would always return 0. This would also enable left-Haar and right-Haar distributions to be defined for non-compact groups (for compact groups, left- and right-Haar coincide, and we just say Haar).
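
In code, that proposal could look like this (a sketch; Hausdorff, Haar, and this uniform_distribution dispatch are hypothetical, while ActionDirection, LeftAction, and Rotations do exist in Manifolds.jl):

    using Manifolds
    import Distributions

    struct Hausdorff{M}
        manifold::M
    end

    struct Haar{D<:ActionDirection,G}
        group::G
    end
    Haar(G) = Haar{LeftAction,typeof(G)}(G)   # unqualified Haar = left-Haar

    # Each distribution is its own base measure, so the log density is 0.
    Distributions.logpdf(::Hausdorff, p) = 0.0
    Distributions.logpdf(::Haar, p) = 0.0

    # Per-manifold preferred defaults:
    uniform_distribution(M::Sphere) = Hausdorff(M)
    uniform_distribution(M::Rotations) = Haar(M)   # compact: left = right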

@kellertuer (Member)

Coming back to distributions (or my rand(M), rand(M, p) ideas) – can we do these two functions to some extent? The reason I am asking is that I currently have what I would call a “poor man's variant” of these over at Manopt, see
https://github.com/JuliaManifolds/Manopt.jl/blob/master/src/random.jl
because some algorithms require random initial points here and there. Besides depending on AbstractPowerManifold and PowerManifold, this would be a step towards decoupling them – so while I am truly not an expert on randomness and distributions, it would be neat to have.
The tangent-space randomness I can roughly imagine, for sure.

@mateuszbaran (Member)

I think this is no longer relevant to Manifolds.jl, since it's going to be solved in ManifoldMeasures.jl?

@sethaxen (Member, Author) commented Aug 9, 2021

Maybe it should be moved to ManifoldMeasures.jl, which is carrying out the plan discussed here, but given how experimental that package is, and that it could itself be discontinued, I'd hold off until it is released.
