-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend gradlogpdf
to MixtureModels
#1827
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1827 +/- ##
==========================================
+ Coverage 85.94% 86.16% +0.22%
==========================================
Files 144 144
Lines 8658 8704 +46
==========================================
+ Hits 7441 7500 +59
+ Misses 1217 1204 -13 ☔ View full report in Codecov by Sentry. |
Relates #1788 |
Oh, I missed that. Sorry. I thought about using |
But I feel this new version is not optimized. It essentially computes the PDF twice. I should unroll the loop. |
But I am not sure about type stability. Even julia> @code_warntype pdf(MixtureModel([Normal(1, 2), Beta(2, 3)], [6/10, 4/10]), 1.0)
MethodInstance for Distributions.pdf(::MixtureModel{Univariate, Continuous, Distribution{Univariate, Continuous}, Categorical{Float64, Vector{Float64}}}, ::Float64)
from pdf(d::UnivariateMixture, x::Real) @ Distributions ~/.julia/packages/Distributions/UaWBm/src/mixtures/mixturemodel.jl:362
Arguments
#self#::Core.Const(Distributions.pdf)
d::MixtureModel{Univariate, Continuous, Distribution{Univariate, Continuous}, Categorical{Float64, Vector{Float64}}}
x::Float64
Body::Any
1 ─ %1 = Distributions._mixpdf1(d, x)::Any
└── return %1 with @code_warntype Distributions._mixpdf1(MixtureModel([Normal(1, 2), Beta(2, 3)], [6/10, 4/10]), 1.0)
MethodInstance for Distributions._mixpdf1(::MixtureModel{Univariate, Continuous, Distribution{Univariate, Continuous}, Categorical{Float64, Vector{Float64}}}, ::Float64)
from _mixpdf1(d::AbstractMixtureModel, x) @ Distributions ~/.julia/packages/Distributions/UaWBm/src/mixtures/mixturemodel.jl:286
Arguments
#self#::Core.Const(Distributions._mixpdf1)
d::MixtureModel{Univariate, Continuous, Distribution{Univariate, Continuous}, Categorical{Float64, Vector{Float64}}}
x::Float64
Locals
#607::Distributions.var"#607#609"
#606::Distributions.var"#606#608"{MixtureModel{Univariate, Continuous, Distribution{Univariate, Continuous}, Categorical{Float64, Vector{Float64}}}, Float64}
p::Vector{Float64}
Body::Any
1 ─ (p = Distributions.probs(d))
│ %2 = Distributions.:(var"#606#608")::Core.Const(Distributions.var"#606#608")
│ %3 = Core.typeof(d)::Core.Const(MixtureModel{Univariate, Continuous, Distribution{Univariate, Continuous}, Categorical{Float64, Vector{Float64}}})
│ %4 = Core.typeof(x)::Core.Const(Float64)
│ %5 = Core.apply_type(%2, %3, %4)::Core.Const(Distributions.var"#606#608"{MixtureModel{Univariate, Continuous, Distribution{Univariate, Continuous}, Categorical{Float64, Vector{Float64}}}, Float64})
│ (#606 = %new(%5, d, x))
│ %7 = #606::Distributions.var"#606#608"{MixtureModel{Univariate, Continuous, Distribution{Univariate, Continuous}, Categorical{Float64, Vector{Float64}}}, Float64}
│ (#607 = %new(Distributions.:(var"#607#609")))
│ %9 = #607::Core.Const(Distributions.var"#607#609"())
│ %10 = Distributions.enumerate(p)::Base.Iterators.Enumerate{Vector{Float64}}
│ %11 = Base.Filter(%9, %10)::Base.Iterators.Filter{Distributions.var"#607#609", Base.Iterators.Enumerate{Vector{Float64}}}
│ %12 = Base.Generator(%7, %11)::Base.Generator{Base.Iterators.Filter{Distributions.var"#607#609", Base.Iterators.Enumerate{Vector{Float64}}}, Distributions.var"#606#608"{MixtureModel{Univariate, Continuous, Distribution{Univariate, Continuous}, Categorical{Float64, Vector{Float64}}}, Float64}}
│ %13 = Distributions.sum(%12)::Any
└── return %13 |
Hmm, I thought I had added another comment on benchmark, but maybe the connection broke and it didn't go through. Here it goes again. I also implemented another version with the for loop detaching the indices of both function gradlogpdf(d::UnivariateMixture, x::Real)
ps = probs(d)
cs = components(d)
ps1 = first(ps)
cs1 = first(cs)
pdfx1 = pdf(cs1, x)
pdfx = ps1 * pdfx1
glp = pdfx * gradlogpdf(cs1, x)
if iszero(ps1) || iszero(pdfx)
glp = zero(glp)
end
@inbounds for (psi, csi) in Iterators.drop(zip(ps, cs), 1)
if !iszero(psi)
pdfxi = pdf(csi, x)
if !iszero(pdfxi)
pipdfxi = psi * pdfxi
pdfx += pipdfxi
glp += pipdfxi * gradlogpdf(csi, x)
end
end
end
if !iszero(pdfx) # else glp is already zero
glp /= pdfx
end
return glp
end When mixing distributions of the same type, d1 = MixtureModel([Normal(1, 2), Normal(2, 3), Normal(1.5), Normal(1.0, 0.2)], [0.3, 0.2, 0.3, 0.2])
d2 = MixtureModel([Beta(1, 2), Beta(2, 3), Beta(4, 2)], [3/10, 4/10, 3/10])
d3 = MixtureModel([Normal(1, 2), Beta(2, 3), Exponential(3/2)], [3/10, 4/10, 3/10]) # type unstable
n = 10
m = 20
d4 = MixtureModel([MvNormal(rand(n), rand(n, n) |> A -> (A + A')^2) for _ in 1:m], rand(m) |> w -> w / sum(w))
d5 = MixtureModel([MvNormal([1.0, 2.0], [0.4 0.2; 0.2 0.5]), MvTDist(5., [1., 2.], [1. 0.1; 0.1 1.])], [0.4, 0.6]) # type unstable We get [ Info: d1: gradlogpdf with for loop
75.841 ns (0 allocations: 0 bytes)
[ Info: d1: gradlogpdf with while loop
73.626 ns (0 allocations: 0 bytes)
[ Info: d2: gradlogpdf with for loop
295.552 ns (0 allocations: 0 bytes)
[ Info: d2: gradlogpdf with while loop
296.826 ns (0 allocations: 0 bytes)
[ Info: d3: gradlogpdf with for loop
686.398 ns (19 allocations: 304 bytes)
[ Info: d3: gradlogpdf with while loop
974.368 ns (21 allocations: 368 bytes)
[ Info: d4: gradlogpdf with for loop
33.258 μs (136 allocations: 16.62 KiB)
[ Info: d4: gradlogpdf with while loop
33.233 μs (136 allocations: 16.62 KiB)
[ Info: d5: gradlogpdf with for loop
3.282 μs (26 allocations: 1.42 KiB)
[ Info: d5: gradlogpdf with while loop
3.471 μs (28 allocations: 1.53 KiB) Ok, I think this is it. I will leave it with the while loop for your review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The performance difference is a bit surprising since for
loops are rewritten using the iterate
anyway (you can see this e.g. by inspecting the output of @code_lowered
).
My gut feeling is also that one cannot expect too much performance wise in the type unstable case anyway. In this use-case it might be better to work with tuples (which IIRC currently is not supported).
if !iszero(pdfx) # else glp is already zero | ||
glp /= pdfx | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it correct to return a gradlogpdf
of zero if x
is not in the support of the mixture distribution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wondered about that, but decided to follow the current behavior already implemented. For example:
julia> insupport(Beta(0.5, 0.5), -1)
false
julia> logpdf(Beta(0.5, 0.5), -1)
-Inf
julia> gradlogpdf(Beta(0.5, 0.5), -1)
0.0
I don't know. If it is constant -Inf
, then the derivative is zero (except that (-Inf) - (-Inf)
is not defined, but what matters is that the rate of change is zero...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most implementations return zero outside the support:
https://github.com/JuliaStats/Distributions.jl/blob/master/src/univariate/continuous/beta.jl#L126
https://github.com/JuliaStats/Distributions.jl/blob/master/src/univariate/continuous/chi.jl#L99
https://github.com/JuliaStats/Distributions.jl/blob/master/src/univariate/continuous/chisq.jl#L94
https://github.com/JuliaStats/Distributions.jl/blob/master/src/univariate/continuous/exponential.jl#L97
and so on (Frechet, Gamma, ...), but some (Arcsin and Kumaraswamy) seem to just extend what happens at the edges of the support:
https://github.com/JuliaStats/Distributions.jl/blob/master/src/univariate/continuous/arcsine.jl#L97
https://github.com/JuliaStats/Distributions.jl/blob/master/src/univariate/continuous/kumaraswamy.jl#L91
while LogitNormal seems to be the only one that yields NaN
:
https://github.com/JuliaStats/Distributions.jl/blob/master/src/univariate/continuous/laplace.jl#L101
Meanwhile, Laplace seems to be the only one to error at a singular point:
https://github.com/JuliaStats/Distributions.jl/blob/master/src/univariate/continuous/laplace.jl#L101
while Uniform, for example, is zero everywhere including the singular points.
https://github.com/JuliaStats/Distributions.jl/blob/master/src/univariate/continuous/uniform.jl#L86
I think giving an error is a bit drastic, but yielding either zero or NaN (outside the support or at the singular points) seems reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should make this a separate issue and hopefully standardize the behavior.
Co-authored-by: David Widmann <[email protected]>
Concerning I now think the case in which However, if I added some more tests for this edge case with single mixture that will fail. I marked the PR as draft so I can look into that more carefully. |
This extends
gradlogpdf
to MixtureModels, both univariate and multivariate, at least for those whose components havegradlogpdf
implemented.I haven't implemented the inplace and the component-wise methods yet, but this should be a good start.
I should say I am having trouble with the docs, thought. I have added docstrings and added the methods to the docs src, but they are not showing up. I am not sure what to do.