Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Σ field from ArchimedeanCopula removed in patch release #186

Closed
Vaibhavdixit02 opened this issue Mar 22, 2024 · 18 comments · Fixed by #187
Closed

Σ field from ArchimedeanCopula removed in patch release #186

Vaibhavdixit02 opened this issue Mar 22, 2024 · 18 comments · Fixed by #187
Assignees

Comments

@Vaibhavdixit02
Copy link

https://github.com/SciML/GlobalSensitivity.jl/actions/runs/8392054624/job/22983804323?pr=134#step:6:434 I am confused how to debug this, since there isn't clear how the code can be adapted for this here https://github.com/SciML/GlobalSensitivity.jl/blob/master/src/shapley_sensitivity.jl#L111

@lrnv
Copy link
Owner

lrnv commented Mar 23, 2024

So, @Vaibhavdixit02, taking a deeper look, I am not sure if I made a breaking change or if you used internals fields that you should not:

The .\Sigma field of EllipticalCopulas 1) Are not documented and should be considered internals and 2) Are only for EllipticalCopulas. In particular, the IndependenceCopula{d} (alias for ArchimedeanCopula{d, IndependentGenerator}) that is returned when you use a diagonal matrix does not have these internal fields, thus your bug.

The constructor of the Gaussian copula might have evolved to specialize this way, indeed, but I am not sure this is breaking (IDK really)

I think you need a way to subset the dimensions of a SklarDist and/or a Copula. It makes sense for this functionality to belong here rather than in GlobalSensitivity.jl, just have to find a name for it. Should we go for that ?

@lrnv lrnv self-assigned this Mar 23, 2024
lrnv added a commit that referenced this issue Mar 23, 2024
@lrnv
Copy link
Owner

lrnv commented Mar 23, 2024

Ok After PR #187 gets merged and released, a new function Copula.subsetdims() will be availiable, and you will be able to delete this method:

https://github.com/SciML/GlobalSensitivity.jl/blob/1f12f72454267d426226f836f511f21c3f5aaa9d/src/shapley_sensitivity.jl#L101C10-L123

and replace this line :

https://github.com/SciML/GlobalSensitivity.jl/blob/1f12f72454267d426226f836f511f21c3f5aaa9d/src/shapley_sensitivity.jl#L225

by

sample_complement = rand(Copulas.subsetdims(input_distribution,idx_minus), n_outer)

@lrnv lrnv closed this as completed in #187 Mar 23, 2024
lrnv added a commit that referenced this issue Mar 23, 2024
@lrnv lrnv reopened this Mar 23, 2024
@lrnv
Copy link
Owner

lrnv commented Mar 23, 2024

Released in version 0.1.22. Please tell me if this resolves your issue or not, I can take another look if you want.

@Vaibhavdixit02
Copy link
Author

Thank you for such a detailed follow-up @lrnv, really appreciate it.

The .\Sigma field of EllipticalCopulas 1) Are not documented and should be considered internals and 2) Are only for EllipticalCopulas. In particular, the IndependenceCopula{d} (alias for ArchimedeanCopula{d, IndependentGenerator}) that is returned when you use a diagonal matrix does not have these internal fields, thus your bug.

Ah, since it was a field of a public struct I didn't expect it to be only internal. Your point about this only existing for EllipticalCopulas is accurate, I am actually not sure anymore where the ArchimideanCopula shows up since none of the tests use it and only GaussianCopula, I will have to investigate this better there's probably a default somewhere that's the reason for it.

Ok After PR #187 gets merged and released, a new function Copula.subsetdims() will be availiable, and you will be able to delete this method:

Thanks a lot, this greatly simplifies things!

@Vaibhavdixit02
Copy link
Author

In the tests the identity matrix was being used for the covariance matrix of the Gaussian Copula and that hit https://github.com/lrnv/Copulas.jl/blob/main/src/EllipticalCopulas/GaussianCopula.jl#L40 thus leading to an error with ArchimedeanCopula

@Vaibhavdixit02
Copy link
Author

That branch probably wants to check for all diagonal matrices?

@lrnv
Copy link
Owner

lrnv commented Mar 23, 2024

So yes this is where the bug came from : I (silently) changed the output type for diagonal matrices.

In fact since we are dealing with correlation matrices and not covariance matrices, diagonal <=> identity... But now I see that the make_cor! happens after the check, so you are right.

I continue to think this is not breaking, but maybe I'm wrong. I agree that the behavior can be surprising, but it's actually what Distributions.jl is doing when returning an Exponential RV for some specific inputs of the Gamma (IIRC).

Is your problem fixed now with the new release ?

@lrnv
Copy link
Owner

lrnv commented Mar 23, 2024

Aaaand maybe I should state a bit more clearly what is API and what is internals in the docs, I never took the time to do that :)

@Vaibhavdixit02
Copy link
Author

So if you do #190 I don't understand how I can handle the code path to not check for the sigma field?

@lrnv
Copy link
Owner

lrnv commented Mar 23, 2024

Ok After PR #187 gets merged and released, a new function Copula.subsetdims() will be availiable, and you will be able to delete this method:

https://github.com/SciML/GlobalSensitivity.jl/blob/1f12f72454267d426226f836f511f21c3f5aaa9d/src/shapley_sensitivity.jl#L101C10-L123

and replace this line :

https://github.com/SciML/GlobalSensitivity.jl/blob/1f12f72454267d426226f836f511f21c3f5aaa9d/src/shapley_sensitivity.jl#L225

by

sample_complement = rand(Copulas.subsetdims(input_distribution,idx_minus), n_outer)

Is that what you need ? Otherwise please explain a bit more I am not getting your issue.

@lrnv
Copy link
Owner

lrnv commented Mar 23, 2024

Hahaha this one is much more complicated to do generically for any dependence structure.

Basically for GaussianCopula, you can keep what you have (but then restrict the method to appropriate sklardist), and otherwise for other copulas you need #150 to be done.

So for independent copula, conditional sampling is simple subset sampling. Do a Union splitting.

For the moment I think you should only accept gaussian and independent copulas, as this behavior is not implemented for others.

@Vaibhavdixit02
Copy link
Author

But why do I lose the covariance matrix I have used to create the distribution - even if the behavior is more correct now, I think the matrix should remain available somewhere?

@lrnv
Copy link
Owner

lrnv commented Mar 23, 2024

OR, and this is better, wait until Monday and I'll think about the right interface for #150. Then you can keep the généricité and have nice method failure for other copulas directly coming from here...

@Vaibhavdixit02
Copy link
Author


julia> dependency_matrix = Matrix{Int}(I, dim, dim)
4×4 Matrix{Int64}:
 1  0  0  0
 0  1  0  0
 0  0  1  0
 0  0  0  1

julia> C = GaussianCopula(dependency_matrix)
IndependentCopula{4}(
G: Copulas.IndependentGenerator()
)


julia> input_distribution = SklarDist(C, margins)
SklarDist{IndependentCopula{4}, NTuple{4, Uniform{Float64}}}(
C: IndependentCopula{4}(
G: Copulas.IndependentGenerator()
)

m: (Uniform{Float64}(a=1.0, b=5.0), Uniform{Float64}(a=1.0, b=5.0), Uniform{Float64}(a=1.0, b=5.0), Uniform{Float64}(a=1.0, b=5.0))
)


julia> input_distribution.
C
m
julia> input_distribution.C
IndependentCopula{4}(
G: Copulas.IndependentGenerator()
)


julia> input_distribution.C.

A MWE for my usecase haha

Yeah sure I don't mind waiting. Thanks for the help!

@lrnv
Copy link
Owner

lrnv commented Mar 24, 2024

@Vaibhavdixit02 I was not able to push directly to your branch here SciML/GlobalSensitivity.jl#134 so the solutino is below.

I have tried to do something for #150 but it is a large project and I do not have the time right now. So the solution for the moment is as follow:

  1. Restrict current cond_sampling() function to dispatch on ::SklarDist{GaussianCopula{d,TΣ},Tm} where {d,TΣ,Tm} instead of ::SklarDist
  2. Add another method that dispatch on ::SklarDist{Independentcopula{d},Tm} where {d, Tm}

Basically you'll just have to change the current signature

function cond_sampling(distribution::SklarDist,
    n_sample::Int,
    idx::Vector{Int},
    idx_c::Vector{Int},
    x_cond::AbstractArray)

to

function cond_sampling(distribution::SklarDist{GaussianCopula{d,TΣ},Tm},
    n_sample::Int,
    idx::Vector{Int},
    idx_c::Vector{Int},
    x_cond::AbstractArray) where {d,TΣ,Tm}

and add another method :

function cond_sampling(distribution::SklarDist{Independentcopula{d},Tm},
    n_sample::Int,
    idx::Vector{Int},
    idx_c::Vector{Int},
    x_cond::AbstractArray) where {d,Tm}
	# conditional sampling in independent random vector is just subset sampling.
    rand(subsetdims(distribution, idx, n_sample) # you might need to transpose this ?
end

and that would be enough. When #150 has a proper interface, i'll ping you with a PR to enlarge the applicability of Shapley methods to any ::SklarDist, but for the moment the code you have really is restricted to Gaussian dependence structure and independence, so this fix makes sense.

I would keep the gsa signature as it is to have the good MethodError.

@Vaibhavdixit02
Copy link
Author

You can fork the repo and create a PR to my branch?

@lrnv
Copy link
Owner

lrnv commented Mar 25, 2024

Done here SciML/GlobalSensitivity.jl#157

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants