- Sponsor
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial Structure #15
Conversation
ModelOrderReduction using Data-Driven methods often requires some data reduction techniques to begin with. As a result, it is sensible to abstract this out. PCA which is used in Galerkin based POD is one linear method, that Chris already refers to in Issue 2. Non-linear data reduction methods like the Diffussion Maps is also something that is in the pipeline. Ofcourse, VAEs are well known to represent a state vector in latent space, so this is a no-brainer, as there are some interesting model reduction methods, that could benefit from this.
The PCA routine is precompiled and tested with the Lorenz attractor. It is to be determined, what qualifies as a good test for PCA and other reduction routines for that matter.
Allow the PCA type to include Matrix{FT} as well.
Debug PCA reduce! implementation. Add reduce! to the export list.
Added prelim test for the reduction of the Lorenz time series. I know apriori that the energy of the system after reduction is >0.9. A more systematic test setup has to be thought up. Adding visualization of the data overlayed with the reduced basis would be nice!
src/ModelOrderReduction.jl
Outdated
include("ErrorHandle.jl") | ||
using LinearAlgebra | ||
include("DataReduction/PCA.jl") | ||
include("DataReduction/DifussionMaps.jl") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
Codecov Report
@@ Coverage Diff @@
## main #15 +/- ##
=========================================
+ Coverage 0 72.34% +72.34%
=========================================
Files 0 3 +3
Lines 0 47 +47
=========================================
+ Hits 0 34 +34
- Misses 0 13 +13
📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more |
src/DataReduction/PCA.jl
Outdated
new{eltype(snaps[1]),typeof(nmodes)}(snaps,nmodes) | ||
end | ||
|
||
function PCA(snaps::Matrix{FT},nmodes::IT) where {FT,IT} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Among all the names for the SVD, POD is by far the most common in the realm of model order reduction. I would suggest we go with that (if not all).
src/DataReduction/PCA.jl
Outdated
if typeof(pca.snapshots) == Vector{Vector{FT}} | ||
op_matrix = matricize(pca.snapshots) | ||
end | ||
u,s,v = svd(op_matrix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In cases where applying POD (model order reduction in general) makes sense, taking the FULL SVD of the whole data matrix is often prohibitively expensive. So I think we should default to using the method of snapshots to compute the dominant singular vectors/values. If the matrix is small it makes hardly a difference. In general it would be good to allow explicit specification of the algorithm to be used. Many options are conceivable: Randomized SVD (see RandomizedLinAlg.jl for example), Lancosz bidiagonalization (see TSVD.jl).
Also I am unsure what you mean by "testing the energy of the implementation"? If you want to test what percentage of the energy of the input data is captured by a reduced basis, I think it is perfectly fine to hardcode the number you would expect. Other than that, I think it is fair to assume correctness of |
Ad-hoc tests are now implemented with the Lorenz attractor. Regression tests with MultivariateStats.jl coming up. Code Formatting to be done. Include a Benchmark from the data in #2.
Considering that we need the reduced bases in Matrix form down the line, for POD based MOR methods, a change in the DS is made.
Is this still WIP? If not, @FHoltorf should take another pass. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is pretty good already. Most importantly, there is a critical mistake in the organization of the snapshot matrix and that really needs fixing. Also a trivial but usually informative lower bound on the rel. energy recovered by truncated SVD routines should be returned. Beyond that I made only some minor suggestions. Good work, thanks!
examples/POD/lorenz.jl
Outdated
end | ||
|
||
sol = lorenz_prob() | ||
solution = Matrix(reduce(hcat,sol.u)') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Array(sol) is what you want to be using. You are not "correctly" (to be understood in the usual context of POD) concatenating the data snapshots (see comment on matricize). For a test this is alright but this isn't really how PCA is used in the context of model order reduction.
src/DataReduction/POD.jl
Outdated
@@ -0,0 +1,65 @@ | |||
function matricize(VoV::Vector{Vector{FT}}) where {FT} | |||
Matrix(reduce(hcat,VoV)') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You usually want reduce(hcat,VoV)
to get the right snapshot matrix (each column is a snapshot). For context: The POD basis is simply the orthogonalized basis that captures the column space of the best low rank r approximation to the snapshot matrix, i.e., a basis of the r-dimensional subspace that best embeds the time series data given by the snapshots in the Frobenius norm sense.
src/DataReduction/POD.jl
Outdated
op_matrix = matricize(pod.snapshots) | ||
end | ||
u,s,v = tsvd(op_matrix,pod.nmodes) | ||
pod.energy = NaN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to return sum(s)/(sum(s) + (size(op_matrix,1)-pod.nmodes)*s[end])
as trivial lower bound on the rel. energy captured. In cases where using POD makes sense, that will usually be quite informative.
src/DataReduction/POD.jl
Outdated
op_matrix = matricize(pod.snapshots) | ||
end | ||
u,s,v = rsvd(op_matrix,pod.nmodes) | ||
pod.energy = NaN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to return sum(s)/(sum(s) + (size(op_matrix,1)-pod.nmodes)*s[end])
as trivial lower bound on the rel. energy captured. In cases where using POD makes sense, that will usually be quite informative.
src/DataReduction/POD.jl
Outdated
print(io,"POD \n") | ||
print(io,"Reduction Order = ",pod.nmodes,"\n") | ||
print(io,"Snapshot size = (", length(pod.snapshots),",",length(pod.snapshots[1]),")\n") | ||
print(io,"Energy = ", pod.energy,"\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would advocate for renaming to "relative Energy"
src/ModelOrderReduction.jl
Outdated
export SVD, TSVD, RSVD | ||
export POD, reduce!, matricize | ||
#========================Model Reduction========================================# | ||
include("ModelReduction/LiftAndLearn.jl") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LiftAndLearn.jl is empty? Perhaps remove for now?
test/DataReduction.jl
Outdated
end | ||
|
||
sol = lorenz_prob() | ||
solution = Matrix(reduce(hcat,sol.u)') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as in the example applies
test/runtests.jl
Outdated
# Write your tests here. | ||
end | ||
include("DataReduction.jl") | ||
include("ModelReduction.jl") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ModelReduction.jl is empty. Perhaps remove for now?
src/ErrorHandle.jl
Outdated
function errorhandle(data::Matrix{FT},modes::IT) where {FT,IT} | ||
@assert size(data,1)>1 "State vector is expected to be vector valued." | ||
s = size(data,2) | ||
@assert (modes>0)&(modes<s) "Number of modes should be [1,$(s)]." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the error message should read "Number of modes should be in {1, ..., $(s-1)}."
src/DataReduction/POD.jl
Outdated
|
||
mutable struct POD{FT,IT} <: AbstractDRProblem | ||
snapshots::Union{Vector{Vector{FT}},Matrix{FT}} | ||
nmodes::IT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to not force nmodes
to be an integer? Seems like that would be better to catch user mistakes earlier rather than later.
Also in general, I would suggest to make the POD struct parametric in the type of the snapshots. That is less error prone and you also do not need the if statement in the reduce! functions to check whether the data has already been supplied as a matrix or vector-of-vectors. You can simply dispatch on the POD type appropriately.
Currently, all of this is a bit brittle. In particular, it really kind of directly assumes that the data is given in terms of Float64 (note that you require energy to be of the same type as the elements of the snapshots; that can yield some awkward errors later on), but that is not always the case.
nmodes is now forced to be an integer. Silly transpose error fix Update relative energy for truncated SVD algos TODO: Modify type of POD, similar to the type of snapshots
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it is good to go @ChrisRackauckas.
I still have a few concerns about certain choices but for sake of expediting the process I will simply submit a PR with the suggested changes myself.
Run the formatter and I think this is a fine start. |
I realized that a recurring theme of the conventional model reduction methods is based on linear (PCA) and non-linear data reduction.(Diffusion Maps, VAEs ...). As a result, I started working on the data reduction part of the package, which could be useful for some the new data-driven reduction schemes that have been proposed.
To check, if this direction is , inline with the ideas of the Maintainer, I have included a sample implementation of PCA. It is still unclear if PCA from MultivariateStats.jl should be leveraged in this package. I will make a benchmark and post in the comments soon.
Tests for Reduction schemes, seems "ad-hoc" in nature. For instance, the intro to POD / PCA in #2 has some MATLAB scripts that could be used to test the energy of the implementations. But I have to admit, I am coming up short in trying to find a systematic way to write tests, that could generalize to different data.
Please advise.