Initial Structure #15

dynamic-queries · 2022-06-30T13:27:36Z

I realized that a recurring theme in conventional model reduction methods is that they build on linear (PCA) or non-linear (Diffusion Maps, VAEs, ...) data reduction. As a result, I started working on the data reduction part of the package, which could be useful for some of the new data-driven reduction schemes that have been proposed.

To check whether this direction is in line with the ideas of the maintainer, I have included a sample implementation of PCA. It is still unclear whether PCA from MultivariateStats.jl should be leveraged in this package. I will run a benchmark and post the results in the comments soon.

Tests for reduction schemes seem "ad hoc" in nature. For instance, the intro to POD / PCA in #2 has some MATLAB scripts that could be used to test the energy of the implementations. But I have to admit, I am coming up short in trying to find a systematic way to write tests that generalize to different data.

Please advise.

Model order reduction using data-driven methods often requires some data reduction technique to begin with. As a result, it is sensible to abstract this out. PCA, which is used in Galerkin-based POD, is one linear method that Chris already refers to in Issue 2. Non-linear data reduction methods such as Diffusion Maps are also in the pipeline. Of course, VAEs are well known to represent a state vector in latent space, so including them is a no-brainer, as there are some interesting model reduction methods that could benefit from this.

The PCA routine is precompiled and tested with the Lorenz attractor. It remains to be determined what qualifies as a good test for PCA, and for other reduction routines for that matter.

Allow the PCA type to include Matrix{FT} as well.

Debug PCA reduce! implementation. Add reduce! to the export list.

Added a preliminary test for the reduction of the Lorenz time series. I know a priori that the energy of the system after reduction is >0.9. A more systematic test setup still has to be thought up. Adding a visualization of the data overlaid with the reduced basis would be nice!

ChrisRackauckas · 2022-06-30T14:08:45Z

src/ModelOrderReduction.jl

+    include("ErrorHandle.jl")
+    using LinearAlgebra
+    include("DataReduction/PCA.jl")
+    include("DataReduction/DifussionMaps.jl")


codecov · 2022-06-30T14:12:10Z

Codecov Report

Merging #15 (a06a9f7) into main (020f782) will increase coverage by 72.34%.
The diff coverage is 72.34%.

@@            Coverage Diff            @@
##           main      #15       +/-   ##
=========================================
+ Coverage      0   72.34%   +72.34%     
=========================================
  Files         0        3        +3     
  Lines         0       47       +47     
=========================================
+ Hits          0       34       +34     
- Misses        0       13       +13

Impacted Files	Coverage Δ
src/DataReduction/POD.jl	`67.50% <67.50%> (ø)`
src/ErrorHandle.jl	`100.00% <100.00%> (ø)`
src/Types.jl	`100.00% <100.00%> (ø)`

FHoltorf · 2022-06-30T14:53:57Z

src/DataReduction/PCA.jl

+        new{eltype(snaps[1]),typeof(nmodes)}(snaps,nmodes)
+    end
+
+    function PCA(snaps::Matrix{FT},nmodes::IT) where {FT,IT}


Among all the names for the SVD, POD is by far the most common in the realm of model order reduction. I would suggest we go with that (if not all).

FHoltorf · 2022-06-30T15:01:03Z

src/DataReduction/PCA.jl

+    if typeof(pca.snapshots) == Vector{Vector{FT}}
+        op_matrix = matricize(pca.snapshots)
+    end
+    u,s,v = svd(op_matrix)


In cases where applying POD (model order reduction in general) makes sense, taking the FULL SVD of the whole data matrix is often prohibitively expensive. So I think we should default to using the method of snapshots to compute the dominant singular vectors/values. If the matrix is small, it hardly makes a difference. In general it would be good to allow explicit specification of the algorithm to be used. Many options are conceivable: randomized SVD (see RandomizedLinAlg.jl for example), Lanczos bidiagonalization (see TSVD.jl).
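
A minimal sketch of the method of snapshots mentioned above, assuming real-valued data with one snapshot per column and far more state dimensions than snapshots. The name `pod_snapshots` is hypothetical and not part of this PR; only `LinearAlgebra` and `Random` from the standard library are used. It also assumes the leading `r` singular values are nonzero.

```julia
using LinearAlgebra, Random

# Hypothetical helper (not from the PR): leading r POD modes via the method of snapshots.
# X is n × m with one snapshot per column; the approach pays off when n ≫ m.
function pod_snapshots(X::AbstractMatrix{<:Real}, r::Integer)
    m = size(X, 2)
    1 <= r <= m || throw(ArgumentError("r must be in 1:$m"))
    # Small m × m correlation matrix instead of the full n × m SVD.
    C = Symmetric(X' * X)
    # For a Symmetric matrix, eigen returns the eigenvalues in ascending order.
    λ, V = eigen(C)
    idx = m:-1:(m - r + 1)            # positions of the r largest eigenvalues
    σ = sqrt.(max.(λ[idx], 0))        # singular values of X (clamped at zero)
    U = (X * V[:, idx]) ./ σ'         # left singular vectors, i.e. the POD modes
    return U, σ
end

Random.seed!(1)
X = randn(200, 10)
U, σ = pod_snapshots(X, 3)
```

Because the singular values are obtained from the eigenvalues of `X' * X`, small singular values lose accuracy compared with a direct SVD, which is one reason to keep the algorithm choice explicit.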

FHoltorf · 2022-06-30T15:46:43Z

Also, I am unsure what you mean by "testing the energy of the implementation". If you want to test what percentage of the energy of the input data is captured by a reduced basis, I think it is perfectly fine to hardcode the number you would expect. Other than that, I think it is fair to assume correctness of svd, so you can always rely on that as a benchmark for any alternative way to compute a truncated SVD (method of snapshots, rSVD, ...). Any other opinions?
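
The two testing ideas above (a hardcoded expected energy, and `svd` as the reference for any alternative routine) could look roughly like this. It is a sketch under assumptions: the data are synthetic rather than the Lorenz series, `relative_energy` is a hypothetical helper, and energy is taken to be the sum of squared singular values, which is a common convention but may differ from what the package ends up using.

```julia
using LinearAlgebra, Test

# Hypothetical helper: fraction of squared singular values captured by the leading r modes.
relative_energy(s::AbstractVector, r::Integer) = sum(abs2, s[1:r]) / sum(abs2, s)

# Synthetic data with two dominant modes and one very weak mode.
x = range(0, 1; length = 40)
t = range(0, 2π; length = 100)
X = 10 .* (sin.(π .* x) * cos.(t)') .+
    1 .* (sin.(2π .* x) * sin.(t)') .+
    0.01 .* (sin.(3π .* x) * cos.(2 .* t)')

s_ref = svdvals(X)                                            # reference values from LAPACK
s_alt = sqrt.(eigvals(Symmetric(X' * X))[end:-1:(end - 1)])   # alternative route, top two values

@testset "truncated SVD sanity checks" begin
    @test relative_energy(s_ref, 1) > 0.9      # hardcoded expectation for this data set
    @test relative_energy(s_ref, 2) > 0.999
    @test s_alt ≈ s_ref[1:2] rtol = 1e-8        # alternative agrees with the svd reference
end
```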

Ad-hoc tests are now implemented with the Lorenz attractor. Regression tests with MultivariateStats.jl coming up. Code Formatting to be done. Include a Benchmark from the data in #2.

dynamic-queries · 2022-07-03T04:21:47Z

Regression tests against MultivariateStats.jl are not straightforward, as it does some pre-processing that does not really make sense in this package.

Instead a visual inspection can be realized by looking at the Lorenz attractor.

The POD(mode=2) version looks as follows

It is also known that the POD(mode=1) of the attractor resembles its z-trajectory, which is also verified.

Please advise on how to close this request.

UPDATE :
While I await a reply, I shall start implementing the OperatorInference Module of Lift and Learn in a separate branch at my fork. :D

Since the reduced bases are needed in Matrix form down the line for POD-based MOR methods, the data structure is changed accordingly.

ChrisRackauckas · 2022-07-07T09:55:15Z

Is this still WIP? If not, @FHoltorf should take another pass.

dynamic-queries · 2022-07-07T10:05:33Z

Is this still WIP? If not, @FHoltorf should take another pass.

FHoltorf

I think this is pretty good already. Most importantly, there is a critical mistake in the organization of the snapshot matrix and that really needs fixing. Also a trivial but usually informative lower bound on the rel. energy recovered by truncated SVD routines should be returned. Beyond that I made only some minor suggestions. Good work, thanks!

FHoltorf · 2022-07-07T16:10:05Z

examples/POD/lorenz.jl

+end
+
+sol = lorenz_prob()
+solution = Matrix(reduce(hcat,sol.u)')


Array(sol) is what you want to be using. You are not "correctly" (to be understood in the usual context of POD) concatenating the data snapshots (see comment on matricize). For a test this is alright but this isn't really how PCA is used in the context of model order reduction.

FHoltorf · 2022-07-07T16:12:25Z

src/DataReduction/POD.jl

@@ -0,0 +1,65 @@
+function matricize(VoV::Vector{Vector{FT}}) where {FT}
+    Matrix(reduce(hcat,VoV)')


You usually want reduce(hcat,VoV) to get the right snapshot matrix (each column is a snapshot). For context: The POD basis is simply the orthogonalized basis that captures the column space of the best low rank r approximation to the snapshot matrix, i.e., a basis of the r-dimensional subspace that best embeds the time series data given by the snapshots in the Frobenius norm sense.
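
A small illustration of the orientation issue described above, using made-up numbers: with `reduce(hcat, VoV)` each column is one snapshot, so the left singular vectors live in state space, whereas the extra transpose puts snapshots in rows.

```julia
using LinearAlgebra

# Two snapshots of a 3-dimensional state (illustrative values only).
snapshots = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

X  = reduce(hcat, snapshots)            # 3 × 2: one snapshot per column
Xt = Matrix(reduce(hcat, snapshots)')   # 2 × 3: the transposed layout from the diff

# The POD modes are the left singular vectors of X, so they have the state dimension (3).
modes = svd(X).U
```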

FHoltorf · 2022-07-07T16:14:25Z

src/DataReduction/POD.jl

+        op_matrix = matricize(pod.snapshots)
+    end
+    u,s,v = tsvd(op_matrix,pod.nmodes)
+    pod.energy = NaN


It would be good to return sum(s)/(sum(s) + (size(op_matrix,1)-pod.nmodes)*s[end]) as trivial lower bound on the rel. energy captured. In cases where using POD makes sense, that will usually be quite informative.
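
A sketch of why this expression is a lower bound, written exactly as in the comment (sums of singular values, not squares). Every discarded singular value is at most `s[end]`, and there are at most `size(op_matrix,1) - nmodes` of them, so the denominator can only overestimate the full sum. The function name `energy_lower_bound` and the test matrix are hypothetical.

```julia
using LinearAlgebra

# s: the leading r singular values in descending order; n: number of rows of the data matrix.
function energy_lower_bound(s::AbstractVector{<:Real}, n::Integer)
    r = length(s)
    captured = sum(s)
    # Each of the (at most) n - r discarded values is bounded above by s[end].
    return captured / (captured + (n - r) * s[end])
end

X = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 10.0; 2.0 0.0 1.0]
s_all = svdvals(X)
r = 2
bound = energy_lower_bound(s_all[1:r], size(X, 1))
exact = sum(s_all[1:r]) / sum(s_all)    # only available when the full spectrum is known
```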

FHoltorf · 2022-07-07T16:15:20Z

src/DataReduction/POD.jl

+        op_matrix = matricize(pod.snapshots)
+    end
+    u,s,v = rsvd(op_matrix,pod.nmodes)
+    pod.energy = NaN


It would be good to return sum(s)/(sum(s) + (size(op_matrix,1)-pod.nmodes)*s[end]) as trivial lower bound on the rel. energy captured. In cases where using POD makes sense, that will usually be quite informative.

FHoltorf · 2022-07-07T16:16:14Z

src/DataReduction/POD.jl

+    print(io,"POD \n")
+    print(io,"Reduction Order = ",pod.nmodes,"\n")
+    print(io,"Snapshot size = (", length(pod.snapshots),",",length(pod.snapshots[1]),")\n")
+    print(io,"Energy = ", pod.energy,"\n")


I would advocate for renaming to "relative Energy"

FHoltorf · 2022-07-07T16:21:22Z

src/ModelOrderReduction.jl

+    export SVD, TSVD, RSVD
+    export POD, reduce!, matricize
+#========================Model Reduction========================================#
+    include("ModelReduction/LiftAndLearn.jl")


LiftAndLearn.jl is empty? Perhaps remove for now?

FHoltorf · 2022-07-07T16:26:03Z

test/DataReduction.jl

+    end
+
+    sol = lorenz_prob()
+    solution = Matrix(reduce(hcat,sol.u)')


same comment as in the example applies

FHoltorf · 2022-07-07T16:26:27Z

test/runtests.jl

-    # Write your tests here.
-end
+include("DataReduction.jl")
+include("ModelReduction.jl")


ModelReduction.jl is empty. Perhaps remove for now?

FHoltorf · 2022-07-07T16:28:05Z

src/ErrorHandle.jl

+function errorhandle(data::Matrix{FT},modes::IT) where {FT,IT}
+    @assert size(data,1)>1 "State vector is expected to be vector valued."
+    s = size(data,2)
+    @assert (modes>0)&(modes<s) "Number of modes should be [1,$(s)]."


I believe the error message should read "Number of modes should be in {1, ..., $(s-1)}."
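
For illustration, a variant of the check with the corrected message. This is a sketch, not the PR's code: the name `check_modes` is hypothetical, and it throws `ArgumentError` instead of using `@assert`, which is a deliberate substitution.

```julia
# Hypothetical stand-in for errorhandle; validates the requested number of modes.
function check_modes(data::AbstractMatrix, modes::Integer)
    size(data, 1) > 1 ||
        throw(ArgumentError("State vector is expected to be vector valued."))
    s = size(data, 2)
    # Same condition as (modes>0)&(modes<s), so the valid values are 1, ..., s-1.
    0 < modes < s ||
        throw(ArgumentError("Number of modes should be in {1, ..., $(s - 1)}."))
    return nothing
end

check_modes(ones(3, 5), 4)   # passes: 4 is within {1, ..., 4}
```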

FHoltorf · 2022-07-07T16:30:07Z

src/DataReduction/POD.jl

+
+mutable struct POD{FT,IT} <: AbstractDRProblem
+    snapshots::Union{Vector{Vector{FT}},Matrix{FT}}
+    nmodes::IT


Is there any reason to not force nmodes to be an integer? Seems like that would be better to catch user mistakes earlier rather than later.

Also in general, I would suggest to make the POD struct parametric in the type of the snapshots. That is less error prone and you also do not need the if statement in the reduce! functions to check whether the data has already been supplied as a matrix or vector-of-vectors. You can simply dispatch on the POD type appropriately.

Currently, all of this is a bit brittle. In particular, it really kind of directly assumes that the data is given in terms of Float64 (note that you require energy to be of the same type as the elements of the snapshots; that can yield some awkward errors later on), but that is not always the case.
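
One way the suggestions above could be realized, as a sketch only. `PODSketch` and `snapshot_matrix` are hypothetical names, the abstract type is redeclared here to keep the example self-contained, and the choice to store the energy as a floating-point type derived from the element type is an assumption.

```julia
abstract type AbstractDRProblem end   # stand-in for the package's abstract type

# Parametric in the snapshot container S; nmodes is always an Int;
# the energy type E is decoupled from the snapshot element type.
mutable struct PODSketch{S, E <: AbstractFloat} <: AbstractDRProblem
    snapshots::S
    nmodes::Int
    energy::E
end

# Constructors pick a floating-point energy type even for integer data.
PODSketch(snaps::AbstractMatrix{T}, nmodes::Integer) where {T} =
    PODSketch(snaps, Int(nmodes), float(real(T))(NaN))
PODSketch(snaps::AbstractVector{<:AbstractVector{T}}, nmodes::Integer) where {T} =
    PODSketch(snaps, Int(nmodes), float(real(T))(NaN))

# Dispatch replaces the runtime typeof check inside reduce!.
snapshot_matrix(p::PODSketch{<:AbstractMatrix}) = p.snapshots
snapshot_matrix(p::PODSketch{<:AbstractVector{<:AbstractVector}}) = reduce(hcat, p.snapshots)

p_mat = PODSketch([1.0 2.0; 3.0 4.0], 1)
p_vov = PODSketch([[1.0, 3.0], [2.0, 4.0]], 1)
```

With this layout, a non-integer `nmodes` fails at construction with a `MethodError` instead of surfacing later inside the SVD call.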

nmodes is now forced to be an integer. Fix silly transpose error. Update relative energy for truncated SVD algos. TODO: Modify type of POD, similar to the type of snapshots.

Main

FHoltorf

This looks like it is good to go @ChrisRackauckas.

I still have a few concerns about certain choices but for sake of expediting the process I will simply submit a PR with the suggested changes myself.

ChrisRackauckas · 2022-07-14T07:43:30Z

Run the formatter and I think this is a fine start.

Rahul Manavalan added 6 commits June 25, 2022 14:35

Fix Constructor.

62f57bf

Allow the PCA type to include Matrix{FT} as well.

Fix broken sections.

9c8aa6e

Debug PCA reduce! implementation. Add reduce! to the export list.

Add initial test

b27977f

Added a preliminary test for the reduction of the Lorenz time series. I know a priori that the energy of the system after reduction is >0.9. A more systematic test setup still has to be thought up. Adding a visualization of the data overlaid with the reduced basis would be nice!

Update Project toml

7a6df39

ChrisRackauckas reviewed Jun 30, 2022

FHoltorf reviewed Jun 30, 2022

Rahul Manavalan added 4 commits July 1, 2022 11:40

Rename PCA->POD

f0ffaa1

Add solver types

2fdf358

Update main

79b0ac1

Tests for SVD solvers interface

3323595

Ad-hoc tests are now implemented with the Lorenz attractor. Regression tests with MultivariateStats.jl coming up. Code Formatting to be done. Include a Benchmark from the data in #2.

Rahul Manavalan added 2 commits July 3, 2022 06:26

Fix Typo

39b4cbb

Reduced Bases VoV->Matrix

9dd802a

Since the reduced bases are needed in Matrix form down the line for POD-based MOR methods, the data structure is changed accordingly.

dynamic-queries changed the title ~~WIP - Initial Structure~~ Initial Structure Jul 7, 2022

dynamic-queries closed this Jul 7, 2022

dynamic-queries reopened this Jul 7, 2022

FHoltorf suggested changes Jul 7, 2022

Rahul Manavalan and others added 6 commits July 12, 2022 10:11

Remove empty files for now.

468e49b

Remove redundant dependencies

d50df45

Fix Exception message

d7e4593

Changes from prev commit.

e6f5599

nmodes is now forced to be an integer. Fix silly transpose error. Update relative energy for truncated SVD algos. TODO: Modify type of POD, similar to the type of snapshots.

Other fixes

c2837a7

Merge pull request #1 from dynamic-queries/Main

11fcb3b

Main

FHoltorf approved these changes Jul 12, 2022

dynamic-queries added 2 commits July 14, 2022 23:15

Format at last

857d93a

Added test/Project.toml

a06a9f7

ChrisRackauckas merged commit 4910f4a into SciML:main Jul 16, 2022
