
Stochastic PDE Functionality #531

Open · zoemcc opened this issue Jun 17, 2022 · 11 comments

@zoemcc (Contributor) commented Jun 17, 2022:

@hpieper14 @ChrisRackauckas @frankschae @KirillZubov
Design discussions here so that they don't get slurped up by slack.

@ChrisRackauckas (Member) commented Jun 18, 2022:

Let's start with a nice scheme for stochastic ODEs. A nice way to do this would be to use the Kosambi-Karhunen-Loève theorem for the expansion of the Wiener process:

https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#The_Wiener_process

$$W_t = \sqrt{2} \sum_{k=1}^{\infty} Z_k \frac{\sin\left(\left(k-\frac{1}{2}\right) \pi t\right)}{\left(k-\frac{1}{2}\right) \pi}$$

This holds for $t \in (0,1)$; rescale $t$ to change the time span. The $Z_k$ are independent $N(0,1)$ random variables, and the convergence to the Brownian process is uniform in $L^2$. So truncate the expansion at $n$ terms. A finite sample of $n$ terms gives an approximation $W(t,z_1,\ldots,z_n)$ to the Brownian process, with which the SDE $dX = f(X,t)dt + g(X,t)dW_t$ can be approximated by the rough-path ODE $dX = f(X,t)dt + g(X,t)dW(t,z_1,\ldots,z_n)$, and samples of the SDE can be approximated by drawing standard normal $z_k$ and solving the controlled ODE.
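
For concreteness, a minimal Julia sketch of the truncated expansion and its time derivative (names like `W_trunc` are illustrative, not from any package):

```julia
# Truncated Kosambi-Karhunen-Loève expansion of the Wiener process on t ∈ (0, 1).
# `z` plays the role of a finite sample of the Z_k coefficients.
W_trunc(t, z) = sqrt(2) * sum(z[k] * sin((k - 0.5) * pi * t) / ((k - 0.5) * pi) for k in eachindex(z))

# Its time derivative, which is what enters the rough-path ODE.
dWdt(t, z) = sqrt(2) * sum(z[k] * cos((k - 0.5) * pi * t) for k in eachindex(z))

z = randn(8)                    # one draw of the first 8 modes
W_trunc(0.5, z), dWdt(0.5, z)   # evaluate the approximate path and its derivative
```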

Okay, so that's our setup. Now define the PINN solution $u(t,z_1,\ldots,z_n)$ as a neural network with the loss function $$\sum_{t,z} \left\Vert \frac{\partial u}{\partial t} - f(u,t) - g(u,t) \frac{\partial W(t,z_1,\ldots,z_n)}{\partial t} \right\Vert.$$ You can think of this as an $n+1$ dimensional PDE, where $t$ is one dimension and the $z_i$ are the others. Train the PINN to minimize this residual over the $n+1$ dimensional space.
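
A rough sketch of that loss (assuming a generic callable network `u(t, z)`, a finite difference standing in for the AD-computed time derivative a real implementation would use, and the linear SDE as a placeholder drift/diffusion; all names are illustrative):

```julia
using Statistics

nmodes = 4
dWdt(t, z) = sqrt(2) * sum(z[k] * cos((k - 0.5) * pi * t) for k in eachindex(z))

# Placeholder drift and diffusion: the linear SDE dX = a X dt + b X dW.
a, b = 1.0, 0.5
f(x, t) = a * x
g(x, t) = b * x

# PINN residual at a single collocation point (t, z).
function residual(u, t, z; h = 1e-4)
    dudt = (u(t + h, z) - u(t - h, z)) / (2h)   # finite-difference ∂u/∂t
    return dudt - f(u(t, z), t) - g(u(t, z), t) * dWdt(t, z)
end

# Monte Carlo loss over the (n+1)-dimensional space:
# t uniform on (0, 1), each z_i standard normal.
loss(u; N = 256) = mean(abs2(residual(u, rand(), randn(nmodes))) for _ in 1:N)
```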

Now, by the KKL expansion theorem, one can generate solutions of the SDE with the correct distribution by choosing the $z_i$ as $N(0,1)$ draws and then solving the rough-path ODE. Therefore, once that loss function is minimized, sample solutions to the SDE are obtained as random slices of the neural network, i.e. `X = [t -> u(t, randn(n)...) for i in 1:100]` generates `X[i](t)` as random samples of the SDE solutions with the correct distributions, and thus $u(t,z_i)$ is an object that acts as a surrogate for the SDE's solution. QED.

Let's start there. SPDE polynomial chaos expansion is just a harder case of this, and since lots of people use SDEs, this is probably very useful. So what we should do is define a function that takes an SDESystem from ModelingToolkit and generates the appropriate PDESystem, which when trained gives the solution $u(t,z_i)$. This can then be checked by solving a test case like the linear SDE $dX = a X_t dt + b X_t dW_t$ and checking its properties to make sure it's correct. Then handle the multidimensional SDE case, which can be checked against the multi-dimensional linear SDE. The full SPDE case is then a generalization of that.
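
For reference, the linear test case has closed-form moments to check the trained surrogate against (the standard geometric Brownian motion results for drift $a$ and diffusion $b$):

$$\mathbb{E}[X_t] = X_0 e^{a t}, \qquad \mathrm{Var}[X_t] = X_0^{2} e^{2 a t}\left(e^{b^{2} t} - 1\right)$$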

@zoemcc (Contributor, Author) commented Jun 18, 2022:

Ok, I think I mostly understand. Basically you're doing a basis expansion of the Wiener process, a la a Fourier expansion, and then we solve the SDE in the coefficients of that basis. How big does $n$ typically have to be for this to be effective?

> and checking its properties to make sure it's correct.

What properties do you typically check to make sure that the SDE is being solved? That the expectation/variance/higher moments match?

@hpieper14 commented Jun 18, 2022:

I have the same questions as Zoe about what properties to check to ensure the SDE is being solved. How do we enforce $z_i \sim N(0,1)$ when minimizing the loss to obtain the PINN solution $u(t, z_1, \dots, z_n)$? Or is this enforced after obtaining $u(t,z_1, \dots, z_n)$ via the NN, during the sampling to obtain realizations of the SDE solution?

I think other than these questions, I understand our general approach.

I am also not sure how the code should be organized. Within the NeuralPDE ecosystem, there's code for solving the Kolmogorov equation (kolmogorov_solve.jl, param_kolmogorov_solve.jl), but not a general file for SDEs. Should I create a file called something like `pinns_sde_solve.jl` for this code, or is there somewhere else it should live?

@ChrisRackauckas (Member) commented:

You don't need to enforce it during the solution process. You just need to make sure that for any choice of $z_i$ you get a valid definition of a possible forcing function, and that the equation is solved correctly for that forcing function (by minimizing the loss function). Brownian motion is then defined as a sample from that distribution of possible forcing functions, so you just have to choose a forcing function with the right probability and you get representations of the solution with the right distribution.

> What properties do you typically check to make sure that the SDE is being solved? That the expectation/variance/higher moments match?

Mean/var and its evolution is probably good enough.
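
For example, a check along those lines could look like this (illustrative names: `u(t, z)` is the trained network from the earlier sketch, and the analytic values are the standard geometric Brownian motion moments):

```julia
using Statistics

a, b, X0 = 1.0, 0.5, 1.0
nmodes = 4

function check_moments(u; ts = (0.25, 0.5, 1.0), M = 10_000)
    for t in ts
        xs = [u(t, randn(nmodes)) for _ in 1:M]   # M random slices at time t
        @show t (mean(xs), X0 * exp(a * t))                        # empirical vs analytic mean
        @show (var(xs), X0^2 * exp(2a * t) * (exp(b^2 * t) - 1))   # empirical vs analytic variance
    end
end
```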

> Within the NeuralPDE ecosystem, there's code for solving the Kolmogorov equation (kolmogorov_solve.jl, param_kolmogorov_solve.jl), but not a general file for SDEs. Should I create a file called something like `pinns_sde_solve.jl` for this code, or is there somewhere else it should live?

Yeah, give it its own script. Note that some of those extra equation solvers are getting moved to https://github.com/SciML/HighDimPDE.jl to make this repo all about PINNs.

@zoemcc (Contributor, Author) commented Jun 20, 2022:

Does it make sense to do something like a stochastic sampling strategy during training, but sample from the $N(0,1)$'s instead of a uniform distribution? Matching the training sampling distribution to the runtime sampling distribution generally improves the sample complexity of training.
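
Concretely, I'm imagining something like this (purely illustrative):

```julia
# Draw collocation points matching the runtime distribution:
# t uniform over the time span, each z_i standard normal.
function sample_point(tspan, nmodes)
    t = tspan[1] + rand() * (tspan[2] - tspan[1])
    return (t, randn(nmodes))
end

batch = [sample_point((0.0, 1.0), 4) for _ in 1:256]   # a batch of 256 training points
```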

As far as code organization goes, I think having it in its own file is okay, but you might end up having to double dip and reimplement some of the symbolic parsing/analysis functionality from the main file, since it isn't cleanly separated into independent composable utilities right now. It's something I've been meaning to do for a while but haven't been able to get the focused time for.

Structurally, though, it will be very similar to the setup in pinns_pde_solve.jl (define a struct for the PINN, then symbolic discretize/discretize to generate loss functions that sample within them). You could probably get everything into the main file if you wanted to, similarly to how the integral equation support was grafted in there, but we're already pretty overloaded. We'll probably need to define new symbolic primitives for the stochastic equations unless Chris has already defined them somewhere in the Symbolics ecosystem (similar to integrals or derivatives being symbolically defined).

> Mean/var and its evolution is probably good enough.

Sounds good.

@ChrisRackauckas (Member) commented:

> Does it make sense to do something like a stochastic sampling strategy during training, but sample from the $N(0,1)$'s instead of a uniform distribution? Matching the training sampling distribution to the runtime sampling distribution generally improves the sample complexity of training.

Yes, probably. You can choose distributions as the sampling scheme for the training points.

> […] We'll probably need to define new symbolic primitives for the stochastic equations unless Chris has already defined them somewhere in the Symbolics ecosystem (similar to integrals or derivatives being symbolically defined).

The SDESystem format is probably the one to build off of for starters here, and it's quite different: it won't require any new primitives and can be started with a lot less parsing effort.

@hpieper14 commented:

I have been working on the method to convert an SDESystem to a PDESystem.

From the docs, my understanding is that SDESystem has fields corresponding to the drift/diffusion terms, the independent/dependent variables, and the parameters. In order to initialize a PDESystem, I also need the boundary conditions and the domain. It doesn't seem like this information is a field within SDESystem. I have written the method to convert an SDEProblem into a PDESystem temporarily instead -- I think this is fine for the 1D case but doesn't generalize that well.

If my understanding of SDESystem is correct, should I open an issue and add domains/boundary conditions as optional fields in SDESystem?

My code isn't ready for review but it's getting there -- is it okay if I open it as a draft PR?

@ChrisRackauckas (Member) commented:

> If my understanding of SDESystem is correct, should I open an issue and add domains/boundary conditions as optional fields in SDESystem?

You shouldn't need to add domains and boundary conditions to SDESystem. The domain is just the independent variable's span (the time span), and the boundary condition is the initial condition.
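
A minimal sketch of that mapping in ModelingToolkit style (assuming a two-mode expansion; the ±4 bounds on the $z_i$ are an assumed truncation of the standard normals so that the Gaussian directions form a finite box):

```julia
using ModelingToolkit
import ModelingToolkit: Interval

@parameters t z1 z2
@variables u(..)

tspan = (0.0, 1.0)   # the SDE's time span becomes the t domain
X0 = 1.0             # the SDE's initial condition becomes the only boundary condition

domains = [t ∈ Interval(tspan[1], tspan[2]),
           z1 ∈ Interval(-4.0, 4.0),   # assumed truncation of N(0,1)
           z2 ∈ Interval(-4.0, 4.0)]
bcs = [u(tspan[1], z1, z2) ~ X0]
```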

> My code isn't ready for review but it's getting there -- is it okay if I open it as a draft PR?

Yes, early and often is best.

Basically, just start simple and build up. If you use SDESystem / SDEProblem, then there's no parsing required. See the similar ODEProblem specialization:

https://github.com/SciML/NeuralPDE.jl/blob/master/src/ode_solve.jl
https://github.com/SciML/NeuralPDE.jl/blob/master/test/NNODE_tests.jl

It is a physics-informed neural network training on an ODEProblem, so no symbolic parsing is required. This is probably a good playground for the first half to hone the techniques, since it focuses on the math and the methods instead of the symbolic parsing.

@ChrisRackauckas (Member) commented:

I did a major overhaul to update NNODE to serve as a good example:

#539

So let's start this off by doing the same thing, except on SDEProblem, with the Wiener expansions. That can be something you work on getting right in a script and then carry over to the library, which is much easier than hacking things into the symbolic parser etc.

@hpieper14 commented:

I added a script that solves the linear SDE using only two modes in #566. Here's a plot generated by the script:
[Plot: linear SDE solved with 2 modes, 100 sample paths]

The script needs to be extended so that the number of modes in the expansion is flexible. I'm working on the multi-dimensional version.

@AstitvaAggarwal (Contributor) commented:

I'm taking this up.
