Fit surrogate model on existing population #158

Open
wants to merge 79 commits into
base: dev

Conversation

schmoelder
Contributor

@schmoelder schmoelder commented Jul 11, 2024

This PR implements a Surrogate class for fitting GPs on existing data from optimizations.

Supersedes #45

To do

Note that there is some WIP in #152 which will also affect this PR. Should we already rebase onto that branch so that we can adapt the corresponding interfaces, at the risk of some friction if there are further changes upstream?

  • Fit GP on existing population
  • Provide method to evaluate objective functions, constraint functions etc.
  • Validate surrogate model accuracy with (simple) test cases
  • Test if optimizers converge to true solution using surrogate
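As a rough illustration of the first two items, fitting a GP on an existing population and estimating objectives from it could look like the sketch below. It is not the implementation in this PR: `SimpleGP`, `rbf_kernel`, the fixed hyperparameters, and the random stand-in population are assumptions made only for this example, and it uses plain NumPy rather than a GP library.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


class SimpleGP:
    """Hypothetical minimal GP regressor (zero mean, fixed hyperparameters)."""

    def __init__(self, length_scale=1.0, noise=1e-6):
        self.length_scale = length_scale
        self.noise = noise

    def fit(self, X, F):
        # X: (n_individuals, n_variables), F: (n_individuals, n_objectives)
        self.X = np.atleast_2d(X)
        F = np.asarray(F, dtype=float)
        K = rbf_kernel(self.X, self.X, self.length_scale)
        K += self.noise * np.eye(len(self.X))  # jitter for numerical stability
        L = np.linalg.cholesky(K)
        # Solve K @ alpha = F using the Cholesky factor.
        self.alpha = np.linalg.solve(L.T, np.linalg.solve(L, F))
        return self

    def estimate_objectives(self, x):
        # Posterior mean at the query points.
        K_star = rbf_kernel(np.atleast_2d(x), self.X, self.length_scale)
        return K_star @ self.alpha


# Stand-in for an existing population: variables X and objective values F.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 2))
F = np.sum(X**2, axis=1, keepdims=True)

gp = SimpleGP(length_scale=0.5).fit(X, F)
f_train = gp.estimate_objectives(X)
```

Comparing `f_train` with `F`, and estimates at held-out points with the true function, is one simple way to approach the accuracy check in the third item.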

Open questions

Alternative surrogate modeling approaches

Currently, only GPs are implemented. However, other surrogate models can be envisioned (e.g. ANNs). To allow for a more modular architecture, we could introduce a SurrogateBase class and subclass it for each model type.

Follow-up projects

Once this is merged, we can also start working on other features which would improve or apply the surrogate models. Eventually, these should be moved to their own issues / PRs but for now, this is just a collection of ideas.

The interface

Currently, the SurrogateModel class somewhat mimics an OptimizationProblem as it also provides methods for estimating objectives, nonlinear constraints etc. To demonstrate this, compare the OptimizationProblem

sequenceDiagram
    User->>+OptimizationProblem: evaluate_objectives(x)
    OptimizationProblem->>+User: f(x)

with the SurrogateModel

sequenceDiagram
    User->>+SurrogateModel: estimate_objectives(x)
    SurrogateModel->>+User: f*(x)

However, it is important to note that the SurrogateModel will never provide all functionality of the OptimizationProblem, such as specifying variables, constraints etc. Hence, it cannot directly be used as an OptimizationProblem e.g. to interface with an Optimizer.

To me, this means we should rethink the architecture and consider what exactly the SurrogateModel replaces. In the context of an OptimizationProblem, I would say it actually replaces the evaluation toolchain (the part that maps x to the values f/g/m/...).

Consequently, we should consider moving the evaluation toolchain from the OptimizationProblem to its own module (which in the process would also make the OptimizationProblem less of a "god class" and would even allow reusing the toolchain in other places) and introduce an EvaluationInterface.

The architecture would then look something like the following:

sequenceDiagram
    User->>+OptimizationProblem: evaluate_objectives(x)
    OptimizationProblem->>+EvaluationInterface: evaluate(x)
    EvaluationInterface->>+OptimizationProblem: f(x)
    OptimizationProblem->>+User: f(x)

Where the EvaluationInterface is then implemented by both the EvaluationPipeline (i.e. the current "toolchain") and the SurrogateModel:

classDiagram
    class OptimizationProblem {
        evaluate_objectives(np.ndarray): np.ndarray
    }
    OptimizationProblem "1" *-- "1" EvaluationInterface
    
    class EvaluationInterface {
        <<interface>>
        +evaluate(np.ndarray): np.ndarray
    }
    
    EvaluationInterface <|.. SurrogateModel
    EvaluationInterface <|.. EvaluationPipeline
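In Python, the proposed architecture could be sketched roughly as follows. The class names follow the diagrams in this description; everything else (the constructor arguments, the nearest-neighbour stand-in for the surrogate, `true_objective`) is a placeholder assumed for illustration and not existing CADET-Process API.

```python
from abc import ABC, abstractmethod
from typing import Callable

import numpy as np


class EvaluationInterface(ABC):
    """Maps variable values x to evaluated values, e.g. objectives f(x)."""

    @abstractmethod
    def evaluate(self, x: np.ndarray) -> np.ndarray:
        ...


class EvaluationPipeline(EvaluationInterface):
    """Stand-in for the current toolchain: calls the true (expensive) function."""

    def __init__(self, func: Callable[[np.ndarray], np.ndarray]):
        self.func = func

    def evaluate(self, x):
        return np.asarray(self.func(x))


class SurrogateModel(EvaluationInterface):
    """Stand-in surrogate: returns the value of the nearest known individual."""

    def __init__(self, X, F):
        self.X = np.atleast_2d(X)
        self.F = np.asarray(F)

    def evaluate(self, x):
        nearest = np.argmin(np.linalg.norm(self.X - x, axis=1))
        return self.F[nearest]


class OptimizationProblem:
    """Owns exactly one evaluator and delegates the evaluation to it."""

    def __init__(self, evaluator: EvaluationInterface):
        self.evaluator = evaluator

    def evaluate_objectives(self, x):
        return self.evaluator.evaluate(np.asarray(x, dtype=float))


def true_objective(x):
    return np.array([np.sum(x**2)])


X = np.array([[0.0, 0.0], [1.0, 1.0]])
F = np.array([true_objective(x) for x in X])

# The same problem class works with either evaluator.
exact = OptimizationProblem(EvaluationPipeline(true_objective))
approx = OptimizationProblem(SurrogateModel(X, F))
f_exact = exact.evaluate_objectives([0.9, 1.1])
f_approx = approx.evaluate_objectives([0.9, 1.1])
```

With this split, an Optimizer would keep talking to the OptimizationProblem only, and swapping the true evaluation for the surrogate would come down to exchanging the evaluator.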

Conditioned optimization problems

One of the original ideas for this project came from optimization problems where we want to fix the value of one of the variables and then run the optimization to find the best point given this value.

For this purpose, we should implement a ConditionedOptimizationProblem which wraps the original OptimizationProblem and provides an interface where the fixed variables are removed. While this is trivial to implement for bound constrained problems, it becomes potentially more complicated for problems with linear and nonlinear constraints.
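For the bound-constrained case, such a wrapper might look roughly like the sketch below. `BoundedProblem` is a hypothetical stand-in for the real OptimizationProblem, and the attribute names (`lower_bounds`, `upper_bounds`), the `fixed` dictionary, and the `expand` helper are assumptions made for this example only. Linear and nonlinear constraints are deliberately left out.

```python
import numpy as np


class BoundedProblem:
    """Hypothetical stand-in for a bound-constrained OptimizationProblem."""

    def __init__(self, objective, lower_bounds, upper_bounds):
        self.objective = objective
        self.lower_bounds = np.asarray(lower_bounds, dtype=float)
        self.upper_bounds = np.asarray(upper_bounds, dtype=float)

    def evaluate_objectives(self, x):
        return self.objective(np.asarray(x, dtype=float))


class ConditionedOptimizationProblem:
    """Wraps a problem and hides variables that are fixed to given values."""

    def __init__(self, problem, fixed):
        # fixed: dict mapping variable index -> fixed value
        self.problem = problem
        self.fixed = dict(fixed)
        lb, ub = problem.lower_bounds, problem.upper_bounds
        for index, value in self.fixed.items():
            if not lb[index] <= value <= ub[index]:
                raise ValueError(f"Fixed value for variable {index} violates its bounds.")
        self.free = [i for i in range(len(lb)) if i not in self.fixed]
        # Only the bounds of the remaining free variables are exposed.
        self.lower_bounds = lb[self.free]
        self.upper_bounds = ub[self.free]

    def expand(self, x_free):
        """Insert the fixed values to recover the full variable vector."""
        x = np.empty(len(self.free) + len(self.fixed))
        x[self.free] = x_free
        for index, value in self.fixed.items():
            x[index] = value
        return x

    def evaluate_objectives(self, x_free):
        return self.problem.evaluate_objectives(self.expand(x_free))


problem = BoundedProblem(lambda x: np.sum((x - 1) ** 2), [0, 0, 0], [2, 2, 2])
conditioned = ConditionedOptimizationProblem(problem, fixed={1: 0.5})
value = conditioned.evaluate_objectives([1.0, 1.0])
```

For linear constraints A x <= b, the columns belonging to fixed variables would additionally have to be folded into the right-hand side, which is where the extra complexity mentioned above starts.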

Plots

For process design, we are often not really interested in just the optimal point but in the general topology of the parameter space. For example, we may want the contours of the regions that reach a given purity. Finely sampling the parameter space with the full model would be very expensive, so the idea is to sample a surrogate model instead. See also partial dependence plots and #33.
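A minimal sketch of that idea, assuming a two-dimensional parameter space scaled to [0, 1] and a cheap surrogate for the purity: `estimate_purity` below is a made-up analytical placeholder, not a fitted model, and the 0.95 threshold is arbitrary.

```python
import numpy as np


def estimate_purity(x):
    """Hypothetical cheap surrogate; x has shape (n_points, 2)."""
    return 1.0 - 0.5 * np.sum((x - 0.5) ** 2, axis=1)


# Dense grid over a two-dimensional, bound-constrained parameter space.
x1, x2 = np.meshgrid(np.linspace(0, 1, 101), np.linspace(0, 1, 101))
grid = np.column_stack([x1.ravel(), x2.ravel()])

# Evaluating the surrogate on 101 x 101 points is cheap.
purity = estimate_purity(grid).reshape(x1.shape)

# Boolean mask of the region that meets the purity requirement.
feasible = purity >= 0.95
fraction = feasible.mean()
```

The arrays `x1`, `x2`, and `purity` can be passed directly to `matplotlib.pyplot.contour` with `levels=[0.95]` to draw the boundary of that region.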

ronald-jaepel and others added 30 commits June 18, 2024 21:13
Previously, there were two interfaces in the `OptimizationProblem` for calling evaluation functions (e.g. objectives): one for evaluating individuals, and one for populations.
To simplify the code base, these two methods have now been unified.
To ensure backward compatibility, a 1D-Array is returned if a single individual is passed to the function.
Previously, the cadet path set in Cadet(install_path="path") was not inherited into cadet instances created from the run() method.
Fix and extend tests about .calculate_interstitial_rt/velocity
Previously, this was required because CADET-Core was not setting the
PATH correctly.
Now, this can lead to inconsistent behaviour.
Note, this requires CADET>4.4.0
Recently, the option to (not) plot the time axis using minutes was
introduced.
This commit fixes some methods that were not properly implemented.
Moreover, the name of the flag was changed from `use_minutes` to
`x_axis_in_minutes` to make clear that only plotting is affected and not
other parameter values (e.g. start and end times).
Updates and fixes plot_at_position by adding start, end and
x_axis_in_minutes as parameters in solution.py
schmoelder and others added 17 commits August 14, 2024 13:38
Co-authored-by: Johannes Schmölder <[email protected]>
Co-authored-by: daklauss <[email protected]>
Co-authored-by: Lanzrath, Hannah <[email protected]>
Co-authored-by: Johannes Schmölder <[email protected]>
Co-authored-by: daklauss <[email protected]>
Co-authored-by: daklauss <[email protected]>
Co-authored-by: Johannes Schmölder <[email protected]>
Co-authored-by: daklauss <[email protected]>
Co-authored-by: Johannes Schmölder <[email protected]>
Co-authored-by: Johannes Schmölder <[email protected]>
Co-authored-by: Lanzrath, Hannah <[email protected]>
Co-authored-by: daklauss <[email protected]>
Co-authored-by: Johannes Schmölder <[email protected]>
Co-authored-by: Johannes Schmölder <[email protected]>
Co-authored-by: daklauss <[email protected]>
Co-authored-by: Lanzrath, Hannah <[email protected]>
Co-authored-by: Johannes Schmölder <[email protected]>
Co-authored-by: Lanzrath, Hannah <[email protected]>
Co-authored-by: Lanzrath, Hannah <[email protected]>
Co-authored-by: Johannes Schmölder <[email protected]>
Use relative imports in tests, to make it compatible with pytest.
Adds create_LWE, a collection of functions to quickly and semi-modularly
set up a load-wash-elute process with CADET-Process

Co-authored-by: Johannes Schmölder <[email protected]>
Adds tests for the LWE process and simulation results

Co-authored-by: Johannes Schmölder <[email protected]>
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants