Initial attempt to incorporate MultiTask GPs #353
Conversation
Hi @jpfolch,
thank you very much for your PR. MultiTask and MultiFidelity GPs are something I have wanted in BoFire for a long time but never found time for!
To start, I am focusing on your point three (how to encode the task info), as I already have an idea for this ;)
My proposal is to introduce a new input feature called `TaskInput`. This `TaskInput` should inherit from the `CategoricalInput` feature and should have an additional attribute called `fidelities` of type `List[int]`, in which one can assign different fidelity levels to the individual tasks/categories. I would add a validator that checks that there is always a step size of 1 between different fidelities and that they start at zero (a sketch follows the example below).
For example, for a `TaskInput` with three different tasks/categories named `process_1`, `process_2` and `process_3`, the following fidelities would be valid:
- `[0,0,0]`: all tasks have the same fidelity level
- `[0,0,1]`: `process_3` has the highest fidelity
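Roughly, such a feature could look like this (pydantic 2 style; the `TaskInput` name and `fidelities` attribute are from the proposal above, the validation details are just a sketch, not a final implementation):

```python
from typing import List, Literal

from pydantic import field_validator

from bofire.data_models.features.api import CategoricalInput


class TaskInput(CategoricalInput):
    type: Literal["TaskInput"] = "TaskInput"
    fidelities: List[int] = []

    @field_validator("fidelities")
    def validate_fidelities(cls, fidelities, info):
        if not fidelities:
            return fidelities
        # One fidelity level per task/category.
        categories = info.data.get("categories", [])
        if len(fidelities) != len(categories):
            raise ValueError("Number of fidelities must match the number of tasks.")
        # Fidelities must start at zero and increase with a step size of 1.
        if sorted(set(fidelities)) != list(range(max(fidelities) + 1)):
            raise ValueError("Fidelities must start at 0 with a step size of 1.")
        return fidelities


# usage:
# task = TaskInput(
#     key="task",
#     categories=["process_1", "process_2", "process_3"],
#     fidelities=[0, 0, 1],
# )
```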
Via the `allowed` attribute of the `TaskInput` one can then define which tasks can be proposed by the optimizer. As the `TaskInput` would be a subclass of the `CategoricalInput`, one has to set up the encoding for the GP properly. Currently, `CategoricalInput`s are always one-hot encoded; either one adds the option of ordinal encoding for `TaskInput`s, or one keeps the one-hot encoding and uses the `OneHotToNumeric` input transform for the `TaskInput` as here: https://github.com/experimental-design/bofire/blob/main/bofire/surrogates/mixed_single_task_gp.py. The latter would be, from my perspective, the most elegant way, but let me know what you think ;)
In addition, we should add a validator for the `Inputs` object to forbid more than one `TaskInput` (see the sketch below).
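Such a check could look roughly like this (written as a plain helper here; in BoFire it would live in a validator on the `Inputs` object):

```python
def validate_single_task_input(features):
    """Raise if more than one TaskInput is present among the input features."""
    task_inputs = [f for f in features if isinstance(f, TaskInput)]
    if len(task_inputs) > 1:
        raise ValueError(f"Only one TaskInput is allowed, got {len(task_inputs)}.")
    return features
```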
What do you think? I will answer your other points over the weekend (hopefully).
Best,
Johannes
bofire/data_models/priors/api.py (Outdated)

```
@@ -25,3 +29,6 @@
MBO_LENGTHCALE_PRIOR = partial(GammaPrior, concentration=2.0, rate=0.2)
MBO_NOISE_PRIOR = partial(GammaPrior, concentration=2.0, rate=4.0)
MBO_OUTPUTSCALE_PRIOR = partial(GammaPrior, concentration=2.0, rate=4.0)
MBO_LKJ_PRIOR = partial(
```
No need for two types here; if they are the same, the two different types are due to historic reasons. You can just create a single LKJ_PRIOR method.
And perhaps we should make n_tasks a non-default argument there ... (a sketch below)
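Combining both comments, something like this (assuming the `LKJPrior` data model from this PR exposes `shape`, `sd_prior` and `n_tasks` fields; just a sketch):

```python
from functools import partial

from bofire.data_models.priors.api import GammaPrior, LKJPrior

# A single factory instead of two identical ones; n_tasks is deliberately
# not bound here, so it must be supplied when the prior is created.
LKJ_PRIOR = partial(
    LKJPrior,
    shape=2.0,
    sd_prior=GammaPrior(concentration=2.0, rate=4.0),
)

# usage: LKJ_PRIOR(n_tasks=3)
```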
tutorials/multi_task_gp_testing.py (Outdated)

```
@@ -0,0 +1,51 @@
import json

from pydantic import parse_obj_as
```
This should all go into the test suite under botorch.tests
Yes, the plan is to remove it completely. It's just there for now since it reproduces the errors I wrote about.
And I really like the
Great, I will think in the meantime about the other open questions.
bofire/data_models/features/tasks.py (Outdated)

```
from bofire.data_models.features.api import DiscreteInput


class TaskInput(DiscreteInput):
```
Short question: why are you basing it on the `DiscreteInput` and not on the `CategoricalInput`? I would prefer the `CategoricalInput`, as it is more expressive, and the task is somehow more a categorical quantity than a discrete one, since the `DiscreteInput` always assumes an ordinal ordering of its values.
I thought about it and here is my logic:
(a) The tasks themselves do not have any ordering.
(b) But the GP expects an ordering of the tasks corresponding to the location of each specific task in the `IndexKernel`, i.e. the inputs to the kernel are `k((x', i), (x, j))`, so the task input values should always correspond to values `i, j` in `[0, ..., n_tasks - 1]`.
This means that all we need to do to initialize the `TaskInput` object is to select the number of tasks. I am happy to change this though, and we can just encode the categorical variables before feeding them to the GP (a sketch of the integer task column below).
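To illustrate (b), a minimal self-contained BoTorch example with an integer task column (the data here is made up):

```python
import torch
from botorch.models import MultiTaskGP

# The last column holds the integer task index, in {0, ..., n_tasks - 1}.
train_X = torch.tensor(
    [
        [0.1, 0.0],  # observed under task 0
        [0.4, 0.0],
        [0.2, 1.0],  # observed under task 1
        [0.8, 1.0],
    ],
    dtype=torch.double,
)
train_Y = torch.tensor([[0.3], [0.7], [0.2], [0.9]], dtype=torch.double)

# task_feature=-1 tells the model which column carries the task index.
model = MultiTaskGP(train_X=train_X, train_Y=train_Y, task_feature=-1)
```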
It is now using `CategoricalInput` as a base.
bofire/data_models/features/tasks.py (Outdated)

```
    n_tasks: int
    fidelities: List[int]

    @validator("fidelities")
```
This is pydantic 1 style; we are using pydantic 2, so you should use `field_validator` (a minimal example below).
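For reference, a minimal before/after of the rename (the field name mirrors the diff above; the check itself is a placeholder):

```python
from typing import List

from pydantic import BaseModel, field_validator


class Example(BaseModel):
    fidelities: List[int]

    # pydantic 1 would use: @validator("fidelities")
    @field_validator("fidelities")
    def check_nonnegative(cls, v):
        if any(f < 0 for f in v):
            raise ValueError("fidelities must be non-negative")
        return v
```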
bofire/data_models/features/tasks.py (Outdated)

```
class TaskInput(DiscreteInput):
    type: Literal["TaskInput"] = "TaskInput"
    n_tasks: int
```
`n_tasks` is not necessary if you base it on the `CategoricalInput`, as it would just be the number of categories.
I've now got an implementation of
merging changes to my develop branch
Hi @jpfolch, as already discussed, you can find a proper implementation for the

Regarding your point with the one-hot input transform: I did not know this, and this is a pity. Including it into botorch will need some time, but your solution will not work with an optimizer, as the optimizer will just call

This is implemented here:
The actual encoding taken into account by the model is based on the attribute
For the MultiTaskGP we then have to overwrite it in a way that we assign

So, I would recommend merging main in and changing the model to ORDINAL encoding for the task feature (a sketch of the spec below). Note that in standard models, it should still use ONE_HOT. You will also need to adjust the input scaler to ignore the task feature, so that it stays an integer.

Best, Johannes
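The recommended spec would then look roughly like this (assuming the task feature is keyed "task_id", as in the tests below):

```python
from bofire.data_models.enum import CategoricalEncodingEnum

# MultiTaskGP surrogate: integer-encode the task feature ...
input_preprocessing_specs = {"task_id": CategoricalEncodingEnum.ORDINAL}
# ... while standard models keep using CategoricalEncodingEnum.ONE_HOT.
```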
Perfect, thanks! I will get started on fixing the encoding for TaskInputs. Regarding not being able to pickle Index kernels, @TobyBoyne found the bug within GPyTorch and submitted a PR which has been approved, so it should be fixed eventually, but we will need a temporary fix for the next few months at least.
Hey, while you're waiting for the patch to hit GPyTorch, you can fix this in BoFire by manually registering the prior. I've demonstrated this in TobyBoyne@f9df6b0; just copy this change over to your code and you will be able to pickle the kernel :)
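The idea, roughly (see TobyBoyne@f9df6b0 for the actual change; this is only a sketch): build the `IndexKernel` without a prior and register the task-covariance prior through a named, module-level function instead of the lambda used inside `IndexKernel.__init__`:

```python
import pickle

import torch
from gpytorch.kernels import IndexKernel
from gpytorch.priors import GammaPrior, LKJCovariancePrior


def _eval_covar_matrix(kernel):
    # Named (and therefore picklable) replacement for the lambda closure.
    return kernel._eval_covar_matrix()


task_prior = LKJCovariancePrior(n=3, eta=torch.tensor(2.0), sd_prior=GammaPrior(2.0, 4.0))
kernel = IndexKernel(num_tasks=3, rank=1)  # no prior passed here
kernel.register_prior("IndexKernelPrior", task_prior, _eval_covar_matrix)

pickle.dumps(kernel)  # no longer hits the unpicklable lambda
```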
Hey @jpfolch, I was curious to know what the botorch guys think about changing the order of operations regarding the input transforms in the

Let's see ;) Best, Johannes
Latest commit added the missing validations for the model and added testing. Overall, the multi-task model seems to work now. The only problem I am running into is that the test
I have not yet looked at your additions, but do you think this is a bofire or a botorch problem? I will have a look at your changes tomorrow.
Hi @jpfolch, this looks good overall. I will do a more thorough review as soon as you request it and you know more about the issue with the fitting. To test serialization and deserialization for the added data models (the new GP and the new prior), please add example configs in

Best, Johannes
Sorry, I haven't been able to work on this due to other commitments, but I should have more time now. Regarding the

The best solution I have come up with is to drop the LKJ prior altogether until the issue is fixed, and use MultiTaskGPs without a prior on the task covariances.
Latest commit adds the example configs to
Thx, will have a look over the course of the week.
Hi,
thank you very much, this looks really good overall. Only some minor things regarding tests and validators. If you have problems with the validators, just tell me and I will try to resolve them. I know they can be tricky ...
As soon as this has landed, we have to implement it in the optimizer ;)
Best,
Johannes
tests/bofire/surrogates/test_gps.py (Outdated)

```
    input_preprocessing_specs={"task_id": CategoricalEncodingEnum.ONE_HOT},
)

# test that if there is no task input, there is an error
```
this can also go into the add_invalid
I've made all the requested changes. All tests pass locally :).
Hi @jpfolch, thank you very much, can you also resolve the merge conflicts by merging the main branch into this one? Best, Johannes
merge main branch of bofire with my changes
All done, conflicts should be resolved now. I tested, and all tests passed except the EntingStrategy spec, but I don't have entmoot installed, which explains it.
Tests are running through ;), can you fix the linting errors too?
I think the linting errors should be fixed now.
Hi @jpfolch,
thank you very much! And sorry for the long process; as I am currently on parental leave, I am not looking at GitHub every day ;)
Would the next step be the multifidelity one, or the incorporation into the strategy?
Best,
Johannes
Hi @jduerholt, Thank you very much! I've also been slow, as I have been away and juggling other things in the meantime. Happy to see it has been merged. The MultiTaskGP can be used directly to model the fidelities, so I think the next step would be incorporation into the strategy? That is, assuming the strategies are the part of the code where the BO chooses which experiment to do next? Best,
Hi @jpfolch, now I am back from parental leave and have more time for BoFire again. Yes, incorporation into the strategies would be the next step. The adaptations have to be made here: https://github.com/experimental-design/bofire/blob/main/bofire/strategies/predictives/botorch.py In particular, we have to adjust how fixed features are handled, as we currently assume that categoricals are always one-hot encoded. Honestly, this whole part of the code needs to be cleaned up. Are you interested in doing it, or should I make a first draft? Best, Johannes
It would be especially important to know how the optimization over the
Hi @jduerholt, Sorry for the delay in replying, I have been busy finishing up work on my PhD. I am currently writing up, but I should have some time to work on BoFire. So, based on my previous work, the way I recommend going about optimizing over

Approach (b) would be equivalent to first optimizing any acquisition function at the target fidelity, and then optimizing the task input only with the MF-MES acquisition function (a rough sketch below). The benefit of this is that it gives you more flexibility in the choice of acquisition function, and it is computationally cheaper than optimizing MF-MES directly (which is another possible route). Let me know which of the two options you would prefer, and I can get started on it. Best,
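A rough illustration of approach (b); none of these names are BoFire API, they just show the two-step structure:

```python
def propose_next(optimize_acqf_at_task, mf_mes_value, n_tasks, target_task=0):
    """Two-step proposal for multi-fidelity BO.

    optimize_acqf_at_task: callable returning the best x with the task input
        fixed to the given task (step 1, any acquisition function).
    mf_mes_value: callable scoring (x, task) under the MF-MES acquisition
        function (step 2, task input only).
    """
    # Step 1: optimize the acquisition function at the target fidelity.
    x_star = optimize_acqf_at_task(target_task)
    # Step 2: with x_star fixed, pick the fidelity/task via MF-MES.
    best_task = max(range(n_tasks), key=lambda t: mf_mes_value(x_star, t))
    return x_star, best_task
```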
Hi @jpfolch, totally forgot this. I will have a detailed look tomorrow. Sorry. Too much to do :( Best, Johannes
Hi @jpfolch, sorry for the delay, the last weeks were crazy. I am not so much into multifidelity optimization, so I have a few questions:

In general, I am always for starting with the easiest option ;) Best, Johannes
Initial attempt to incorporate MultiTask GPs, with the long-term plan of incorporating the multi-fidelity algorithm as described here.
Current issues:
1. When predicting the posterior, BoFire by default returns the posterior mean and a covariance which includes the observation noise (see this line where `observation_noise = True`). However, BoTorch's `MultiTaskGP` does not support adding observation noise yet and returns `NotImplementedError: Specifying observation noise is not yet supported by MultiTaskGP.` (a) A work-around would be to redefine the `MultiTaskGPSurrogate` to have it set to `False`, however this would lead to inconsistencies in model usage within BoFire. (b) Maybe a new flag can be incorporated into BoFire to include the option of predicting observation noise. (c) For MultiTaskGPs I could implement the new posterior myself, by adding the observation noise to the posterior covariances manually, and then simplify the code once BoTorch starts supporting observation noise (see the sketch after this list).
2. I am unable to use the dump functionality: when saving the BoTorch model using `torch.save(self.model, buffer)` I get the error `AttributeError: Can't pickle local object 'IndexKernel.__init__.<locals>.<lambda>'`. Still not sure if this is an issue with my local code, with GPyTorch's `IndexKernel`, or something else entirely.
3. We need some way of specifying which column has the task_id for each observation. Currently this is done by calling the variable 'fid' (for fidelity, but easily changed). We need to decide whether we specify which column has the task_ids by a specific name or with some other method.
4. The LKJ prior (a matrix-valued prior on the inter-task correlations) depends on the number of tasks, which is unknown until the model is initialized. Currently I am setting a dummy `n_tasks = 1` in the prior and then changing the attribute value when initializing the surrogate; it works, but perhaps there is a better way of going about it.
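For (c), roughly something like this (`model` and `X` being the surrogate's fitted BoTorch model and a query tensor; just a sketch):

```python
posterior = model.posterior(X)  # observation_noise=True would raise here
mean = posterior.mean
# Add the likelihood's homoskedastic noise estimate to the predictive
# variance manually:
var_with_noise = posterior.variance + model.likelihood.noise
```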
Future work:

- Incorporate a likelihood with heteroskedastic noise that allows for different noise levels in each task.
- Once MultiTaskGPs are working, implement a multi-fidelity strategy for querying.