When is KroneckerMultiTaskGP theoretically more useful than SingleTaskGP with multiple outputs? #1003
Recently, I've been performing multi-objective Bayesian optimization with a multi-output SingleTaskGP as the surrogate. I noticed this tutorial and have been benchmarking KroneckerMultiTaskGP against SingleTaskGP. In my experiments (with different acquisition strategies, qParEGO and qEHVI), KroneckerMultiTaskGP has not provided any significant improvement over SingleTaskGP, and it often performs worse.

Reading the paper linked in the tutorial, I can see how multi-task GPs can provide important additional information for optimization. Below is a screenshot of one of their figures:

[Screenshot: figure from the paper]

In (c.), the multi-task GP is able to give better predictions on the blue task in regions where (1.) it is unobserved and (2.) the green and red tasks are observed. I see how modeling the task correlations allows for this effect. But KroneckerMultiTaskGP only covers the case where all tasks are observed at every input, so there will never be a setup with KroneckerMultiTaskGP where there is information about some tasks and not others.

Given this context, in what case does KroneckerMultiTaskGP provide an advantage? I'm having trouble thinking of an example where the correlation matrix would lead to more certain predictions for any individual task when all tasks are observed at the same inputs.

Thanks for any insights -- just looking to better understand the model and its use cases. Since KroneckerMultiTaskGP is more computationally intensive, I really only want to use it in cases where it's likely to perform best.
Replies: 1 comment 3 replies
Hi, in general we'd actually expect the `KroneckerMultiTaskGP` to outperform the batched single-task GP, since it is a superset of that model class: the batched single-task GP is the special case where the inter-task covariance matrix is just the identity. So we'd expect greater sample efficiency in terms of outcome modelling than for single-task GPs, and hopefully better BO loops as a result.

In terms of advantages, there are a lot of situations where every outcome is observed at once -- most multi-objective problems and most black-box constrained problems fall into this regime. We actually just published a paper on a bunch of use cases for the `KroneckerMultiTaskGP`, see here :).

Edit: If you have some example code where the `KroneckerMultiTaskGP` …
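To make the "inter-task covariance matrix is just the identity" remark concrete, here is a minimal sketch in plain Python. It is not BoTorch code: `rbf_gram`, `kron`, and `block` are hypothetical helpers written for illustration, and the task-major ordering of the joint covariance and the 0.8 task correlation are assumptions. It builds the Kronecker-structured joint covariance over two tasks observed at the same inputs, once with an identity task covariance and once with correlated tasks.

```python
import math


def rbf_gram(xs, lengthscale=1.0):
    """Gram matrix of an RBF kernel over scalar inputs."""
    return [
        [math.exp(-0.5 * ((a - b) / lengthscale) ** 2) for b in xs]
        for a in xs
    ]


def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A
        for row_b in B
    ]


def block(M, i, j, n):
    """Return the (i, j) block of size n x n from matrix M."""
    return [row[j * n:(j + 1) * n] for row in M[i * n:(i + 1) * n]]


# Shared inputs: every task is observed at every x (the Kronecker setting).
xs = [0.0, 0.5, 1.0]
n = len(xs)
K_x = rbf_gram(xs)

# Illustrative task covariances (assumed values, not fitted ones).
I_t = [[1.0, 0.0], [0.0, 1.0]]  # identity: tasks independent a priori
K_t = [[1.0, 0.8], [0.8, 1.0]]  # correlated tasks

# Joint covariance over (task, input) pairs, task-major ordering.
K_indep = kron(I_t, K_x)
K_corr = kron(K_t, K_x)

# Cross-task blocks: zero under the identity, 0.8 * K_x when correlated.
cross_indep = block(K_indep, 0, 1, n)
cross_corr = block(K_corr, 0, 1, n)
print(cross_indep)
print(cross_corr)
```

With the identity, the joint covariance is block-diagonal, so the tasks are independent a priori and the model reduces to independent GPs that share one input kernel. A non-identity task covariance fills in the cross-task blocks, and those are the terms through which observations of one task can inform predictions for another.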