Update composite BO tutorial with HOGP notebook reference #2738

Closed
6 changes: 3 additions & 3 deletions tutorials/composite_mtbo/composite_mtbo.ipynb
@@ -14,7 +14,7 @@
"\n",
"**Multi-task Bayesian Optimization** was first proposed by [Swersky et al, NeurIPS, '13](https://papers.neurips.cc/paper/2013/hash/f33ba15effa5c10e873bf3842afb46a6-Abstract.html) in the context of fast hyper-parameter tuning for neural network models; however, we demonstrate a more advanced use-case of **[composite Bayesian optimization](https://proceedings.mlr.press/v97/astudillo19a.html)** where the overall function that we wish to optimize is a cheap-to-evaluate (and known) function of the outputs. In general, we expect that using more information about the function should yield improved performance when attempting to optimize it, particularly if the metric function itself is quickly varying.\n",
"\n",
"See [the composite BO tutorial w/ HOGP](https://github.com/pytorch/botorch/blob/main/tutorials/composite_bo_with_hogp.ipynb) for a more technical introduction. In general, we suggest using MTGPs for unstructured task outputs and the HOGP for matrix / tensor structured outputs.\n",
"See [the composite BO tutorial w/ HOGP](https://github.com/pytorch/botorch/blob/main/tutorials/composite_bo_with_hogp/composite_bo_with_hogp.ipynb) for a more technical introduction. In general, we suggest using MTGPs for unstructured task outputs and the HOGP for matrix / tensor structured outputs.\n",
"\n",
"\n",
"We will use a Multi-Task Gaussian process ([MTGP](https://papers.nips.cc/paper/2007/hash/66368270ffd51418ec58bd793f2d9b1b-Abstract.html)) with an ICM kernel to model all of the outputs in this problem. MTGPs can be easily accessed in Botorch via the `botorch.models.KroneckerMultiTaskGP` model class (for the \"block design\" case of fully observed outputs at all inputs). Given $T$ tasks (outputs) and $n$ data points, they assume that the responses, $Y \\sim \\mathbb{R}^{n \\times T},$ are distributed as $\\text{vec}(Y) \\sim \\mathcal{N}(f, D)$ and $f \\sim \\mathcal{GP}(\\mu_{\\theta}, K_{XX} \\otimes K_{T}),$ where $D$ is a (diagonal) noise term."
@@ -162,7 +162,7 @@
"$$g(f) = \\sum_{i=1}^T \\cos(f_i^2 + f_i w_i)$$\n",
"where $w$ is a weight vector (drawn randomly once at the start of the optimization). As this function is a non-linear function of the outputs $f,$ we cannot compute acquisition functions via computation of the posterior mean and variance, but rather have to compute posterior samples and evaluate acquisitions with Monte Carlo sampling. \n",
"\n",
"For greater than $10$ or so tasks, it is computationally challenging to sample the posterior over all tasks jointly using conventional approaches, except that [Maddox et al, '21](https://arxiv.org/abs/2106.12997) have devised an efficient method for exploiting the structure in the posterior distribution of the MTGP, enabling efficient MC based optimization of objectives using MTGPs. In this tutorial, we choose 6 contexts/tasks for demostration. "
"For greater than $10$ or so tasks, it is computationally challenging to sample the posterior over all tasks jointly using conventional approaches, except that [Maddox et al, '21](https://arxiv.org/abs/2106.12997) have devised an efficient method for exploiting the structure in the posterior distribution of the MTGP, enabling efficient MC based optimization of objectives using MTGPs. In this tutorial, we choose 6 contexts/tasks for demonstration. "
]
},
{
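Since $g$ is applied to posterior samples, it can be wrapped as an MC objective. A sketch using `GenericMCObjective` (the variable names and $T = 6$ below are illustrative assumptions):

```python
import torch
from botorch.acquisition.objective import GenericMCObjective

T = 6  # number of contexts/tasks
w = torch.randn(T, dtype=torch.double)  # weight vector, drawn once up front

def composite_obj(samples: torch.Tensor, X=None) -> torch.Tensor:
    # samples: sample_shape x batch_shape x q x T; sum the per-task terms.
    return torch.cos(samples**2 + samples * w).sum(dim=-1)

objective = GenericMCObjective(composite_obj)
```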
@@ -348,7 +348,7 @@
" bounds=bounds,\n",
" q=batch_size,\n",
" num_restarts=10,\n",
" raw_samples=512, # used for intialization heuristic\n",
" raw_samples=512, # used for initialization heuristic\n",
" options={\"batch_limit\": 5, \"maxiter\": MAXITER, \"init_batch_limit\": 5},\n",
" )\n",
" mtgp_train_x = torch.cat((mtgp_train_x, new_mtgp_x), dim=0)\n",
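For context, the acquisition construction feeding into this `optimize_acqf` call might look like the following end-to-end sketch; the choice of `qExpectedImprovement`, the sampler size, and the bounds are assumptions for illustration, not taken verbatim from the notebook:

```python
import torch
from botorch.acquisition.monte_carlo import qExpectedImprovement
from botorch.optim import optimize_acqf
from botorch.sampling.normal import IIDNormalSampler

# `model`, `objective`, and `train_X` come from the sketches above.
sampler = IIDNormalSampler(sample_shape=torch.Size([128]))

# MC estimate of the best composite objective value observed so far.
obj_samples = objective(model.posterior(train_X).rsample(torch.Size([128])))
best_f = obj_samples.mean(dim=0).max()

acqf = qExpectedImprovement(
    model=model,
    best_f=best_f,
    sampler=sampler,
    objective=objective,  # composite objective evaluated on posterior samples
)

bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double)  # hypothetical 2d box
new_x, acq_val = optimize_acqf(
    acq_function=acqf,
    bounds=bounds,
    q=2,
    num_restarts=10,
    raw_samples=512,  # used for initialization heuristic
    options={"batch_limit": 5, "init_batch_limit": 5},
)
```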