
Adding a tutorial on BO constrained by probability of classification model #2700

Closed
wants to merge 10 commits

Conversation

FrankWanger
Contributor

Motivation

There is currently no tutorial on using the probability predicted by a classification model as a constraint in acquisition functions, yet such an application is of strong interest for BO-guided laboratory experimentation. A prior discussion took place in #725.

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

In this tutorial we show how to handle feasibility constraints that are observed alongside the optimization process (referred to as 'outcome constraints' in the BoTorch documentation, or sometimes as 'black-box constraints'). More specifically, feasibility is modelled by a classification model, and the learned probability is fed to the acquisition function through the constraint argument of SampleReducingMCAcquisitionFunction. This re-weights the acquisition function as $\alpha_{\text{acqf-con}}=\mathbb{P}(\text{Constraint satisfied})\cdot\alpha_{\text{acqf}}$. To fit the API, the probability predicted by the classification model is passed through an inverse sigmoid and negated (since negative values are treated as feasible).

A 2D synthetic problem based on the Townsend function was used. For the classification model, we implemented an approximate GP with a Bernoulli likelihood. qLogExpectedImprovement was selected as the acquisition function.
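As context for the synthetic setup, here is a minimal pure-Python sketch of the standard constrained Townsend problem and the binary feasibility mask used to train the classifier. This is an illustration based on the commonly published form of the Townsend constraint, not the notebook's exact code; the function name is ours.

```python
import math

def townsend_constraint(x1, x2):
    # Standard Townsend feasibility boundary; a positive value means feasible.
    # (Sketch only; the notebook's exact parametrization may differ.)
    t = math.atan2(x1, x2)
    boundary = (
        2 * math.cos(t)
        - 0.5 * math.cos(2 * t)
        - 0.25 * math.cos(3 * t)
        - 0.125 * math.cos(4 * t)
    ) ** 2 + (2 * math.sin(t)) ** 2
    return boundary - (x1 ** 2 + x2 ** 2)

# Binary feasibility label for the classification model: 1 = feasible.
y_con = 1 if townsend_constraint(0.0, 0.0) > 0 else 0
```

The binary label deliberately discards the numerical constraint value, mimicking lab experiments where only success/failure is observed.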

Below are the plots of the problem landscape, acquisition function value, constraint probability, and the EI value (before weighting) at different iterations:

At iter=1: [image]

At iter=10: [image]

At iter=50: [image]

The log regret over 50 iterations is plotted against random search (Sobol): [image]

All images can be reproduced by the notebook.

Related PRs

not related to any change of functionality

@facebook-github-bot
Contributor

Hi @FrankWanger!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Jan 26, 2025
@facebook-github-bot
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

Contributor

@Balandat left a comment


Thanks a lot for putting this up, this is great.

My main comment (see inline) is on how to leverage the probability of feasibility produced by the classification model directly, rather than converting it twice, but that would require some changes to botorch itself.

Other than that mostly cosmetic comments.

I see that the method finds what appears to be the optimum very quickly - is this consistent across runs? If so it may make sense to reduce the number of iterations somewhat to cut down the runtime of the tutorial.

notebooks_community/clf_constrained_bo.ipynb (outdated comment, resolved)
"$$ \n",
"where $t = \\arctan\\left(\\frac{x_1}{x_2}\\right)$\n",
"\n",
"Here, we follow a natural representation where $y_{\\text{con}}=1$ indicates a feasible condition. We will train a classification model to predict the feasibility of the point. Note that in BoTorch's implementation, **negative values** indicate feasibility, thus we need to do conversion later when feeding feasibility into the pipeline.\n",
Contributor


Suggested change
"Here, we follow a natural representation where $y_{\\text{con}}=1$ indicates a feasible condition. We will train a classification model to predict the feasibility of the point. Note that in BoTorch's implementation, **negative values** indicate feasibility, thus we need to do conversion later when feeding feasibility into the pipeline.\n",
"Here, we follow a natural representation where $y_{\\text{con}}=1$ indicates a feasible condition. We will train a classification model to predict the feasibility of the point. Note that in BoTorch's implementation, **negative values** indicate feasibility, thus we need to do conversion later when feeding feasibility into the pipeline.\n",
"Note that we essentially 'throw away' information contained in the value of $y_{\\text{con}}$ by applying a binary mask - this is for illustration purposes as part of this tutorial; in a real-world application we would model the numerical value of $y_{\\text{con}}$ directly and apply the constraint $y_{\\text{con}}>0$ as part of the optimization.\n",

Contributor


It's a bit confusing here that $y_{\text{con}}$ is being used both in defining the numerical value of the constraint, as well as the binary mask value in the classification model. I suggest using different notation for this to avoid confusing the reader.

Contributor Author


Indeed, I had realised the notation problem. I wanted to add that in many experimental situations the numerical value of the constraint is not directly observable, so the only data we have are binary success/failure outcomes - and yes, here we applied the binary mask to our synthetic problem to throw away information and simulate what we would obtain in the lab.

notebooks_community/clf_constrained_bo.ipynb (four outdated comments, resolved)
Comment on lines +352 to +366
"def pass_con_unsigmoid(Z, model_con, X=None):\n",
"    '''\n",
"    Pass the constraint to the acquisition function.\n",
"\n",
"    Note: BoTorch applies a sigmoid transformation to the constraint by default,\n",
"    so we need to un-sigmoid our probability from (0, 1) to (-inf, inf).\n",
"    We also need to invert the probability, since -inf means the constraint is satisfied. Finally, we add 1e-8 to avoid log(0).\n",
"    '''\n",
"    y_con = Z[..., 1]  # get the constraint\n",
"\n",
"    prob = model_con.likelihood(y_con).probs  # probability that the constraint is satisfied\n",
"    prob_unsigmoid_neg = torch.log(1 - prob + 1e-8) - torch.log(prob + 1e-8)  # un-sigmoid the probability and invert it to match BoTorch's constraint API\n",
"\n",
"    return prob_unsigmoid_neg\n"
]
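As a sanity check on the conversion in this cell, here is a plain-Python sketch (not part of the notebook). Since $z=\log(1-p)-\log(p)=-\operatorname{logit}(p)$, applying a standard sigmoid to $-z$ recovers the original probability; note this ignores the small eps term and, per the review discussion below, BoTorch's internal smoothing is not exactly a standard sigmoid.

```python
import math

def unsigmoid_invert(prob, eps=1e-8):
    # Mirror of pass_con_unsigmoid: map a probability of feasibility
    # to BoTorch's convention, where negative values indicate feasibility.
    return math.log(1 - prob + eps) - math.log(prob + eps)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

p = 0.9
z = unsigmoid_invert(p)   # negative, since p > 0.5 (likely feasible)
recovered = sigmoid(-z)   # a standard sigmoid of -z recovers ~p
```

The roundtrip shows why the weighting $\mathbb{P}(\text{feasible})\cdot\alpha_{\text{acqf}}$ is (approximately) preserved under BoTorch's default sigmoid handling.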
Contributor


If the classification model already produces the probabilities of feasibility, it would be great if we could directly use that in the acquisition function, rather than converting it back first. @SebastianAment do you see any major challenges to just accept an additional "probability_of_feasibility" argument to SampleReducingMCAcquisitionFunction (and possibly in other places) and then just use that in the probability weighting?

Even if there are no issues, getting such a change into botorch would require some eng work, so I wouldn't want to block this PR on that. That said, the probability of feasibility conversion used internally is not a standard sigmoid - see https://github.com/pytorch/botorch/blob/main/botorch/utils/objective.py#L178 - so ideally, for the time being (until we can accept the probability directly), we would apply the actual inverse of what is being applied in botorch.

Contributor


do you see any major challenges to just accept an additional "probability_of_feasibility" argument to SampleReducingMCAcquisitionFunction (and possibly in other places) and then just use that in the probability weighting?

That should be pretty straightforward, mainly taking care of appropriate reshaping, since we are usually applying the feasibility weighting on a per-sample basis, and probability_of_feasibility won't share the MC dimension.

Regarding the inversion of the sigmoid, we are currently using a sigmoid with inverse quadratic asymptotic behavior, which could likely be inverted analytically as well, but that will not be necessary once we support this in the acquisition function directly.
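The reshaping point can be illustrated with a tiny pure-Python sketch (shapes and names are illustrative, not BoTorch's actual implementation): per-sample acquisition values carry an MC sample dimension, while a probability-of-feasibility tensor would not, so the probability must be broadcast across that dimension.

```python
# Illustrative shapes: acquisition samples (mc_samples, batch, q);
# probability of feasibility (batch, q), i.e. no MC dimension.
mc_samples, batch, q = 4, 2, 3

acqf_samples = [[[1.0] * q for _ in range(batch)] for _ in range(mc_samples)]
prob_feas = [[0.25] * q for _ in range(batch)]

# Reuse (broadcast) the same probability for every MC sample:
weighted = [
    [[acqf_samples[s][b][i] * prob_feas[b][i] for i in range(q)]
     for b in range(batch)]
    for s in range(mc_samples)
]
```

With tensors this would be a single broadcasted multiply; the sketch just makes explicit that the probability is shared across the sample dimension.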

notebooks_community/clf_constrained_bo.ipynb (three outdated comments, resolved)
@FrankWanger
Contributor Author

> Thanks a lot for putting this up, this is great.
>
> My main comment (see inline) is on how to leverage the probability of feasibility produced by the classification model directly, rather than converting it twice, but that would require some changes to botorch itself.
>
> Other than that mostly cosmetic comments.
>
> I see that the method finds what appears to be the optimum very quickly - is this consistent across runs? If so it may make sense to reduce the number of iterations somewhat to cut down the runtime of the tutorial.

Thank you so much! I've addressed most of the formatting issues; the only one I'm not sure how to remove is the KeOps warnings - I've switched to macOS and it did not help. As for the results, yes, they are quite consistent, so I have halved the iterations to 25 and slightly increased the frequency of plots.

@Balandat
Contributor

Great. I may just manually strip the output from the notebook source to keep it clean.

I'll get this merged in since it's in great shape already, but still curious to hear @SebastianAment's thoughts on supporting this better in the acquisition functions themselves (which would be a separate PR anyway).

@facebook-github-bot
Contributor

@Balandat has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


codecov bot commented Jan 28, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.98%. Comparing base (2144440) to head (e13322d).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2700   +/-   ##
=======================================
  Coverage   99.98%   99.98%           
=======================================
  Files         202      202           
  Lines       18588    18588           
=======================================
  Hits        18586    18586           
  Misses          2        2           

☔ View full report in Codecov by Sentry.

@facebook-github-bot
Contributor

@Balandat merged this pull request in aeda83a.

Labels
CLA Signed Do not delete this pull request or issue due to inactivity. Merged
4 participants