Boundary Over-Exploration Hinders Performance for Minimizing a Very Noisy Function #2184
Replies: 2 comments
-
Very interesting problem and great writeup. What is the signal-to-noise ratio in your example? What model are you using? Are you passing in the observation noise level or are you letting the model infer that? Would you be able to share your simulation code so we can better understand what's going on?
-
The simulation has a couple of steps to it, so I think it's best to describe it in some more detail (I'll also provide the code). The foundation of the simulation is what's called a reinforcement learning drift diffusion model (RLDDM). The basic idea is that the model has to choose between two options over the course of a task and learns which option is more valuable from reward history. Options are chosen based on the value difference, with choices being more consistent and rapid for larger value differences.

I fit this model to a previously collected behavioral dataset and have posterior distributions for the model parameters. From this prior fit, I know that two parameters (boundary separation and drift rate) are affected by my intervention and lead to a reduction in reaction time when modulated. This is the effect I'm aiming to leverage in my simulation. The simulated subjects are then used to generate task behavior in response to the suggestions from the Bayesian optimization.

Due to the structure of this model, the noise does vary somewhat from trial to trial (high value-difference decisions are faster and less variable than low value-difference ones). I was planning to account for this effect in later iterations of this investigation, but I haven't gotten there yet because of the current issue. In the first example, the signal-to-noise ratio (change in reaction time divided by the standard deviation of reaction time) ranges from roughly 0.3-0.5, and for the second it ranges from roughly 0.2-0.3.

For the Gaussian process, I was using a SingleTaskGP and let the model infer the observation noise (I may eventually feed it in when I try approximating the trial-to-trial correlation). In case it's relevant, I did log-transform the reaction times to make their distribution closer to normal (rather than the standard gamma-like shape they normally have).

For the code I've attached (run the master2 script), the working optimization was from simulation 0 and the failed one was from simulation 33. The ground-truth plots can be produced using the plot_ground_truth function, and the sample and model overlays can be made using the commented code in the simulation loop. Let me know if something is missing or unclear; I appreciate any insight!
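For concreteness, here is a minimal sketch of the GP setup described above, with placeholder data standing in for the RLDDM output (the negation is my addition, since BoTorch acquisition functions assume maximization):

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from gpytorch.mlls import ExactMarginalLogLikelihood

# Placeholder data: train_X holds the intervention parameters (scaled to
# the unit cube) and rt holds per-attempt mean reaction times in seconds.
train_X = torch.rand(20, 2, dtype=torch.double)
rt = 0.5 + 0.3 * torch.rand(20, 1, dtype=torch.double)

# Log-transform the reaction times so the observations are closer to
# normal, then negate so that minimizing RT becomes a maximization problem.
train_Y = -torch.log(rt)

# No train_Yvar is passed, so the model infers a homoskedastic
# observation-noise level from the data.
model = SingleTaskGP(train_X, train_Y)
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll)
```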
-
I've been using Bayesian optimization with BoTorch to identify input parameters that minimize reaction times in a decision-making experiment. Reaction times are very noisy, so numerous trials are required to measure the value of the function accurately. I believe this high degree of noise (relative to the effect of my parameters) is interacting poorly with virtually all the acquisition functions I've tried, leading to major over-exploration of the boundaries of the parameter space and poor performance.
To develop a protocol for my actual experiments, I have been running Bayesian optimization on simulated reaction time data with stereotypical response surfaces for the parameters. Two example surfaces are shown below: one more standard, the other with a coincidental minimum on the boundary.
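As a purely illustrative stand-in for such a surface (not the RLDDM itself), here is a hypothetical noisy reaction-time function with a single interior minimum, tuned so the effect size over the unit square is roughly half the noise standard deviation, matching the signal-to-noise range reported above:

```python
import torch

def simulated_rt(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical mean-RT surface on [0, 1]^2 with multiplicative noise.

    The ~0.03 s effect size against a noise SD of ~0.055 s gives a
    signal-to-noise ratio of roughly 0.5.
    """
    # Bowl-shaped mean RT (seconds) with its minimum at (0.6, 0.4); the
    # divisor normalizes the largest squared distance to 1.
    bowl = ((x[..., 0] - 0.6) ** 2 + (x[..., 1] - 0.4) ** 2) / 0.72
    mean_rt = 0.55 + 0.03 * bowl
    # Multiplicative (log-normal) noise mimics the right-skewed RT shape.
    return mean_rt * torch.exp(0.1 * torch.randn(x.shape[:-1], dtype=x.dtype))
```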
To mimic how the experiment would run in practice, I've divided the simulation into 8 sequential blocks corresponding to 8 attempts at the task, with each attempt consisting of 150 trials. In the first attempt, I perform a variant of space filling to uniformly cover the parameter space with samples. In the remaining 7 attempts, I select points by optimizing whichever acquisition function I'm testing. Hyperparameters for the underlying Gaussian process are only optimized between attempts, to account for the time constraints of the actual experiment (a 3-7 s window to select the next parameters). I've also fixed the length scale of the kernel, based on pilot data and the observation that the model tended to overfit when there were outliers.
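A rough sketch of this block structure, reusing the simulated_rt placeholder above (the Sobol initialization, Matern kernel, and 0.3 lengthscale are illustrative choices, not necessarily what my code does):

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.utils.sampling import draw_sobol_samples
from gpytorch.kernels import MaternKernel, ScaleKernel
from gpytorch.mlls import ExactMarginalLogLikelihood

bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double)

# Attempt 1: space-filling design (Sobol here as a stand-in for my variant),
# 150 trials, with observations on the negated log-RT scale as before.
train_X = draw_sobol_samples(bounds=bounds, n=150, q=1).squeeze(1)
train_Y = -torch.log(simulated_rt(train_X)).unsqueeze(-1)

def make_model(train_X: torch.Tensor, train_Y: torch.Tensor) -> SingleTaskGP:
    # Fix the kernel lengthscale (a placeholder value standing in for the
    # pilot-data estimate) so it is excluded from hyperparameter fitting.
    kernel = ScaleKernel(MaternKernel(nu=2.5, ard_num_dims=train_X.shape[-1]))
    kernel.base_kernel.lengthscale = 0.3
    kernel.base_kernel.raw_lengthscale.requires_grad_(False)
    return SingleTaskGP(train_X, train_Y, covar_module=kernel)

# Between attempts: refit the remaining hyperparameters (outputscale and
# inferred observation noise) outside the 3-7 s selection window.
model = make_model(train_X, train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))
```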
Below I've included a couple of example fits using the GIBBON acquisition function (one where it worked very well and another where it got stuck at the boundaries):
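For reference, a minimal sketch of a GIBBON setup in BoTorch (qLowerBoundMaxValueEntropy), continuing from the model and bounds in the sketch above; the candidate-set size and optimizer settings are illustrative:

```python
import torch
from botorch.acquisition.max_value_entropy_search import qLowerBoundMaxValueEntropy
from botorch.optim import optimize_acqf

# GIBBON approximates the distribution of the optimum's value from a
# discrete candidate set drawn over the search space.
candidate_set = bounds[0] + (bounds[1] - bounds[0]) * torch.rand(
    1000, bounds.shape[-1], dtype=torch.double
)
acqf = qLowerBoundMaxValueEntropy(model=model, candidate_set=candidate_set)

# Select the next parameters for the upcoming attempt.
next_x, _ = optimize_acqf(
    acq_function=acqf,
    bounds=bounds,
    q=1,
    num_restarts=10,
    raw_samples=512,
)
```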
It's pretty clear from the second figure that the acquisition function is exploring very poorly in some conditions.
Things I've tried already:
Any suggestions would be greatly appreciated and I can provide more information if necessary!