diff --git a/404.html b/404.html index ece274aee..bb04d9343 100644 --- a/404.html +++ b/404.html @@ -264,7 +264,7 @@ - Introduction + Home diff --git a/basic_examples/index.html b/basic_examples/index.html index 4cfe46f98..067a86b41 100644 --- a/basic_examples/index.html +++ b/basic_examples/index.html @@ -271,7 +271,7 @@ - Introduction + Home diff --git a/data_models_functionals/index.html b/data_models_functionals/index.html index 2e363ae3b..939324a32 100644 --- a/data_models_functionals/index.html +++ b/data_models_functionals/index.html @@ -275,7 +275,7 @@ - Introduction + Home diff --git a/design_with_explicit_formula/index.html b/design_with_explicit_formula/index.html index d4d9dbd96..3d7a7605c 100644 --- a/design_with_explicit_formula/index.html +++ b/design_with_explicit_formula/index.html @@ -271,7 +271,7 @@ - Introduction + Home diff --git a/examples/index.html b/examples/index.html index 97a32ce8a..da3ca9681 100644 --- a/examples/index.html +++ b/examples/index.html @@ -275,7 +275,7 @@ - Introduction + Home diff --git a/fractional_factorial/index.html b/fractional_factorial/index.html index 5a789bc47..1bc048456 100644 --- a/fractional_factorial/index.html +++ b/fractional_factorial/index.html @@ -271,7 +271,7 @@ - Introduction + Home diff --git a/getting_started/index.html b/getting_started/index.html index 57df27f49..969541440 100644 --- a/getting_started/index.html +++ b/getting_started/index.html @@ -275,7 +275,7 @@ - Introduction + Home diff --git a/index.html b/index.html index 9065e8c7d..887347ef7 100644 --- a/index.html +++ b/index.html @@ -114,7 +114,7 @@
BoFire is a framework to define and solve black-box optimization problems. These problems can arise in a number of closely related fields including experimental design, multi-objective optimization and active learning.
diff --git a/install/index.html b/install/index.html index fec8e87a8..9878314ea 100644 --- a/install/index.html +++ b/install/index.html @@ -275,7 +275,7 @@ - Introduction + Home diff --git a/nchoosek_constraint/index.html b/nchoosek_constraint/index.html index a488d1386..2f9a83319 100644 --- a/nchoosek_constraint/index.html +++ b/nchoosek_constraint/index.html @@ -271,7 +271,7 @@ - Introduction + Home diff --git a/optimality_criteria/index.html b/optimality_criteria/index.html index 77634740b..0a1e7222e 100644 --- a/optimality_criteria/index.html +++ b/optimality_criteria/index.html @@ -271,7 +271,7 @@ - Introduction + Home diff --git a/ref-constraints/index.html b/ref-constraints/index.html index 76415b9fe..58217bcbc 100644 --- a/ref-constraints/index.html +++ b/ref-constraints/index.html @@ -275,7 +275,7 @@ - Introduction + Home diff --git a/ref-domain-util/index.html b/ref-domain-util/index.html index 1c8a14ee7..bbf743bdf 100644 --- a/ref-domain-util/index.html +++ b/ref-domain-util/index.html @@ -275,7 +275,7 @@ - Introduction + Home diff --git a/ref-domain/index.html b/ref-domain/index.html index 8f54a0eb3..478ffe081 100644 --- a/ref-domain/index.html +++ b/ref-domain/index.html @@ -275,7 +275,7 @@ - Introduction + Home diff --git a/ref-features/index.html b/ref-features/index.html index 64e721596..f627bb0e3 100644 --- a/ref-features/index.html +++ b/ref-features/index.html @@ -275,7 +275,7 @@ - Introduction + Home diff --git a/ref-objectives/index.html b/ref-objectives/index.html index 70706edab..d5f55a350 100644 --- a/ref-objectives/index.html +++ b/ref-objectives/index.html @@ -275,7 +275,7 @@ - Introduction + Home diff --git a/ref-utils/index.html b/ref-utils/index.html index 412c807b4..dfd761a8b 100644 --- a/ref-utils/index.html +++ b/ref-utils/index.html @@ -273,7 +273,7 @@ - Introduction + Home diff --git a/search/search_index.json b/search/search_index.json index 30d4d9664..44793bc5e 100644 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Introduction","text":"BoFire is a framework to define and solve black-box optimization problems. These problems can arise in a number of closely related fields including experimental design, multi-objective optimization and active learning.
BoFire problem specifications are JSON serializable for use in RESTful APIs and are, to a large extent, agnostic to the specific methods and frameworks in which the problems are solved.
You can find code examples in the Getting Started section of this document, as well as fully worked-out usage examples in the /tutorials section of this repository!
"},{"location":"#experimental-design","title":"Experimental design","text":"In the context of experimental design BoFire allows to define a design space
\\[ \\mathbb{X} = x_1 \\otimes x_2 \\otimes \\ldots \\otimes x_D \\]where the design parameters may take values depending on their type and domain, e.g.
and a set of equations defines additional experimental constraints, e.g.
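A minimal sketch of how such a design space with typed parameters and an additional constraint can be defined in BoFire (using only classes that appear later in this documentation; the feature keys, bounds, and values are illustrative):

```python
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import (
    CategoricalInput,
    ContinuousInput,
    DiscreteInput,
)
from bofire.data_models.constraints.api import LinearEqualityConstraint

# design parameters of different types and domains
x1 = ContinuousInput(key="x1", bounds=(0, 1))                 # continuous, bounded
x2 = ContinuousInput(key="x2", bounds=(0, 1))
x3 = DiscreteInput(key="x3", values=[1, 2, 5, 7.5])           # discrete set of values
x4 = CategoricalInput(key="x4", categories=["A", "B", "C"])   # categorical

# an additional experimental constraint, e.g. a mixture: x1 + x2 = 1
mixture = LinearEqualityConstraint(features=["x1", "x2"], coefficients=[1, 1], rhs=1)

domain = Domain(inputs=[x1, x2, x3, x4], constraints=[mixture])
```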
In the context of multi-objective optimization, BoFire allows you to define a vector-valued optimization problem
\\[ \\min_{x \\in \\mathbb{X}} s(y(x)) \\]where
Since the objectives are in general conflicting, there is no point \(x\) that simultaneously optimizes all objectives. Instead, the goal is to find the Pareto front of all optimal compromises.
A decision maker can then explore these compromises to gain a deep understanding of the problem and make the best-informed decision.
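As a sketch, such conflicting objectives are attached to the output features; the objective classes shown here also appear in the Getting Started section below:

```python
from bofire.data_models.features.api import ContinuousOutput
from bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective

# y1 is to be maximized, y2 to be minimized; the weights w enter the scalarization s
y1 = ContinuousOutput(key="y1", objective=MaximizeObjective(w=1.0))
y2 = ContinuousOutput(key="y2", objective=MinimizeObjective(w=1.0))
```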
"},{"location":"#bayesian-optimization","title":"Bayesian optimization","text":"In the context of Bayesian optimization we want to simultaneously learn the unknown function \\(y(x)\\) (exploration), while focusing the experimental effort on promising regions (exploitation). This is done by using the experimental data to fit a probabilistic model \\(p(y|x, {data})\\) that estimates the distribution of possible outcomes for \\(y\\). An acquisition function \\(a\\) then formulates the desired trade-off between exploration and exploitation
\\[ \\min_{x \\in \\mathbb{X}} a(s(p_y(x))) \\]and the minimizer \\(x_\\mathrm{opt}\\) of this acquisition function determines the next experiment \\(y(x)\\) to run.
When there are multiple competing objectives, the task is again to find a suitable approximation of the Pareto front.
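A minimal sketch of this tell/ask loop with BoFire, assuming a domain and a DataFrame experiments of already evaluated experiments exist (the SOBO example in the Getting Started section shows the full setup):

```python
import bofire.strategies.api as strategies
from bofire.data_models.strategies.api import SoboStrategy
from bofire.data_models.acquisition_functions.api import qNEI

strategy = strategies.map(SoboStrategy(domain=domain, acquisition_function=qNEI()))
strategy.tell(experiments=experiments)        # fit the probabilistic model p(y|x, data)
candidates = strategy.ask(candidate_count=1)  # minimize the acquisition function
```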
"},{"location":"#design-of-experiments","title":"Design of Experiments","text":"BoFire can be used to generate optimal experimental designs with respect to various optimality criteria like D-optimality, A-optimality or uniform space filling.
For this, the user specifies a design space and a model formula, then chooses an optimality criterion and the desired number of experiments in the design. The resulting optimization problem is then solved by IPOPT.
The doe subpackage also supports a wide range of constraints on the design space, including linear and nonlinear equalities and inequalities as well as (limited) use of NChooseK constraints. The user can provide fixed experiments that will be treated as part of the design but remain fixed during the optimization process. While some of the optimization algorithms support non-continuous design variables, the doe subpackage supports only continuous ones.
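A minimal call looks like this (a sketch assuming a domain with continuous inputs has already been defined; the Basic Examples section below shows complete setups):

```python
from bofire.strategies.doe.design import find_local_max_ipopt

# D-optimal design with 12 experiments for a linear model
design = find_local_max_ipopt(domain, "linear", n_experiments=12)
```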
By default, IPOPT uses the freely available linear solver MUMPS. For large models, choosing a different linear solver (e.g. ma57 from Coin-HSL) can vastly reduce optimization time. A free academic license for Coin-HSL can be obtained here. Instructions on how to install additional linear solvers for IPOPT are given in the IPOPT documentation. To choose a specific (HSL) linear solver in BoFire, simply pass the name of the solver to find_local_max_ipopt()
with the linear_solver
option together with the library's name in the option hsllib
, e.g.
find_local_max_ipopt(domain, \"fully-quadratic\", ipopt_options={\"linear_solver\":\"ma57\", \"hsllib\":\"libcoinhsl.so\"})\n
"},{"location":"basic_examples/","title":"Basic Examples for the DoE Subpackage","text":"In\u00a0[11]: Copied! import numpy as np\nimport matplotlib.pyplot as plt\nfrom matplotlib.ticker import FormatStrFormatter\n\nfrom bofire.data_models.constraints.api import (\n NonlinearEqualityConstraint,\n NonlinearInequalityConstraint,\n LinearEqualityConstraint,\n LinearInequalityConstraint,\n InterpointEqualityConstraint,\n)\nfrom bofire.data_models.domain.api import Domain\nfrom bofire.data_models.features.api import ContinuousInput, ContinuousOutput\nfrom bofire.strategies.doe.design import find_local_max_ipopt\nimport numpy as np import matplotlib.pyplot as plt from matplotlib.ticker import FormatStrFormatter from bofire.data_models.constraints.api import ( NonlinearEqualityConstraint, NonlinearInequalityConstraint, LinearEqualityConstraint, LinearInequalityConstraint, InterpointEqualityConstraint, ) from bofire.data_models.domain.api import Domain from bofire.data_models.features.api import ContinuousInput, ContinuousOutput from bofire.strategies.doe.design import find_local_max_ipopt In\u00a0[12]: Copied!
domain = Domain(\n inputs = [\n ContinuousInput(key=\"x1\", bounds = (0,1)),\n ContinuousInput(key=\"x2\", bounds = (0.1, 1)),\n ContinuousInput(key=\"x3\", bounds = (0, 0.6))\n ],\n outputs = [ContinuousOutput(key=\"y\")],\n constraints = [\n LinearEqualityConstraint(features=[\"x1\",\"x2\",\"x3\"], coefficients=[1,1,1], rhs=1),\n LinearInequalityConstraint(features=[\"x1\",\"x2\"], coefficients=[5,4], rhs=3.9),\n LinearInequalityConstraint(features=[\"x1\",\"x2\"], coefficients=[-20,5], rhs=-3)\n ]\n)\n\nd_optimal_design = find_local_max_ipopt(domain, \"linear\", n_experiments=12, ipopt_options={\"disp\":0}).to_numpy().T\ndomain = Domain( inputs = [ ContinuousInput(key=\"x1\", bounds = (0,1)), ContinuousInput(key=\"x2\", bounds = (0.1, 1)), ContinuousInput(key=\"x3\", bounds = (0, 0.6)) ], outputs = [ContinuousOutput(key=\"y\")], constraints = [ LinearEqualityConstraint(features=[\"x1\",\"x2\",\"x3\"], coefficients=[1,1,1], rhs=1), LinearInequalityConstraint(features=[\"x1\",\"x2\"], coefficients=[5,4], rhs=3.9), LinearInequalityConstraint(features=[\"x1\",\"x2\"], coefficients=[-20,5], rhs=-3) ] ) d_optimal_design = find_local_max_ipopt(domain, \"linear\", n_experiments=12, ipopt_options={\"disp\":0}).to_numpy().T In\u00a0[13]: Copied!
fig = plt.figure(figsize=((10,10)))\nax = fig.add_subplot(111, projection='3d')\nax.view_init(45, 45)\nax.set_title(\"Linear model\")\nax.set_xlabel(\"$x_1$\")\nax.set_ylabel(\"$x_2$\")\nax.set_zlabel(\"$x_3$\")\nplt.rcParams[\"figure.figsize\"] = (10,8)\n\n#plot feasible polytope\nax.plot(\n xs=[7/10, 3/10, 1/5, 3/10, 7/10],\n ys=[1/10, 3/5, 1/5, 1/10, 1/10],\n zs=[1/5, 1/10, 3/5, 3/5, 1/5],\n linewidth=2\n)\n\n#plot D-optimal solutions\nax.scatter(\n xs=d_optimal_design[0],\n ys=d_optimal_design[1],\n zs=d_optimal_design[2],\n marker=\"o\",\n s=40,\n color=\"orange\",\n label=\"optimal_design solution, 12 points\"\n)\n\nplt.legend()\nfig = plt.figure(figsize=((10,10))) ax = fig.add_subplot(111, projection='3d') ax.view_init(45, 45) ax.set_title(\"Linear model\") ax.set_xlabel(\"$x_1$\") ax.set_ylabel(\"$x_2$\") ax.set_zlabel(\"$x_3$\") plt.rcParams[\"figure.figsize\"] = (10,8) #plot feasible polytope ax.plot( xs=[7/10, 3/10, 1/5, 3/10, 7/10], ys=[1/10, 3/5, 1/5, 1/10, 1/10], zs=[1/5, 1/10, 3/5, 3/5, 1/5], linewidth=2 ) #plot D-optimal solutions ax.scatter( xs=d_optimal_design[0], ys=d_optimal_design[1], zs=d_optimal_design[2], marker=\"o\", s=40, color=\"orange\", label=\"optimal_design solution, 12 points\" ) plt.legend() Out[13]:
<matplotlib.legend.Legend at 0x2920b6bd0>In\u00a0[14]: Copied!
d_optimal_design = find_local_max_ipopt(domain, \"x1 + x2 + x3 + {x1**2} + {x2**2} + {x3**2} + {x1**3} + {x2**3} + {x3**3} + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3\", n_experiments=12).to_numpy().T\n\nd_opt = np.array([\n [0.7, 0.3, 0.2, 0.3, 0.5902, 0.4098, 0.2702, 0.2279, 0.4118, 0.5738, 0.4211, 0.3360],\n [0.1, 0.6, 0.2, 0.1, 0.2373, 0.4628, 0.4808, 0.3117, 0.1, 0.1, 0.2911, 0.2264],\n [0.2, 0.1, 0.6, 0.6, 0.1725, 0.1274, 0.249, 0.4604, 0.4882, 0.3262, 0.2878, 0.4376],\n]) # values taken from paper\n\n\nfig = plt.figure(figsize=((10,10)))\nax = fig.add_subplot(111, projection='3d')\nax.set_title(\"cubic model\")\nax.view_init(45, 45)\nax.set_xlabel(\"$x_1$\")\nax.set_ylabel(\"$x_2$\")\nax.set_zlabel(\"$x_3$\")\nplt.rcParams[\"figure.figsize\"] = (10,8)\n\n#plot feasible polytope\nax.plot(\n xs=[7/10, 3/10, 1/5, 3/10, 7/10],\n ys=[1/10, 3/5, 1/5, 1/10, 1/10],\n zs=[1/5, 1/10, 3/5, 3/5, 1/5],\n linewidth=2\n)\n\n#plot D-optimal solution\nax.scatter(\n xs=d_opt[0],\n ys=d_opt[1],\n zs=d_opt[2],\n marker=\"o\",\n s=40,\n color=\"darkgreen\",\n label=\"D-optimal design, 12 points\"\n)\n\nax.scatter(\n xs=d_optimal_design[0],\n ys=d_optimal_design[1],\n zs=d_optimal_design[2],\n marker=\"o\",\n s=40,\n color=\"orange\",\n label=\"optimal_design solution, 12 points\"\n)\n\nplt.legend()\nd_optimal_design = find_local_max_ipopt(domain, \"x1 + x2 + x3 + {x1**2} + {x2**2} + {x3**2} + {x1**3} + {x2**3} + {x3**3} + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3\", n_experiments=12).to_numpy().T d_opt = np.array([ [0.7, 0.3, 0.2, 0.3, 0.5902, 0.4098, 0.2702, 0.2279, 0.4118, 0.5738, 0.4211, 0.3360], [0.1, 0.6, 0.2, 0.1, 0.2373, 0.4628, 0.4808, 0.3117, 0.1, 0.1, 0.2911, 0.2264], [0.2, 0.1, 0.6, 0.6, 0.1725, 0.1274, 0.249, 0.4604, 0.4882, 0.3262, 0.2878, 0.4376], ]) # values taken from paper fig = plt.figure(figsize=((10,10))) ax = fig.add_subplot(111, projection='3d') ax.set_title(\"cubic model\") ax.view_init(45, 45) ax.set_xlabel(\"$x_1$\") ax.set_ylabel(\"$x_2$\") ax.set_zlabel(\"$x_3$\") plt.rcParams[\"figure.figsize\"] = (10,8) #plot feasible polytope ax.plot( xs=[7/10, 3/10, 1/5, 3/10, 7/10], ys=[1/10, 3/5, 1/5, 1/10, 1/10], zs=[1/5, 1/10, 3/5, 3/5, 1/5], linewidth=2 ) #plot D-optimal solution ax.scatter( xs=d_opt[0], ys=d_opt[1], zs=d_opt[2], marker=\"o\", s=40, color=\"darkgreen\", label=\"D-optimal design, 12 points\" ) ax.scatter( xs=d_optimal_design[0], ys=d_optimal_design[1], zs=d_optimal_design[2], marker=\"o\", s=40, color=\"orange\", label=\"optimal_design solution, 12 points\" ) plt.legend()
/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:668: UserWarning: The minimum number of experiments is 17, but the current setting is n_experiments=12.\n warnings.warn(\nOut[14]:
<matplotlib.legend.Legend at 0x29200b010>In\u00a0[15]: Copied!
def plot_results_3d(result, surface_func):\n u, v = np.mgrid[0 : 2 * np.pi : 100j, 0 : np.pi : 80j]\n X = np.cos(u) * np.sin(v)\n Y = np.sin(u) * np.sin(v)\n Z = surface_func(X, Y)\n\n fig = plt.figure(figsize=(8, 8))\n ax = fig.add_subplot(111, projection=\"3d\")\n ax.plot_surface(X, Y, Z, alpha=0.3)\n ax.scatter(\n xs=result[\"x1\"],\n ys=result[\"x2\"],\n zs=result[\"x3\"],\n marker=\"o\",\n s=40,\n color=\"red\",\n )\n ax.set(xlabel=\"x1\", ylabel=\"x2\", zlabel=\"x3\")\n ax.xaxis.set_major_formatter(FormatStrFormatter('%.2f'))\n ax.yaxis.set_major_formatter(FormatStrFormatter('%.2f'))\ndef plot_results_3d(result, surface_func): u, v = np.mgrid[0 : 2 * np.pi : 100j, 0 : np.pi : 80j] X = np.cos(u) * np.sin(v) Y = np.sin(u) * np.sin(v) Z = surface_func(X, Y) fig = plt.figure(figsize=(8, 8)) ax = fig.add_subplot(111, projection=\"3d\") ax.plot_surface(X, Y, Z, alpha=0.3) ax.scatter( xs=result[\"x1\"], ys=result[\"x2\"], zs=result[\"x3\"], marker=\"o\", s=40, color=\"red\", ) ax.set(xlabel=\"x1\", ylabel=\"x2\", zlabel=\"x3\") ax.xaxis.set_major_formatter(FormatStrFormatter('%.2f')) ax.yaxis.set_major_formatter(FormatStrFormatter('%.2f')) In\u00a0[16]: Copied!
domain = Domain(\n inputs=[\n ContinuousInput(key=\"x1\", bounds = (-1,1)),\n ContinuousInput(key=\"x2\", bounds = (-1,1)),\n ContinuousInput(key=\"x3\", bounds = (0,1))],\n outputs=[ContinuousOutput(key=\"y\")],\n constraints=[NonlinearInequalityConstraint(expression=\"(x1**2 + x2**2)**0.5 - x3\", features=[\"x1\",\"x2\",\"x3\"])],\n)\n\nresult = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100, \"disp\":0})\nresult.round(3)\nplot_results_3d(result, surface_func=lambda x1, x2: np.sqrt(x1**2 + x2**2))\ndomain = Domain( inputs=[ ContinuousInput(key=\"x1\", bounds = (-1,1)), ContinuousInput(key=\"x2\", bounds = (-1,1)), ContinuousInput(key=\"x3\", bounds = (0,1))], outputs=[ContinuousOutput(key=\"y\")], constraints=[NonlinearInequalityConstraint(expression=\"(x1**2 + x2**2)**0.5 - x3\", features=[\"x1\",\"x2\",\"x3\"])], ) result = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100, \"disp\":0}) result.round(3) plot_results_3d(result, surface_func=lambda x1, x2: np.sqrt(x1**2 + x2**2))
/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:408: UserWarning: Nonlinear constraints were detected. Not all features and checks are supported for this type of constraints. Using them can lead to unexpected behaviour. Please make sure to provide jacobians for nonlinear constraints.\n warnings.warn(\n/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:440: UserWarning: Sampling failed. Falling back to uniform sampling on input domain. Providing a custom sampling strategy compatible with the problem can possibly improve performance.\n warnings.warn(\n
And the same for a design space limited by a paraboloid $x_1^2 + x_2^2 - x_3 \\leq 0$.
In\u00a0[17]: Copied!domain = Domain(\n inputs=[\n ContinuousInput(key=\"x1\", bounds = (-1,1)),\n ContinuousInput(key=\"x2\", bounds = (-1,1)),\n ContinuousInput(key=\"x3\", bounds = (0,1))],\n outputs=[ContinuousOutput(key=\"y\")],\n constraints=[NonlinearInequalityConstraint(expression=\"x1**2 + x2**2 - x3\", features=[\"x1\",\"x2\",\"x3\"])],\n)\n\nresult = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100})\nresult.round(3)\nplot_results_3d(result, surface_func=lambda x1, x2: x1**2 + x2**2)\ndomain = Domain( inputs=[ ContinuousInput(key=\"x1\", bounds = (-1,1)), ContinuousInput(key=\"x2\", bounds = (-1,1)), ContinuousInput(key=\"x3\", bounds = (0,1))], outputs=[ContinuousOutput(key=\"y\")], constraints=[NonlinearInequalityConstraint(expression=\"x1**2 + x2**2 - x3\", features=[\"x1\",\"x2\",\"x3\"])], ) result = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100}) result.round(3) plot_results_3d(result, surface_func=lambda x1, x2: x1**2 + x2**2)
/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:408: UserWarning: Nonlinear constraints were detected. Not all features and checks are supported for this type of constraints. Using them can lead to unexpected behaviour. Please make sure to provide jacobians for nonlinear constraints.\n warnings.warn(\n/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:440: UserWarning: Sampling failed. Falling back to uniform sampling on input domain. Providing a custom sampling strategy compatible with the problem can possibly improve performance.\n warnings.warn(\nIn\u00a0[18]: Copied!
domain = Domain(\n inputs=[\n ContinuousInput(key=\"x1\", bounds = (-1,1)),\n ContinuousInput(key=\"x2\", bounds = (-1,1)),\n ContinuousInput(key=\"x3\", bounds = (0,1))],\n outputs=[ContinuousOutput(key=\"y\")],\n constraints=[NonlinearEqualityConstraint(expression=\"(x1**2 + x2**2)**0.5 - x3\", features=[\"x1\",\"x2\",\"x3\"])],\n)\n\nresult = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100})\nresult.round(3)\nplot_results_3d(result, surface_func=lambda x1, x2: np.sqrt(x1**2 + x2**2))\ndomain = Domain( inputs=[ ContinuousInput(key=\"x1\", bounds = (-1,1)), ContinuousInput(key=\"x2\", bounds = (-1,1)), ContinuousInput(key=\"x3\", bounds = (0,1))], outputs=[ContinuousOutput(key=\"y\")], constraints=[NonlinearEqualityConstraint(expression=\"(x1**2 + x2**2)**0.5 - x3\", features=[\"x1\",\"x2\",\"x3\"])], ) result = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100}) result.round(3) plot_results_3d(result, surface_func=lambda x1, x2: np.sqrt(x1**2 + x2**2))
/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:408: UserWarning: Nonlinear constraints were detected. Not all features and checks are supported for this type of constraints. Using them can lead to unexpected behaviour. Please make sure to provide jacobians for nonlinear constraints.\n warnings.warn(\n/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:440: UserWarning: Sampling failed. Falling back to uniform sampling on input domain. Providing a custom sampling strategy compatible with the problem can possibly improve performance.\n warnings.warn(\nIn\u00a0[19]: Copied!
domain = Domain(\n inputs=[\n ContinuousInput(key=\"x1\", bounds = (0,1)),\n ContinuousInput(key=\"x2\", bounds = (0,1)),\n ContinuousInput(key=\"x3\", bounds = (0,1))],\n outputs=[ContinuousOutput(key=\"y\")],\n constraints=[InterpointEqualityConstraint(feature=\"x1\", multiplicity=3)]\n)\n\nresult = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100}, n_experiments=12)\nresult.round(3)\ndomain = Domain( inputs=[ ContinuousInput(key=\"x1\", bounds = (0,1)), ContinuousInput(key=\"x2\", bounds = (0,1)), ContinuousInput(key=\"x3\", bounds = (0,1))], outputs=[ContinuousOutput(key=\"y\")], constraints=[InterpointEqualityConstraint(feature=\"x1\", multiplicity=3)] ) result = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100}, n_experiments=12) result.round(3) Out[19]: x1 x2 x3 exp0 1.0 1.0 1.0 exp1 1.0 1.0 1.0 exp2 1.0 -0.0 -0.0 exp3 -0.0 1.0 1.0 exp4 -0.0 -0.0 -0.0 exp5 -0.0 -0.0 -0.0 exp6 -0.0 1.0 -0.0 exp7 -0.0 -0.0 1.0 exp8 -0.0 -0.0 1.0 exp9 -0.0 -0.0 1.0 exp10 -0.0 1.0 -0.0"},{"location":"basic_examples/#basic-examples-for-the-doe-subpackage","title":"Basic Examples for the DoE Subpackage\u00b6","text":"
The following example has been taken from the paper \"The construction of D- and I-optimal designs for mixture experiments with linear constraints on the components\" by R. Coetzer and L. M. Haines.
"},{"location":"basic_examples/#linear-model","title":"linear model\u00b6","text":""},{"location":"basic_examples/#cubic-model","title":"cubic model\u00b6","text":""},{"location":"basic_examples/#nonlinear-constraints","title":"Nonlinear Constraints\u00b6","text":"IPOPT also supports nonlinear constraints. This notebook shows examples of design optimizations with nonlinear constraints.
"},{"location":"basic_examples/#example-1-design-inside-a-cone-nonlinear-inequality","title":"Example 1: Design inside a cone / nonlinear inequality\u00b6","text":"In the following example we have three design variables. We impose the constraint of all experiments to be contained in the interior of a cone, which corresponds the nonlinear inequality constraint $\\sqrt{x_1^2 + x_2^2} - x_3 \\leq 0$. The optimization is done for a linear model and places the points on the surface of the cone so as to maximize the between them
"},{"location":"basic_examples/#example-2-design-on-the-surface-of-a-cone-nonlinear-equality","title":"Example 2: Design on the surface of a cone / nonlinear equality\u00b6","text":"We can also limit the design space to the surface of a cone, defined by the equality constraint $\\sqrt{x_1^2 + x_2^2} - x_3 = 0$
Note that due to missing sampling methods in opti, the initial points provided to IPOPT don't satisfy the constraints.
"},{"location":"basic_examples/#example-3-batch-constraints","title":"Example 3: Batch constraints\u00b6","text":"Batch constraints can be used to create designs where each set of multiplicity
subsequent experiments has the same value for a certain feature. In the following example we fix the value of the decision variable x1
inside each batch of size 3.
Data models in BoFire hold static data of an optimization problem. These are input and output features as well as constraints making up the domain. They further include possible optimization objectives, acquisition functions, and kernels.
All data models in bofire.data_models
are specified as pydantic models and inherit from bofire.data_models.base.BaseModel
. These data models can be (de)serialized via .dict()
and .json()
(provided by pydantic). A JSON schema of each data model can be obtained using .schema()
.
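For example, given any data model instance data_model (such as the RandomStrategy data model constructed in the example below), a sketch of these pydantic-provided methods looks like this:

```python
data = data_model.dict()        # plain Python dict
serialized = data_model.json()  # JSON string
schema = data_model.schema()    # JSON schema of the data model as a dict
```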
For surrogates and strategies, all functional parts are located in bofire.surrogates
and bofire.strategies
. These functionalities include the ask
and tell
as well as fit
and predict
methods. All class attributes (used by these methods) are also removed from the data models. Each functional entity is initialized using the corresponding data model. As an example, consider the following data model of a RandomStrategy
:
import bofire.data_models.domain.api as dm_domain\nimport bofire.data_models.features.api as dm_features\nimport bofire.data_models.strategies.api as dm_strategies\n\nin1 = dm_features.ContinuousInput(key=\"in1\", bounds=(0.0,1.0))\nin2 = dm_features.ContinuousInput(key=\"in2\", bounds=(0.0,2.0))\nin3 = dm_features.ContinuousInput(key=\"in3\", bounds=(0.0,3.0))\n\nout1 = dm_features.ContinuousOutput(key=\"out1\")\n\ninputs = dm_domain.Inputs(features=[in1, in2, in3])\noutputs = dm_domain.Outputs(features=[out1])\nconstraints = dm_domain.Constraints()\n\ndomain = dm_domain.Domain(\n inputs=inputs,\n outputs=outputs,\n constraints=constraints,\n)\n\ndata_model = dm_strategies.RandomStrategy(domain=domain)\n
Such a data model can be (de)serialized as follows:
import json\nfrom pydantic import parse_obj_as\nfrom bofire.data_models.strategies.api import AnyStrategy\n\nserialized = data_model.json()\ndata = json.loads(serialized)\n# alternative: data = data_model.dict()\ndata_model_ = parse_obj_as(AnyStrategy, data)\nassert data_model_ == data_model\n
Using this data model of a strategy, we can create an instance of a (functional) strategy:
import bofire.strategies.api as strategies\nstrategy = strategies.RandomStrategy(data_model=data_model)\n
As each strategy data model should be mapped to a specific (functional) strategy, we provide such a mapping:
strategy = strategies.map(data_model)\n
"},{"location":"design_with_explicit_formula/","title":"Design with explicit Formula","text":"In\u00a0[1]: Copied! from bofire.data_models.api import Domain, Inputs\nfrom bofire.data_models.features.api import ContinuousInput\nfrom bofire.strategies.doe.design import find_local_max_ipopt\nfrom formulaic import Formula\nfrom sklearn.preprocessing import MinMaxScaler\nimport itertools\nimport pandas as pd\nfrom bofire.utils.doe import get_confounding_matrix\nfrom bofire.data_models.api import Domain, Inputs from bofire.data_models.features.api import ContinuousInput from bofire.strategies.doe.design import find_local_max_ipopt from formulaic import Formula from sklearn.preprocessing import MinMaxScaler import itertools import pandas as pd from bofire.utils.doe import get_confounding_matrix
/opt/homebrew/Caskroom/miniforge/base/envs/bofire/lib/python3.10/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n from .autonotebook import tqdm as notebook_tqdm\nIn\u00a0[2]: Copied!
input_features=Inputs(\n features=[\n ContinuousInput(key=\"a\", bounds = (0,5)),\n ContinuousInput(key=\"b\", bounds= (40, 800)),\n ContinuousInput(key=\"c\", bounds= (80,180)),\n ContinuousInput(key=\"d\", bounds = (200,800)),\n ] \n )\ndomain = Domain(inputs=input_features)\ninput_features=Inputs( features=[ ContinuousInput(key=\"a\", bounds = (0,5)), ContinuousInput(key=\"b\", bounds= (40, 800)), ContinuousInput(key=\"c\", bounds= (80,180)), ContinuousInput(key=\"d\", bounds = (200,800)), ] ) domain = Domain(inputs=input_features) In\u00a0[3]: Copied!
model_type = Formula(\"a + {a**2} + b + c + d + a:b + a:c + a:d + b:c + b:d + c:d\")\nmodel_type\nmodel_type = Formula(\"a + {a**2} + b + c + d + a:b + a:c + a:d + b:c + b:d + c:d\") model_type Out[3]:
1 + a + a**2 + b + c + d + a:b + a:c + a:d + b:c + b:d + c:dIn\u00a0[4]: Copied!
design = find_local_max_ipopt(domain=domain, model_type=model_type, n_experiments=17)\ndesign\ndesign = find_local_max_ipopt(domain=domain, model_type=model_type, n_experiments=17) design
\n******************************************************************************\nThis program contains Ipopt, a library for large-scale nonlinear optimization.\n Ipopt is released as open source code under the Eclipse Public License (EPL).\n For more information visit https://github.com/coin-or/Ipopt\n******************************************************************************\n\nOut[4]: a b c d exp0 5.000000e+00 40.000000 180.000002 199.999998 exp1 2.500000e+00 800.000008 79.999999 800.000008 exp2 -9.972222e-09 800.000008 180.000002 199.999998 exp3 5.000000e+00 800.000008 180.000002 800.000008 exp4 -9.975610e-09 40.000000 180.000002 199.999998 exp5 -9.975610e-09 800.000008 180.000002 800.000008 exp6 2.500000e+00 800.000008 180.000002 199.999998 exp7 5.000000e+00 40.000000 79.999999 800.000008 exp8 5.000000e+00 800.000008 79.999999 199.999998 exp9 -9.750000e-09 40.000000 79.999999 199.999998 exp10 -9.975610e-09 800.000008 79.999999 199.999998 exp11 -9.975610e-09 40.000000 79.999999 800.000008 exp12 5.000000e+00 800.000008 79.999999 800.000008 exp13 2.500000e+00 40.000000 180.000002 800.000008 exp14 5.000000e+00 40.000000 79.999999 199.999998 exp15 -9.972222e-09 800.000008 79.999999 800.000008 exp16 5.000000e+00 800.000008 180.000002 199.999998 In\u00a0[6]: Copied!
import matplotlib\nimport seaborn as sns\nimport matplotlib.pyplot as plt\n\nmatplotlib.rcParams[\"figure.dpi\"] = 120\n\nm = get_confounding_matrix(domain.inputs, design=design, interactions=[2,3], powers=[2])\n\nsns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\")\nplt.show()\nimport matplotlib import seaborn as sns import matplotlib.pyplot as plt matplotlib.rcParams[\"figure.dpi\"] = 120 m = get_confounding_matrix(domain.inputs, design=design, interactions=[2,3], powers=[2]) sns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\") plt.show() In\u00a0[\u00a0]: Copied!
\n"},{"location":"design_with_explicit_formula/#design-with-explicit-formula","title":"Design with explicit Formula\u00b6","text":"
This tutorial notebook shows how to set up a D-optimal design with BoFire while providing an explicit formula and not just one of the four available keywords linear
, linear-and-interaction
, linear-and-quadratic
, fully-quadratic
.
Make sure that cyipopt
is installed. The recommended way is to install it via conda: conda install -c conda-forge cyipopt
.
This is a collection of code examples to allow for easy exploration of the functionalities that BoFire offers.
"},{"location":"examples/#doe","title":"DoE","text":"import matplotlib.pyplot as plt\nimport pandas as pd\nimport seaborn as sns\n\nimport bofire.strategies.api as strategies\nfrom bofire.data_models.domain.api import Domain\nfrom bofire.data_models.features.api import ContinuousInput\nfrom bofire.data_models.strategies.api import FractionalFactorialStrategy\nfrom bofire.utils.doe import get_confounding_matrix, get_generator, get_alias_structure\n\n\ndef plot_design(design: pd.DataFrame):\n # we do a plot with three subplots in one row in which the three degrees of freedom (temperature, time and ph) are plotted\n _, axs = plt.subplots(1, 3, figsize=(15, 5))\n axs[0].scatter(design['temperature'], design['time'])\n axs[0].set_xlabel('Temperature')\n axs[0].set_ylabel('Time')\n axs[1].scatter(design['temperature'], design['ph'])\n axs[1].set_xlabel('Temperature')\n axs[1].set_ylabel('pH')\n axs[2].scatter(design['time'], design['ph'])\n axs[2].set_xlabel('Time')\n axs[2].set_ylabel('pH')\n plt.show()\nimport matplotlib.pyplot as plt import pandas as pd import seaborn as sns import bofire.strategies.api as strategies from bofire.data_models.domain.api import Domain from bofire.data_models.features.api import ContinuousInput from bofire.data_models.strategies.api import FractionalFactorialStrategy from bofire.utils.doe import get_confounding_matrix, get_generator, get_alias_structure def plot_design(design: pd.DataFrame): # we do a plot with three subplots in one row in which the three degrees of freedom (temperature, time and ph) are plotted _, axs = plt.subplots(1, 3, figsize=(15, 5)) axs[0].scatter(design['temperature'], design['time']) axs[0].set_xlabel('Temperature') axs[0].set_ylabel('Time') axs[1].scatter(design['temperature'], design['ph']) axs[1].set_xlabel('Temperature') axs[1].set_ylabel('pH') axs[2].scatter(design['time'], design['ph']) axs[2].set_xlabel('Time') axs[2].set_ylabel('pH') plt.show() In\u00a0[2]: Copied!
domain = Domain(\n inputs=[\n ContinuousInput(key=\"temperature\", bounds=(20,80)),\n ContinuousInput(key=\"time\", bounds=(60,120)),\n ContinuousInput(key=\"ph\", bounds=(7,13)),\n ],\n)\ndomain = Domain( inputs=[ ContinuousInput(key=\"temperature\", bounds=(20,80)), ContinuousInput(key=\"time\", bounds=(60,120)), ContinuousInput(key=\"ph\", bounds=(7,13)), ], ) In\u00a0[3]: Copied!
strategy_data = FractionalFactorialStrategy(\n domain=domain,\n n_center=1, # number of center points\n n_repetitions=1, # number of repetitions, we do only one round here\n)\nstrategy = strategies.map(strategy_data)\ndesign = strategy.ask()\ndisplay(design)\n\nplot_design(design=design)\nstrategy_data = FractionalFactorialStrategy( domain=domain, n_center=1, # number of center points n_repetitions=1, # number of repetitions, we do only one round here ) strategy = strategies.map(strategy_data) design = strategy.ask() display(design) plot_design(design=design) ph temperature time 0 7.0 20.0 60.0 1 7.0 20.0 120.0 2 7.0 80.0 60.0 3 7.0 80.0 120.0 4 13.0 20.0 60.0 5 13.0 20.0 120.0 6 13.0 80.0 60.0 7 13.0 80.0 120.0 8 10.0 50.0 90.0
The confounding structure is shown below; as expected for a full factorial design, no confounding is present.
In\u00a0[4]: Copied!m = get_confounding_matrix(domain.inputs, design=design, interactions=[2])\n\nsns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\")\nplt.show()\nm = get_confounding_matrix(domain.inputs, design=design, interactions=[2]) sns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\") plt.show()
Here a fractional factorial design of the form $2^{3-1}$ is set up by specifying the number of generators (here 1). In comparison to the full factorial design with 9 candidates, it features only 5 experiments.
In\u00a0[5]: Copied!strategy_data = FractionalFactorialStrategy(\n domain=domain,\n n_center=1, # number of center points\n n_repetitions=1, # number of repetitions, we do only one round here\n n_generators=1, # number of generators, ie number of reducing factors\n)\nstrategy = strategies.map(strategy_data)\ndesign = strategy.ask()\ndisplay(design)\nstrategy_data = FractionalFactorialStrategy( domain=domain, n_center=1, # number of center points n_repetitions=1, # number of repetitions, we do only one round here n_generators=1, # number of generators, ie number of reducing factors ) strategy = strategies.map(strategy_data) design = strategy.ask() display(design) ph temperature time 0 7.0 20.0 120.0 1 7.0 80.0 60.0 2 13.0 20.0 60.0 3 13.0 80.0 120.0 4 10.0 50.0 90.0
The generator string is created automatically using the method get_generator
and specifying the total number of factors (here 3) and the number of generators (here 1).
get_generator(n_factors=3, n_generators=1)\nget_generator(n_factors=3, n_generators=1) Out[7]:
'a b ab'
As expected for a resolution III design, the main effects are confounded with the two-factor interactions:
In\u00a0[8]: Copied!m = get_confounding_matrix(domain.inputs, design=design, interactions=[2])\n\nsns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\")\nplt.show()\nm = get_confounding_matrix(domain.inputs, design=design, interactions=[2]) sns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\") plt.show()
This can also be expressed by the so-called alias structure, which can be calculated as follows:
In\u00a0[12]: Copied!get_alias_structure(\"a b ab\")\nget_alias_structure(\"a b ab\") Out[12]:
['a = bc', 'b = ac', 'c = ab', 'I = abc']
Here again a fractional factorial design of the form $2^{3-1}$ is set up by providing the complete generator string of the form a b -ab
explicitly to the strategy.
strategy_data = FractionalFactorialStrategy(\n domain=domain,\n n_center=1, # number of center points\n n_repetitions=1, # number of repetitions, we do only one round here\n generator = \"a b -ab\" # the exact generator\n)\nstrategy = strategies.map(strategy_data)\ndesign = strategy.ask()\ndisplay(design)\nstrategy_data = FractionalFactorialStrategy( domain=domain, n_center=1, # number of center points n_repetitions=1, # number of repetitions, we do only one round here generator = \"a b -ab\" # the exact generator ) strategy = strategies.map(strategy_data) design = strategy.ask() display(design) ph temperature time 0 7.0 20.0 60.0 1 7.0 80.0 120.0 2 13.0 20.0 120.0 3 13.0 80.0 60.0 4 10.0 50.0 90.0
The last two designs differ only in the last feature time, since the generator strings are different: in the first design time=ph x temperature holds, whereas in the second time=-ph x temperature
, which is also reflected in the confounding structure.
m = get_confounding_matrix(domain.inputs, design=design, interactions=[2])\n\nsns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\")\nplt.show()\nm = get_confounding_matrix(domain.inputs, design=design, interactions=[2]) sns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\") plt.show()"},{"location":"fractional_factorial/#full-and-fractional-factorial-designs","title":"Full and Fractional Factorial Designs\u00b6","text":"
BoFire can be used to set up full (two-level) and fractional factorial designs (https://en.wikipedia.org/wiki/Fractional_factorial_design). This tutorial notebook shows how.
"},{"location":"fractional_factorial/#imports-and-helper-functions","title":"Imports and helper functions\u00b6","text":""},{"location":"fractional_factorial/#setup-the-problem-domain","title":"Setup the problem domain\u00b6","text":"The designs are generated for a simple three dimensional problem comprised of three continuous factors/features.
"},{"location":"fractional_factorial/#setup-a-full-factorial-design","title":"Setup a full factorial design\u00b6","text":"Here we setup a full two-level factorial design including a center point and plot it.
"},{"location":"fractional_factorial/#setup-a-fractional-factorial-design","title":"Setup a fractional factorial design\u00b6","text":""},{"location":"getting_started/","title":"Getting started","text":"In\u00a0[1]: Copied!from bofire.data_models.features.api import ContinuousInput, DiscreteInput, CategoricalInput, CategoricalDescriptorInput\n\nx1 = ContinuousInput(key=\"x1\", bounds=(0,1))\nx2 = ContinuousInput(key=\"x2\", bounds=(0,1))\nx3 = ContinuousInput(key=\"x3\", bounds=(0,1))\nx4 = DiscreteInput(key=\"x4\", values=[1, 2, 5, 7.5])\nx5 = CategoricalInput(key=\"x5\", categories=[\"A\", \"B\", \"C\"], allowed=[True,True,False])\nx6 = CategoricalDescriptorInput(key=\"x6\", categories=[\"c1\", \"c2\", \"c3\"], descriptors=[\"d1\", \"d2\"], values = [[1,2],[2,5],[1,7]])\nfrom bofire.data_models.features.api import ContinuousInput, DiscreteInput, CategoricalInput, CategoricalDescriptorInput x1 = ContinuousInput(key=\"x1\", bounds=(0,1)) x2 = ContinuousInput(key=\"x2\", bounds=(0,1)) x3 = ContinuousInput(key=\"x3\", bounds=(0,1)) x4 = DiscreteInput(key=\"x4\", values=[1, 2, 5, 7.5]) x5 = CategoricalInput(key=\"x5\", categories=[\"A\", \"B\", \"C\"], allowed=[True,True,False]) x6 = CategoricalDescriptorInput(key=\"x6\", categories=[\"c1\", \"c2\", \"c3\"], descriptors=[\"d1\", \"d2\"], values = [[1,2],[2,5],[1,7]])
Currently, only continuous output features are supported. Each output feature should have an objective, which can be a minimize or a maximize objective. Furthermore, we can define weights between 0 and 1 in case the objectives should not be weighted equally.
In\u00a0[2]: Copied!from bofire.data_models.features.api import ContinuousOutput\nfrom bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective\n\nobjective1 = MaximizeObjective(\n w=1.0, \n bounds= [0.0,1.0],\n)\ny1 = ContinuousOutput(key=\"y1\", objective=objective1)\n\nobjective2 = MinimizeObjective(\n w=1.0\n)\ny2 = ContinuousOutput(key=\"y2\", objective=objective2)\nfrom bofire.data_models.features.api import ContinuousOutput from bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective objective1 = MaximizeObjective( w=1.0, bounds= [0.0,1.0], ) y1 = ContinuousOutput(key=\"y1\", objective=objective1) objective2 = MinimizeObjective( w=1.0 ) y2 = ContinuousOutput(key=\"y2\", objective=objective2)
In- and output features are collected in respective feature lists.
In\u00a0[3]: Copied!from bofire.data_models.domain.api import Inputs, Outputs\n\ninput_features = Inputs(features = [x1, x2, x3, x4, x5, x6])\noutput_features = Outputs(features=[y1, y2])\nfrom bofire.data_models.domain.api import Inputs, Outputs input_features = Inputs(features = [x1, x2, x3, x4, x5, x6]) output_features = Outputs(features=[y1, y2])
A summary of the features can be obtained by the method get_reps_df
:
input_features.get_reps_df()\ninput_features.get_reps_df() Out[23]: Type Description x1 ContinuousInput [0.0,1.0] x2 ContinuousInput [0.0,1.0] x3 ContinuousInput [0.0,1.0] x4 DiscreteInput type='DiscreteInput' key='x4' unit=None values... x6 CategoricalDescriptorInput 3 categories x5 CategoricalInput 3 categories In\u00a0[24]: Copied!
output_features.get_reps_df()\noutput_features.get_reps_df() Out[24]: Type Description y1 ContinuousOutput ContinuousOutputFeature y2 ContinuousOutput ContinuousOutputFeature y3 ContinuousOutput ContinuousOutputFeature
Individual features can be retrieved by name.
In\u00a0[4]: Copied!x5 = input_features.get_by_key('x5')\nx5\nx5 = input_features.get_by_key('x5') x5 Out[4]:
CategoricalInput(type='CategoricalInput', key='x5', categories=['A', 'B', 'C'], allowed=[True, True, False])
This is also possible with a list of feature names.
In\u00a0[5]: Copied!input_features.get_by_keys(['x5', 'x2'])\ninput_features.get_by_keys(['x5', 'x2']) Out[5]:
Inputs(type='Inputs', features=[ContinuousInput(type='ContinuousInput', key='x2', unit=None, bounds=(0.0, 1.0), local_relative_bounds=None, stepsize=None), CategoricalInput(type='CategoricalInput', key='x5', categories=['A', 'B', 'C'], allowed=[True, True, False])])
Features of a specific type can be returned by the get
method; by default, it returns all features that are an instance of the provided class.
input_features.get(CategoricalInput)\ninput_features.get(CategoricalInput) Out[6]:
Inputs(type='Inputs', features=[CategoricalDescriptorInput(type='CategoricalDescriptorInput', key='x6', categories=['c1', 'c2', 'c3'], allowed=[True, True, True], descriptors=['d1', 'd2'], values=[[1.0, 2.0], [2.0, 5.0], [1.0, 7.0]]), CategoricalInput(type='CategoricalInput', key='x5', categories=['A', 'B', 'C'], allowed=[True, True, False])])
By using the exact
argument, one can force it to return only features of the exact same class.
input_features.get(CategoricalInput, exact=True)\ninput_features.get(CategoricalInput, exact=True) Out[7]:
Inputs(type='Inputs', features=[CategoricalInput(type='CategoricalInput', key='x5', categories=['A', 'B', 'C'], allowed=[True, True, False])])
The get_keys
method follows the same logic as the get
method but returns just the keys of the features instead of the features themselves.
input_features.get_keys(CategoricalInput)\ninput_features.get_keys(CategoricalInput) Out[8]:
['x6', 'x5']
The input feature container further provides methods that return a feature container containing only the fixed or only the free features.
In\u00a0[9]: Copied!free_inputs = input_features.get_free()\nfixed_inputs = input_features.get_fixed()\nfree_inputs = input_features.get_free() fixed_inputs = input_features.get_fixed()
One can uniformly sample from individual input features.
In\u00a0[10]: Copied!x5.sample(2)\nx5.sample(2) Out[10]:
0 B\n1 A\nName: x5, dtype: object
One can also sample directly from input feature containers; uniform, Sobol, and LHS sampling are possible. By default, uniform sampling is used.
In\u00a0[11]: Copied!from bofire.data_models.enum import SamplingMethodEnum\n\nX = input_features.sample(n=10, method=SamplingMethodEnum.LHS)\n\nX\nfrom bofire.data_models.enum import SamplingMethodEnum X = input_features.sample(n=10, method=SamplingMethodEnum.LHS) X Out[11]: x1 x2 x3 x4 x6 x5 0 0.423139 0.305001 0.881045 2.0 c3 A 1 0.873972 0.525925 0.674935 7.5 c3 A 2 0.782031 0.867259 0.442600 2.0 c1 B 3 0.691130 0.403864 0.348524 7.5 c3 B 4 0.051185 0.733657 0.144178 1.0 c2 A 5 0.939134 0.199665 0.226415 1.0 c1 A 6 0.323216 0.912386 0.066617 1.0 c1 B 7 0.280553 0.208415 0.544485 7.5 c3 A 8 0.163496 0.022924 0.707360 5.0 c2 B 9 0.554554 0.673069 0.938194 5.0 c1 B In\u00a0[12]: Copied!
from bofire.data_models.constraints.api import LinearEqualityConstraint, LinearInequalityConstraint\n\n# A mixture: x1 + x2 + x3 = 1\nconstr1 = LinearEqualityConstraint(features=[\"x1\", \"x2\", \"x3\"], coefficients=[1,1,1], rhs=1)\n\n# x1 + 2 * x3 < 0.8\nconstr2 = LinearInequalityConstraint(features=[\"x1\", \"x3\"], coefficients=[1, 2], rhs=0.8)\nfrom bofire.data_models.constraints.api import LinearEqualityConstraint, LinearInequalityConstraint # A mixture: x1 + x2 + x3 = 1 constr1 = LinearEqualityConstraint(features=[\"x1\", \"x2\", \"x3\"], coefficients=[1,1,1], rhs=1) # x1 + 2 * x3 < 0.8 constr2 = LinearInequalityConstraint(features=[\"x1\", \"x3\"], coefficients=[1, 2], rhs=0.8)
Linear constraints can only operate on ContinuousInput
features.
NonlinearEqualityConstraint
and NonlinearInequalityConstraint
take any expression that can be evaluated by pandas.eval, including mathematical operators such as sin
, exp
, log10
or exponentiation. So far, they cannot be used in any optimizations.
from bofire.data_models.constraints.api import NonlinearEqualityConstraint, NonlinearInequalityConstraint\n\n# The unit circle: x1**2 + x2**2 = 1\nconst3 = NonlinearEqualityConstraint(expression=\"x1**2 + x2**2 - 1\")\nconst3\nfrom bofire.data_models.constraints.api import NonlinearEqualityConstraint, NonlinearInequalityConstraint # The unit circle: x1**2 + x2**2 = 1 const3 = NonlinearEqualityConstraint(expression=\"x1**2 + x2**2 - 1\") const3 Out[13]:
NonlinearEqualityConstraint(type='NonlinearEqualityConstraint', expression='x1**2 + x2**2 - 1', features=None, jacobian_expression=None)In\u00a0[14]: Copied!
from bofire.data_models.constraints.api import NChooseKConstraint\n\n# Only 2 or 3 out of 3 parameters can be greater than zero\nconstr5 = NChooseKConstraint(features=[\"x1\", \"x2\", \"x3\"], min_count=2, max_count=3, none_also_valid=True)\nconstr5\nfrom bofire.data_models.constraints.api import NChooseKConstraint # Only 2 or 3 out of 3 parameters can be greater than zero constr5 = NChooseKConstraint(features=[\"x1\", \"x2\", \"x3\"], min_count=2, max_count=3, none_also_valid=True) constr5 Out[14]:
NChooseKConstraint(type='NChooseKConstraint', features=['x1', 'x2', 'x3'], min_count=2, max_count=3, none_also_valid=True)
Note that we have to set a boolean indicating whether none is also a valid selection, e.g. if we want to have 2 or 3 or none of the ingredients in our recipe.
Similar to the features, constraints can be grouped in a container which acts as the union of the constraints.
In\u00a0[15]: Copied!from bofire.data_models.domain.api import Constraints\n\n\nconstraints = Constraints(constraints=[constr1, constr2])\nfrom bofire.data_models.domain.api import Constraints constraints = Constraints(constraints=[constr1, constr2])
A summary of the constraints can be obtained by the method get_reps_df
:
constraints.get_reps_df()\nconstraints.get_reps_df() Out[22]: Type Description 0 LinearEqualityConstraint type='LinearEqualityConstraint' features=['x1'... 1 LinearInequalityConstraint type='LinearInequalityConstraint' features=['x...
We can check whether a point satisfies individual constraints or the list of constraints.
In\u00a0[16]: Copied!constr2.is_fulfilled(X).values\nconstr2.is_fulfilled(X).values Out[16]:
array([False, False, False, False, True, False, True, False, False,\n False])
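The check on the constraint container works analogously; a sketch, assuming the Constraints container exposes the same is_fulfilled interface as the individual constraints:

```python
# a row evaluates to True only if all constraints in the container are satisfied
constraints.is_fulfilled(X).values
```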
Output constraints can be set up via sigmoid-shaped objectives passed as an argument to the respective feature, which can then also be plotted.
In\u00a0[17]: Copied!from bofire.data_models.objectives.api import MinimizeSigmoidObjective\nfrom bofire.plot.api import plot_objective_plotly\n\noutput_constraint = MinimizeSigmoidObjective(\n w=1.0, \n steepness=10,\n tp=0.5\n)\ny3= ContinuousOutput(key=\"y3\", objective=output_constraint)\n\noutput_features = Outputs(features=[y1, y2, y3])\n\nfig = plot_objective_plotly(feature=y3, lower=0, upper=1)\n\nfig.show()\nfrom bofire.data_models.objectives.api import MinimizeSigmoidObjective from bofire.plot.api import plot_objective_plotly output_constraint = MinimizeSigmoidObjective( w=1.0, steepness=10, tp=0.5 ) y3= ContinuousOutput(key=\"y3\", objective=output_constraint) output_features = Outputs(features=[y1, y2, y3]) fig = plot_objective_plotly(feature=y3, lower=0, upper=1) fig.show()
/opt/homebrew/Caskroom/miniforge/base/envs/bofire-2/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n from .autonotebook import tqdm as notebook_tqdm\nIn\u00a0[18]: Copied!
from bofire.data_models.domain.api import Domain\n\ndomain = Domain(\n inputs=input_features, \n outputs=output_features, \n constraints=constraints\n )\nfrom bofire.data_models.domain.api import Domain domain = Domain( inputs=input_features, outputs=output_features, constraints=constraints )
In addition, one can also instantiate the domain directly from lists.
In\u00a0[19]: Copied!domain_single_objective = Domain.from_lists(\n inputs=[x1, x2, x3, x4, x5, x6], \n outputs=[y1], \n constraints=[]\n )\ndomain_single_objective = Domain.from_lists( inputs=[x1, x2, x3, x4, x5, x6], outputs=[y1], constraints=[] ) In\u00a0[22]: Copied!
from bofire.data_models.strategies.api import RandomStrategy\n\nimport bofire.strategies.api as strategies\n\nstrategy_data_model = RandomStrategy(domain=domain)\n\nrandom_strategy = strategies.map(strategy_data_model)\nrandom_candidates = random_strategy.ask(2)\n\nrandom_candidates\nfrom bofire.data_models.strategies.api import RandomStrategy import bofire.strategies.api as strategies strategy_data_model = RandomStrategy(domain=domain) random_strategy = strategies.map(strategy_data_model) random_candidates = random_strategy.ask(2) random_candidates Out[22]: x1 x2 x3 x4 x6 x5 0 0.516301 0.358447 0.125253 7.5 c3 A 1 0.246566 0.636906 0.116528 2.0 c1 B In\u00a0[2]: Copied!
from bofire.benchmarks.single import Himmelblau\n\nbenchmark = Himmelblau()\n\n(benchmark.domain.inputs + benchmark.domain.outputs).get_reps_df()\nfrom bofire.benchmarks.single import Himmelblau benchmark = Himmelblau() (benchmark.domain.inputs + benchmark.domain.outputs).get_reps_df() Out[2]: Type Description x_1 ContinuousInput [-6.0,6.0] x_2 ContinuousInput [-6.0,6.0] y ContinuousOutput ContinuousOutputFeature
Generating some initial data works as follows:
In\u00a0[24]: Copied!samples = benchmark.domain.inputs.sample(10)\n\nexperiments = benchmark.f(samples, return_complete=True)\n\nexperiments\nsamples = benchmark.domain.inputs.sample(10) experiments = benchmark.f(samples, return_complete=True) experiments Out[24]: x_1 x_2 y valid_y 0 -5.207328 3.267036 378.064959 1 1 -3.542455 5.285482 349.256442 1 2 -5.155535 5.077326 612.311571 1 3 -5.316850 3.642571 438.194554 1 4 -3.701859 -5.987050 642.945914 1 5 -1.165247 -0.212096 163.045785 1 6 3.267629 2.292458 6.199849 1 7 -0.915547 1.141966 125.068321 1 8 -2.672275 -1.027612 98.118896 1 9 5.363115 -4.279275 459.876833 1
Let's set up the SOBO strategy and ask for a candidate.
In\u00a0[25]: Copied!from bofire.data_models.strategies.api import SoboStrategy\nfrom bofire.data_models.acquisition_functions.api import qNEI\n\nsobo_strategy_data_model = SoboStrategy(domain=benchmark.domain, acquisition_function=qNEI())\n\nsobo_strategy = strategies.map(sobo_strategy_data_model)\n\nsobo_strategy.tell(experiments=experiments)\n\nsobo_strategy.ask(candidate_count=1)\nfrom bofire.data_models.strategies.api import SoboStrategy from bofire.data_models.acquisition_functions.api import qNEI sobo_strategy_data_model = SoboStrategy(domain=benchmark.domain, acquisition_function=qNEI()) sobo_strategy = strategies.map(sobo_strategy_data_model) sobo_strategy.tell(experiments=experiments) sobo_strategy.ask(candidate_count=1) Out[25]: x_1 x_2 y_pred y_sd y_des 0 2.185807 5.14596 48.612437 208.728779 -48.612437 In\u00a0[26]: Copied!
from bofire.strategies.doe.design import find_local_max_ipopt\nimport numpy as np\n\ndomain = Domain(\n inputs=[x1,x2,x3],\n outputs=[y1],\n constraints=[constr1]\n )\n\nres = find_local_max_ipopt(domain, \"fully-quadratic\")\nnp.round(res,3)\nfrom bofire.strategies.doe.design import find_local_max_ipopt import numpy as np domain = Domain( inputs=[x1,x2,x3], outputs=[y1], constraints=[constr1] ) res = find_local_max_ipopt(domain, \"fully-quadratic\") np.round(res,3)
\n******************************************************************************\nThis program contains Ipopt, a library for large-scale nonlinear optimization.\n Ipopt is released as open source code under the Eclipse Public License (EPL).\n For more information visit https://github.com/coin-or/Ipopt\n******************************************************************************\n\nOut[26]: x1 x2 x3 exp0 0.5 0.5 -0.0 exp1 -0.0 1.0 -0.0 exp2 -0.0 0.5 0.5 exp3 -0.0 0.5 0.5 exp4 0.5 -0.0 0.5 exp5 0.5 0.5 -0.0 exp6 -0.0 1.0 -0.0 exp7 1.0 -0.0 -0.0 exp8 -0.0 -0.0 1.0 exp9 -0.0 -0.0 1.0 exp10 0.5 -0.0 0.5 exp11 0.5 -0.0 0.5 exp12 0.5 0.5 -0.0
The resulting design looks like this:
In\u00a0[27]: Copied!import matplotlib.pyplot as plt\n\nfig = plt.figure(figsize=((10,10)))\nax = fig.add_subplot(111, projection='3d')\nax.view_init(45, 45)\nax.set_title(\"fully-quadratic model\")\nax.set_xlabel(\"$x_1$\")\nax.set_ylabel(\"$x_2$\")\nax.set_zlabel(\"$x_3$\")\nplt.rcParams[\"figure.figsize\"] = (10,8)\n\n#plot feasible polytope\nax.plot(\n xs=[1,0,0,1],\n ys=[0,1,0,0],\n zs=[0,0,1,0],\n linewidth=2\n)\n\n#plot D-optimal solutions\nax.scatter(xs=res[\"x1\"], ys=res[\"x2\"], zs=res[\"x3\"], marker=\"o\", s=40, color=\"orange\")\nimport matplotlib.pyplot as plt fig = plt.figure(figsize=((10,10))) ax = fig.add_subplot(111, projection='3d') ax.view_init(45, 45) ax.set_title(\"fully-quadratic model\") ax.set_xlabel(\"$x_1$\") ax.set_ylabel(\"$x_2$\") ax.set_zlabel(\"$x_3$\") plt.rcParams[\"figure.figsize\"] = (10,8) #plot feasible polytope ax.plot( xs=[1,0,0,1], ys=[0,1,0,0], zs=[0,0,1,0], linewidth=2 ) #plot D-optimal solutions ax.scatter(xs=res[\"x1\"], ys=res[\"x2\"], zs=res[\"x3\"], marker=\"o\", s=40, color=\"orange\") Out[27]:
<mpl_toolkits.mplot3d.art3d.Path3DCollection at 0x17d571330>In\u00a0[\u00a0]: Copied!
\n"},{"location":"getting_started/#getting-started","title":"Getting started\u00b6","text":"
In the following, we show how to set up optimization problems in BoFire and how to use strategies to solve them.
"},{"location":"getting_started/#setting-up-the-optimization-problem","title":"Setting up the optimization problem\u00b6","text":"In BoFire, an optimization problem is defined by defining a domain containing input and output features as well as constraints (optional).
"},{"location":"getting_started/#features","title":"Features\u00b6","text":"Input features can be continuous, discrete, categorical, or categorical with descriptors:
"},{"location":"getting_started/#constraints","title":"Constraints\u00b6","text":"The search space can be further defined by constraints on the input features. BoFire supports linear equality and inequality constraints, as well as non-linear equality and inequality constraints.
"},{"location":"getting_started/#linear-constraints","title":"Linear constraints\u00b6","text":"LinearEqualityConstraint
and LinearInequalityConstraint
are expressions of the form $\\sum_i a_i x_i = b$ or $\\leq b$ for equality and inequality constraints respectively. They take a list of names of the input features they are operating on, a list of left-hand-side coefficients $a_i$ and a right-hand-side constant $b$.
Use NChooseKConstraint
to express that only $k$ out of the $n$ parameters may take positive values. Think of a mixture, where we have a long list of possible ingredients but want to limit the number of ingredients in any given recipe.
The domain then holds all information about an optimization problem and can be understood as a search space definition.
"},{"location":"getting_started/#optimization","title":"Optimization\u00b6","text":"To solve the optimization problem, we further need a solving strategy. BoFire supports strategies without a prediction model such as a random strategy and predictive strategies which are based on a prediction model.
All strategies contain an ask
method returning a defined number of candidate experiments.
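As a minimal sketch, reusing the domain defined above (the pattern of mapping a strategy data model to its functional implementation via strategies.map is an assumption about the current API):

import bofire.strategies.api as strategies
from bofire.data_models.strategies.api import RandomStrategy

# map the serializable data model to the functional strategy
strategy = strategies.map(RandomStrategy(domain=domain))
candidates = strategy.ask(candidate_count=5)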
Since a predictive strategy includes a prediction model, we need to generate some historical data, which we can afterwards pass as training data to the strategy via the tell method.
For didactic purposes, we simply use one of our benchmark problems here.
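A sketch of the resulting loop; the Himmelblau benchmark and the SoboStrategy/qNEI names are assumptions about the current BoFire API:

import bofire.strategies.api as strategies
from bofire.benchmarks.single import Himmelblau
from bofire.data_models.acquisition_functions.api import qNEI
from bofire.data_models.strategies.api import SoboStrategy

benchmark = Himmelblau()
# generate some historical training data from the benchmark
samples = benchmark.domain.inputs.sample(10)
experiments = benchmark.f(samples, return_complete=True)

sobo = strategies.map(
    SoboStrategy(domain=benchmark.domain, acquisition_function=qNEI())
)
sobo.tell(experiments=experiments)  # pass the training data
candidates = sobo.ask(candidate_count=1)  # propose the next experiment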
"},{"location":"getting_started/#design-of-experiments","title":"Design of Experiments\u00b6","text":"As a simple example for the DoE functionalities we consider the task of finding a D-optimal design for a fully-quadratic model with three design variables with bounds (0,1) and a mixture constraint.
We define the design space, including the constraint, as a domain. Then we pass it to the optimization routine and specify the model. If the user does not specify the number of experiments, it is chosen automatically based on the number of model terms.
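In code, this example reads as follows (all signatures as in the DoE notebooks in these docs; the number of experiments is chosen automatically since it is not passed):

from bofire.data_models.constraints.api import LinearEqualityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.strategies.doe.design import find_local_max_ipopt

domain = Domain(
    inputs=[ContinuousInput(key=f\"x{i+1}\", bounds=(0, 1)) for i in range(3)],
    outputs=[ContinuousOutput(key=\"y\")],
    constraints=[
        # mixture constraint: x1 + x2 + x3 == 1
        LinearEqualityConstraint(
            features=[\"x1\", \"x2\", \"x3\"], coefficients=[1, 1, 1], rhs=1
        )
    ],
)
design = find_local_max_ipopt(domain, model_type=\"fully-quadratic\")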
"},{"location":"install/","title":"Installation","text":"In BoFire we have several optional depencies.
"},{"location":"install/#domain-and-optimization-algorithms","title":"Domain and Optimization Algorithms","text":"To install BoFire with optimization tools you can use
pip install bofire[optimization]\n
This will also install BoTorch, which depends on PyTorch."},{"location":"install/#design-of-experiments","title":"Design of Experiments","text":"BoFire has functionality to create D-optimal experimental designs via the doe
module. This module depends on Cyipopt. A convenient way to install Cyipopt and its dependencies is via
conda install -c conda-forge cyipopt\n
Note that Cyipopt is not part of the pip extras; you have to install it manually."},{"location":"install/#just-domain","title":"Just Domain","text":"If you just want a data structure that represents the domain of an optimization problem, you can
pip install bofire\n
"},{"location":"install/#cheminformatics","title":"Cheminformatics","text":"Some features related to molecules and their representation depend on Rdkit.
pip install bofire[optimization,cheminfo]\n
"},{"location":"install/#development-installation","title":"Development Installation","text":"If you want to contribute to BoFire, you might want to install in editable mode including the test dependencies. After cloning the repository via
git clone https://github.com/experimental-design/bofire.git\n
and cd bofire
, you can proceed with pip install -e .[optimization,cheminfo,docs,tests]\n
"},{"location":"nchoosek_constraint/","title":"Nchoosek constraint","text":"In\u00a0[10]: Copied! from bofire.strategies.doe.design import find_local_max_ipopt\nfrom bofire.data_models.domain.api import Domain\nfrom bofire.data_models.constraints.api import NChooseKConstraint, LinearEqualityConstraint, LinearInequalityConstraint\nfrom bofire.data_models.features.api import ContinuousInput, ContinuousOutput\nimport numpy as np\n\ndomain = Domain(\n inputs = [ContinuousInput(key=f\"x{i+1}\", bounds=(0,1)) for i in range(8)],\n outputs = [ContinuousOutput(key=\"y\")],\n constraints = [\n LinearEqualityConstraint(features=[f\"x{i+1}\" for i in range(8)], coefficients=[1,1,1,1,1,1,1,1], rhs=1),\n NChooseKConstraint(features=[\"x1\",\"x2\",\"x3\"], min_count=0, max_count=1, none_also_valid=True),\n LinearInequalityConstraint(features=[\"x1\",\"x2\",\"x3\"], coefficients=[1,1,1], rhs=0.7),\n LinearInequalityConstraint(features=[\"x7\",\"x8\"], coefficients=[-1,-1], rhs=-0.1),\n LinearInequalityConstraint(features=[\"x7\",\"x8\"], coefficients=[1,1], rhs=0.9),\n ]\n)\n\nres = find_local_max_ipopt(\n domain=domain,\n model_type=\"fully-quadratic\",\n ipopt_options={\"maxiter\":500},\n)\nnp.round(res,3)\nfrom bofire.strategies.doe.design import find_local_max_ipopt from bofire.data_models.domain.api import Domain from bofire.data_models.constraints.api import NChooseKConstraint, LinearEqualityConstraint, LinearInequalityConstraint from bofire.data_models.features.api import ContinuousInput, ContinuousOutput import numpy as np domain = Domain( inputs = [ContinuousInput(key=f\"x{i+1}\", bounds=(0,1)) for i in range(8)], outputs = [ContinuousOutput(key=\"y\")], constraints = [ LinearEqualityConstraint(features=[f\"x{i+1}\" for i in range(8)], coefficients=[1,1,1,1,1,1,1,1], rhs=1), NChooseKConstraint(features=[\"x1\",\"x2\",\"x3\"], min_count=0, max_count=1, none_also_valid=True), LinearInequalityConstraint(features=[\"x1\",\"x2\",\"x3\"], coefficients=[1,1,1], rhs=0.7), LinearInequalityConstraint(features=[\"x7\",\"x8\"], coefficients=[-1,-1], rhs=-0.1), LinearInequalityConstraint(features=[\"x7\",\"x8\"], coefficients=[1,1], rhs=0.9), ] ) res = find_local_max_ipopt( domain=domain, model_type=\"fully-quadratic\", ipopt_options={\"maxiter\":500}, ) np.round(res,3) Out[10]: x1 x2 x3 x4 x5 x6 x7 x8 exp0 0.000 0.00 0.463 -0.000 0.437 -0.000 0.100 -0.000 exp1 0.449 0.00 0.000 -0.000 -0.000 -0.000 -0.000 0.551 exp2 0.000 0.50 0.000 -0.000 -0.000 -0.000 -0.000 0.500 exp3 0.000 0.00 0.700 0.200 -0.000 -0.000 -0.000 0.100 exp4 0.394 0.00 0.000 -0.000 0.506 -0.000 0.100 -0.000 exp5 0.000 0.45 0.000 -0.000 -0.000 0.450 0.029 0.071 exp6 0.000 0.00 0.700 -0.000 -0.000 -0.000 0.300 -0.000 exp7 0.700 0.00 0.000 -0.000 0.200 -0.000 -0.000 0.100 exp8 0.000 -0.00 0.000 -0.000 0.448 -0.000 0.552 -0.000 exp9 0.000 0.00 -0.000 -0.000 0.498 -0.000 -0.000 0.502 exp10 -0.000 0.00 0.000 -0.000 -0.000 0.900 0.100 -0.000 exp11 0.000 -0.00 0.000 -0.000 0.900 -0.000 0.100 -0.000 exp12 0.000 0.00 0.371 -0.000 -0.000 0.529 -0.000 0.100 exp13 0.700 0.00 0.000 -0.000 -0.000 0.200 0.100 -0.000 exp14 0.000 -0.00 0.000 0.100 -0.000 -0.000 0.900 -0.000 exp15 0.000 0.00 0.100 -0.000 -0.000 -0.000 0.443 0.457 exp16 -0.000 0.00 0.000 -0.000 0.450 0.450 0.043 0.057 exp17 0.000 0.70 0.000 -0.000 -0.000 -0.000 0.300 -0.000 exp18 0.000 0.00 -0.000 -0.000 -0.000 0.445 0.555 -0.000 exp19 -0.000 0.00 0.000 0.539 -0.000 -0.000 0.461 -0.000 exp20 0.000 0.35 0.000 -0.000 -0.000 -0.000 0.650 -0.000 exp21 0.000 0.00 0.404 -0.000 -0.000 0.496 0.100 
-0.000 exp22 0.491 0.00 0.000 -0.000 -0.000 -0.000 0.509 -0.000 exp23 0.000 0.35 0.000 -0.000 -0.000 -0.000 0.650 -0.000 exp24 0.000 0.00 0.446 -0.000 -0.000 -0.000 -0.000 0.554 exp25 0.384 0.00 0.000 -0.000 -0.000 0.516 -0.000 0.100 exp26 0.000 0.45 0.000 0.450 -0.000 -0.000 0.028 0.072 exp27 0.000 0.00 0.440 -0.000 0.460 -0.000 -0.000 0.100 exp28 0.393 0.00 0.000 0.507 -0.000 -0.000 0.100 -0.000 exp29 0.000 -0.00 0.000 0.450 0.450 -0.000 0.049 0.051 exp30 0.000 0.00 0.700 -0.000 -0.000 0.200 -0.000 0.100 exp31 0.100 0.00 0.000 -0.000 -0.000 -0.000 0.454 0.446 exp32 0.000 -0.00 0.000 -0.000 0.448 -0.000 0.552 -0.000 exp33 0.000 0.00 0.374 -0.000 -0.000 -0.000 0.626 -0.000 exp34 0.388 0.00 0.000 -0.000 -0.000 0.512 0.100 -0.000 exp35 0.000 -0.00 0.000 0.455 -0.000 0.445 0.100 -0.000 exp36 0.000 0.00 0.394 0.506 -0.000 -0.000 0.100 -0.000 exp37 -0.000 0.00 0.000 0.448 -0.000 -0.000 -0.000 0.552 exp38 0.000 0.45 0.000 -0.000 0.450 -0.000 0.023 0.077 exp39 0.000 0.00 -0.000 0.539 -0.000 -0.000 0.461 -0.000 exp40 -0.000 0.00 0.000 -0.000 -0.000 0.445 0.555 -0.000 exp41 0.000 -0.00 0.000 -0.000 -0.000 0.541 -0.000 0.459 exp42 0.000 0.00 -0.000 0.442 -0.000 0.458 -0.000 0.100 exp43 0.700 0.00 0.000 0.200 -0.000 -0.000 -0.000 0.100 exp44 0.000 -0.00 0.000 -0.000 -0.000 0.100 -0.000 0.900 exp45 0.000 0.00 -0.000 0.448 -0.000 -0.000 -0.000 0.552 exp46 -0.000 0.00 0.000 0.900 -0.000 -0.000 -0.000 0.100 exp47 0.000 -0.00 0.000 -0.000 0.498 -0.000 -0.000 0.502 In\u00a0[\u00a0]: Copied!
\n"},{"location":"nchoosek_constraint/#design-with-nchoosek-constraint","title":"Design with NChooseK constraint\u00b6","text":"
The doe subpackage also supports problems with NChooseK constraints. Since IPOPT has problems finding feasible solutions using the gradient of the NChooseK constraint violation, a closely related (but stricter) constraint that suffices to fulfill the NChooseK constraint is imposed onto the problem: for each experiment $j$, $N-K$ decision variables $x_{i_1,j}, \\dots, x_{i_{N-K},j}$ from the NChooseK constraint's names attribute are picked and forced to be zero. This is done by setting the upper and lower bounds of the picked variables to 0 in the corresponding experiments. This causes IPOPT to treat them as \"fixed variables\" (i.e., it will not optimize over them) that always stick to the only feasible value (which is 0 here). However, this constraint is stricter than the original NChooseK constraint. In combination with other constraints on the same decision variables, this can result in a situation where the constraints cannot be fulfilled even though the original constraints would allow for a solution. For example, consider a problem with four decision variables $x_1, x_2, x_3, x_4$ and an NChooseK constraint on these four variables that restricts the number of nonzero variables to two. Additionally, we have a linear constraint $$ x_3 + x_4 \\geq 0.1 $$ We can easily find points that fulfill both constraints (e.g. $(0,0,0,0.1)$). Now consider the stricter constraint described above. Eventually, it will happen that $x_3$ and $x_4$ are chosen to be zero for one experiment. For this experiment it is impossible to fulfill the linear constraint $x_3 + x_4 \\geq 0.1$, since $x_3 = x_4 = 0$.
Therefore, one has to be very careful when imposing linear constraints upon decision variables that already show up in an NChooseK constraint.
For practical reasons, two NChooseK constraints of the same problem must not share any variables.
Above, you can find an example of a problem with NChooseK constraints and additional linear constraints imposed on the same variables.
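A hypothetical sketch of the conflicting setup described above; note that inequalities are written in the $\\sum_i a_i x_i \\leq b$ form, so $x_3 + x_4 \\geq 0.1$ becomes coefficients [-1, -1] with rhs -0.1:

from bofire.data_models.constraints.api import (
    LinearInequalityConstraint,
    NChooseKConstraint,
)

# at most 2 of x1..x4 may be nonzero
n_choose_k = NChooseKConstraint(
    features=[\"x1\", \"x2\", \"x3\", \"x4\"],
    min_count=0,
    max_count=2,
    none_also_valid=True,
)
# x3 + x4 >= 0.1; infeasible whenever x3 and x4 are both fixed to zero
lower = LinearInequalityConstraint(
    features=[\"x3\", \"x4\"], coefficients=[-1, -1], rhs=-0.1
)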
"},{"location":"optimality_criteria/","title":"Optimality criteria","text":"In\u00a0[1]: Copied!import numpy as np\nimport matplotlib.pyplot as plt\n\nfrom bofire.data_models.constraints.api import (\n NonlinearEqualityConstraint,\n NonlinearInequalityConstraint,\n LinearEqualityConstraint,\n LinearInequalityConstraint,\n)\nfrom bofire.data_models.domain.api import Domain\nfrom bofire.data_models.features.api import ContinuousInput, ContinuousOutput\nfrom bofire.strategies.doe.design import find_local_max_ipopt\nfrom bofire.strategies.enum import OptimalityCriterionEnum\nimport numpy as np import matplotlib.pyplot as plt from bofire.data_models.constraints.api import ( NonlinearEqualityConstraint, NonlinearInequalityConstraint, LinearEqualityConstraint, LinearInequalityConstraint, ) from bofire.data_models.domain.api import Domain from bofire.data_models.features.api import ContinuousInput, ContinuousOutput from bofire.strategies.doe.design import find_local_max_ipopt from bofire.strategies.enum import OptimalityCriterionEnum
In\u00a0[2]: Copied!
# Optimal designs for a quadratic model on the unit square\ndomain = Domain(\n inputs=[ContinuousInput(key=f\"x{i+1}\", bounds=(0, 1)) for i in range(2)],\n outputs=[ContinuousOutput(key=\"y\")],\n)\nmodel_type = \"fully-quadratic\"\nn_experiments = 13\n\ndesigns = {}\nfor obj in OptimalityCriterionEnum:\n designs[obj.value] = find_local_max_ipopt(\n domain,\n model_type=model_type,\n n_experiments=n_experiments,\n objective=obj,\n ipopt_options={\"maxiter\": 300},\n ).to_numpy()\n\nfig = plt.figure(figsize=((8, 8)))\nax = fig.add_subplot(111)\nax.set_title(\"Designs with different optimality criteria\")\nax.set_xlabel(\"$x_1$\")\nax.set_ylabel(\"$x_2$\")\nfor obj, X in designs.items():\n ax.scatter(X[:, 0], X[:, 1], s=40, label=obj)\nax.grid(alpha=0.3)\nax.legend();\n# Optimal designs for a quadratic model on the unit square domain = Domain( inputs=[ContinuousInput(key=f\"x{i+1}\", bounds=(0, 1)) for i in range(2)], outputs=[ContinuousOutput(key=\"y\")], ) model_type = \"fully-quadratic\" n_experiments = 13 designs = {} for obj in OptimalityCriterionEnum: designs[obj.value] = find_local_max_ipopt( domain, model_type=model_type, n_experiments=n_experiments, objective=obj, ipopt_options={\"maxiter\": 300}, ).to_numpy() fig = plt.figure(figsize=((8, 8))) ax = fig.add_subplot(111) ax.set_title(\"Designs with different optimality criteria\") ax.set_xlabel(\"$x_1$\") ax.set_ylabel(\"$x_2$\") for obj, X in designs.items(): ax.scatter(X[:, 0], X[:, 1], s=40, label=obj) ax.grid(alpha=0.3) ax.legend();
\n******************************************************************************\nThis program contains Ipopt, a library for large-scale nonlinear optimization.\n Ipopt is released as open source code under the Eclipse Public License (EPL).\n For more information visit https://github.com/coin-or/Ipopt\n******************************************************************************\n\nIn\u00a0[3]: Copied!
# Space filling design on the unit 2-simplex\ndomain = Domain(\n inputs=[ContinuousInput(key=f\"x{i+1}\", bounds=(0, 1)) for i in range(3)],\n outputs=[ContinuousOutput(key=\"y\")],\n constraints=[\n LinearEqualityConstraint(\n features=[\"x1\", \"x2\", \"x3\"], coefficients=[1, 1, 1], rhs=1\n )\n ],\n)\n\nX = find_local_max_ipopt(\n domain,\n n_experiments=40,\n model_type=\"linear\", # the model type does not matter for space filling designs\n objective=OptimalityCriterionEnum.SPACE_FILLING,\n ipopt_options={\"maxiter\": 500},\n).to_numpy()\n\n\nfig = plt.figure(figsize=((10, 8)))\nax = fig.add_subplot(111, projection=\"3d\")\nax.view_init(45, 20)\nax.set_title(\"Space filling design\")\nax.set_xlabel(\"$x_1$\")\nax.set_ylabel(\"$x_2$\")\nax.set_zlabel(\"$x_3$\")\n\n# plot feasible polytope\nax.plot(xs=[0, 0, 1, 0], ys=[0, 1, 0, 0], zs=[1, 0, 0, 1], linewidth=2)\n\n# plot design points\nax.scatter(xs=X[:, 0], ys=X[:, 1], zs=X[:, 2], s=40)\n# Space filling design on the unit 2-simplex domain = Domain( inputs=[ContinuousInput(key=f\"x{i+1}\", bounds=(0, 1)) for i in range(3)], outputs=[ContinuousOutput(key=\"y\")], constraints=[ LinearEqualityConstraint( features=[\"x1\", \"x2\", \"x3\"], coefficients=[1, 1, 1], rhs=1 ) ], ) X = find_local_max_ipopt( domain, n_experiments=40, model_type=\"linear\", # the model type does not matter for space filling designs objective=OptimalityCriterionEnum.SPACE_FILLING, ipopt_options={\"maxiter\": 500}, ).to_numpy() fig = plt.figure(figsize=((10, 8))) ax = fig.add_subplot(111, projection=\"3d\") ax.view_init(45, 20) ax.set_title(\"Space filling design\") ax.set_xlabel(\"$x_1$\") ax.set_ylabel(\"$x_2$\") ax.set_zlabel(\"$x_3$\") # plot feasible polytope ax.plot(xs=[0, 0, 1, 0], ys=[0, 1, 0, 0], zs=[1, 0, 0, 1], linewidth=2) # plot design points ax.scatter(xs=X[:, 0], ys=X[:, 1], zs=X[:, 2], s=40) Out[3]:
<mpl_toolkits.mplot3d.art3d.Path3DCollection at 0x2ac85e170>In\u00a0[\u00a0]: Copied!
\n"},{"location":"optimality_criteria/#designs-for-different-optimality-criteria","title":"Designs for different optimality criteria\u00b6","text":""},{"location":"optimality_criteria/#space-filling-design","title":"Space filling design\u00b6","text":""},{"location":"ref-constraints/","title":"Domain","text":""},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints","title":"
Constraints (BaseModel, Generic)
","text":"Source code in bofire/data_models/domain/constraints.py
class Constraints(BaseModel, Generic[C]):\n type: Literal[\"Constraints\"] = \"Constraints\"\n constraints: Sequence[C] = Field(default_factory=lambda: [])\n\n def __iter__(self) -> Iterator[C]:\n return iter(self.constraints)\n\n def __len__(self):\n return len(self.constraints)\n\n def __getitem__(self, i) -> C:\n return self.constraints[i]\n\n def __add__(\n self, other: Union[Sequence[CIncludes], \"Constraints[CIncludes]\"]\n ) -> \"Constraints[Union[C, CIncludes]]\":\n if isinstance(other, collections.abc.Sequence):\n other_constraints = other\n else:\n other_constraints = other.constraints\n constraints = list(chain(self.constraints, other_constraints))\n return Constraints(constraints=constraints)\n\n def __call__(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Numerically evaluate all constraints\n\n Args:\n experiments (pd.DataFrame): data to evaluate the constraint on\n\n Returns:\n pd.DataFrame: Constraint evaluation for each of the constraints\n \"\"\"\n return pd.concat([c(experiments) for c in self.constraints], axis=1)\n\n def jacobian(self, experiments: pd.DataFrame) -> list:\n \"\"\"Numerically evaluate the jacobians of all constraints\n\n Args:\n experiments (pd.DataFrame): data to evaluate the constraint jacobians on\n\n Returns:\n list: A list containing the jacobians as pd.DataFrames\n \"\"\"\n return [c.jacobian(experiments) for c in self.constraints]\n\n def is_fulfilled(self, experiments: pd.DataFrame, tol: float = 1e-6) -> pd.Series:\n \"\"\"Check if all constraints are fulfilled on all rows of the provided dataframe\n\n Args:\n experiments (pd.DataFrame): Dataframe with data, the constraint validity should be tested on\n tol (float, optional): tolerance parameter. A constraint is considered as not fulfilled if\n the violation is larger than tol. Defaults to 0.\n\n Returns:\n Boolean: True if all constraints are fulfilled for all rows, false if not\n \"\"\"\n if len(self.constraints) == 0:\n return pd.Series([True] * len(experiments), index=experiments.index)\n return (\n pd.concat(\n [c.is_fulfilled(experiments, tol) for c in self.constraints], axis=1\n )\n .fillna(True)\n .all(axis=1)\n )\n\n def get(\n self,\n includes: Union[Type[CIncludes], Sequence[Type[CIncludes]]] = Constraint,\n excludes: Optional[Union[Type[CExcludes], List[Type[CExcludes]]]] = None,\n exact: bool = False,\n ) -> \"Constraints[CIncludes]\":\n \"\"\"Get constraints of the domain\n\n Args:\n includes: Constraint class or list of specific constraint classes to be returned. Defaults to Constraint.\n excludes: Constraint class or list of specific constraint classes to be excluded from the return. Defaults to None.\n exact: Boolean to distinguish if only the exact class listed in includes and no subclasses inherenting from this class shall be returned. 
Defaults to False.\n\n Returns:\n Constraints: constraints in the domain fitting to the passed requirements.\n \"\"\"\n return Constraints(\n constraints=filter_by_class(\n self.constraints,\n includes=includes,\n excludes=excludes,\n exact=exact,\n )\n )\n\n def get_reps_df(self):\n \"\"\"Provides a tabular overwiev of all constraints within the domain\n\n Returns:\n pd.DataFrame: DataFrame listing all constraints of the domain with a description\n \"\"\"\n df = pd.DataFrame(\n index=range(len(self.constraints)),\n columns=[\"Type\", \"Description\"],\n data={\n \"Type\": [feat.__class__.__name__ for feat in self.get(Constraint)],\n \"Description\": [\n constraint.__str__() for constraint in self.get(Constraint)\n ],\n },\n )\n return df\n
"},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints.__call__","title":"__call__(self, experiments)
special
","text":"Numerically evaluate all constraints
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
data to evaluate the constraint on
requiredReturns:
Type Descriptionpd.DataFrame
Constraint evaluation for each of the constraints
Source code inbofire/data_models/domain/constraints.py
def __call__(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Numerically evaluate all constraints\n\n Args:\n experiments (pd.DataFrame): data to evaluate the constraint on\n\n Returns:\n pd.DataFrame: Constraint evaluation for each of the constraints\n \"\"\"\n return pd.concat([c(experiments) for c in self.constraints], axis=1)\n
"},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints.get","title":"get(self, includes=<class 'bofire.data_models.constraints.constraint.Constraint'>, excludes=None, exact=False)
","text":"Get constraints of the domain
Parameters:
Name Type Description Defaultincludes
Union[Type[~CIncludes], Sequence[Type[~CIncludes]]]
Constraint class or list of specific constraint classes to be returned. Defaults to Constraint.
<class 'bofire.data_models.constraints.constraint.Constraint'>
excludes
Union[Type[~CExcludes], List[Type[~CExcludes]]]
Constraint class or list of specific constraint classes to be excluded from the return. Defaults to None.
None
exact
bool
Boolean to distinguish if only the exact class listed in includes and no subclasses inheriting from this class shall be returned. Defaults to False.
False
Returns:
Type DescriptionConstraints
constraints in the domain fitting to the passed requirements.
Source code inbofire/data_models/domain/constraints.py
def get(\n self,\n includes: Union[Type[CIncludes], Sequence[Type[CIncludes]]] = Constraint,\n excludes: Optional[Union[Type[CExcludes], List[Type[CExcludes]]]] = None,\n exact: bool = False,\n) -> \"Constraints[CIncludes]\":\n \"\"\"Get constraints of the domain\n\n Args:\n includes: Constraint class or list of specific constraint classes to be returned. Defaults to Constraint.\n excludes: Constraint class or list of specific constraint classes to be excluded from the return. Defaults to None.\n exact: Boolean to distinguish if only the exact class listed in includes and no subclasses inherenting from this class shall be returned. Defaults to False.\n\n Returns:\n Constraints: constraints in the domain fitting to the passed requirements.\n \"\"\"\n return Constraints(\n constraints=filter_by_class(\n self.constraints,\n includes=includes,\n excludes=excludes,\n exact=exact,\n )\n )\n
"},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints.get_reps_df","title":"get_reps_df(self)
","text":"Provides a tabular overwiev of all constraints within the domain
Returns:
Type Descriptionpd.DataFrame
DataFrame listing all constraints of the domain with a description
Source code inbofire/data_models/domain/constraints.py
def get_reps_df(self):\n \"\"\"Provides a tabular overwiev of all constraints within the domain\n\n Returns:\n pd.DataFrame: DataFrame listing all constraints of the domain with a description\n \"\"\"\n df = pd.DataFrame(\n index=range(len(self.constraints)),\n columns=[\"Type\", \"Description\"],\n data={\n \"Type\": [feat.__class__.__name__ for feat in self.get(Constraint)],\n \"Description\": [\n constraint.__str__() for constraint in self.get(Constraint)\n ],\n },\n )\n return df\n
"},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints.is_fulfilled","title":"is_fulfilled(self, experiments, tol=1e-06)
","text":"Check if all constraints are fulfilled on all rows of the provided dataframe
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
Dataframe with data on which the constraint validity should be tested
requiredtol
float
tolerance parameter. A constraint is considered as not fulfilled if the violation is larger than tol. Defaults to 1e-6.
1e-06
Returns:
Type DescriptionBoolean
True if all constraints are fulfilled for all rows, false if not
Source code inbofire/data_models/domain/constraints.py
def is_fulfilled(self, experiments: pd.DataFrame, tol: float = 1e-6) -> pd.Series:\n \"\"\"Check if all constraints are fulfilled on all rows of the provided dataframe\n\n Args:\n experiments (pd.DataFrame): Dataframe with data, the constraint validity should be tested on\n tol (float, optional): tolerance parameter. A constraint is considered as not fulfilled if\n the violation is larger than tol. Defaults to 0.\n\n Returns:\n Boolean: True if all constraints are fulfilled for all rows, false if not\n \"\"\"\n if len(self.constraints) == 0:\n return pd.Series([True] * len(experiments), index=experiments.index)\n return (\n pd.concat(\n [c.is_fulfilled(experiments, tol) for c in self.constraints], axis=1\n )\n .fillna(True)\n .all(axis=1)\n )\n
"},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints.jacobian","title":"jacobian(self, experiments)
","text":"Numerically evaluate the jacobians of all constraints
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
data to evaluate the constraint jacobians on
requiredReturns:
Type Descriptionlist
A list containing the jacobians as pd.DataFrames
Source code inbofire/data_models/domain/constraints.py
def jacobian(self, experiments: pd.DataFrame) -> list:\n \"\"\"Numerically evaluate the jacobians of all constraints\n\n Args:\n experiments (pd.DataFrame): data to evaluate the constraint jacobians on\n\n Returns:\n list: A list containing the jacobians as pd.DataFrames\n \"\"\"\n return [c.jacobian(experiments) for c in self.constraints]\n
"},{"location":"ref-domain-util/","title":"Domain","text":""},{"location":"ref-domain-util/#bofire.utils.cheminformatics","title":"cheminformatics
","text":""},{"location":"ref-domain-util/#bofire.utils.cheminformatics.smiles2fingerprints","title":"smiles2fingerprints(smiles, bond_radius=5, n_bits=2048)
","text":"Transforms a list of smiles to an array of morgan fingerprints.
Parameters:
Name Type Description Defaultsmiles
List[str]
List of smiles
requiredbond_radius
int
Bond radius to use. Defaults to 5.
5
n_bits
int
Number of bits. Defaults to 2048.
2048
Returns:
Type Descriptionnp.ndarray
Numpy array holding the fingerprints
Source code inbofire/utils/cheminformatics.py
def smiles2fingerprints(\n smiles: List[str], bond_radius: int = 5, n_bits: int = 2048\n) -> np.ndarray:\n \"\"\"Transforms a list of smiles to an array of morgan fingerprints.\n\n Args:\n smiles (List[str]): List of smiles\n bond_radius (int, optional): Bond radius to use. Defaults to 5.\n n_bits (int, optional): Number of bits. Defaults to 2048.\n\n Returns:\n np.ndarray: Numpy array holding the fingerprints\n \"\"\"\n rdkit_mols = [smiles2mol(m) for m in smiles]\n fps = [\n AllChem.GetMorganFingerprintAsBitVect( # type: ignore\n mol, radius=bond_radius, nBits=n_bits\n )\n for mol in rdkit_mols\n ]\n\n return np.asarray(fps)\n
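A quick usage sketch (requires RDKit; the output shape follows from the source above):

from bofire.utils.cheminformatics import smiles2fingerprints

fps = smiles2fingerprints([\"CC(=O)O\", \"c1ccccc1\"], bond_radius=2, n_bits=32)
assert fps.shape == (2, 32)  # one bit vector per molecule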
"},{"location":"ref-domain-util/#bofire.utils.cheminformatics.smiles2fragments","title":"smiles2fragments(smiles, fragments_list=None)
","text":"Transforms smiles to an array of fragments.
Parameters:
Name Type Description Defaultsmiles
List[str]
List of smiles
requiredReturns:
Type Descriptionnp.ndarray
Array holding the fragment information.
Source code inbofire/utils/cheminformatics.py
def smiles2fragments(\n smiles: List[str], fragments_list: Optional[List[str]] = None\n) -> np.ndarray:\n \"\"\"Transforms smiles to an array of fragments.\n\n Args:\n smiles (List[str]): List of smiles\n\n Returns:\n np.ndarray: Array holding the fragment information.\n \"\"\"\n rdkit_fragment_list = [\n item for item in Descriptors.descList if item[0].startswith(\"fr_\")\n ]\n if fragments_list is None:\n fragments = {d[0]: d[1] for d in rdkit_fragment_list}\n else:\n fragments = {d[0]: d[1] for d in rdkit_fragment_list if d[0] in fragments_list}\n\n frags = np.zeros((len(smiles), len(fragments)))\n for i, smi in enumerate(smiles):\n mol = smiles2mol(smi)\n features = [fragments[d](mol) for d in fragments]\n frags[i, :] = features\n\n return frags\n
"},{"location":"ref-domain-util/#bofire.utils.cheminformatics.smiles2mol","title":"smiles2mol(smiles)
","text":"Transforms a smiles string to an rdkit mol object.
Parameters:
Name Type Description Defaultsmiles
str
Smiles string.
requiredExceptions:
Type DescriptionValueError
If string is not a valid smiles.
Returns:
Type Descriptionrdkit.Mol
rdkit.Mol object
Source code inbofire/utils/cheminformatics.py
def smiles2mol(smiles: str):\n \"\"\"Transforms a smiles string to an rdkit mol object.\n\n Args:\n smiles (str): Smiles string.\n\n Raises:\n ValueError: If string is not a valid smiles.\n\n Returns:\n rdkit.Mol: rdkit.mol object\n \"\"\"\n mol = MolFromSmiles(smiles)\n if mol is None:\n raise ValueError(f\"{smiles} is not a valid smiles string.\")\n return mol\n
"},{"location":"ref-domain-util/#bofire.utils.cheminformatics.smiles2mordred","title":"smiles2mordred(smiles, descriptors_list)
","text":"Transforms list of smiles to mordred moelcular descriptors.
Parameters:
Name Type Description Defaultsmiles
List[str]
List of smiles
requireddescriptors_list
List[str]
List of desired mordred descriptors
requiredReturns:
Type Descriptionnp.ndarray
Array holding the mordred molecular descriptors.
Source code inbofire/utils/cheminformatics.py
def smiles2mordred(smiles: List[str], descriptors_list: List[str]) -> np.ndarray:\n \"\"\"Transforms list of smiles to mordred moelcular descriptors.\n\n Args:\n smiles (List[str]): List of smiles\n descriptors_list (List[str]): List of desired mordred descriptors\n\n Returns:\n np.ndarray: Array holding the mordred moelcular descriptors.\n \"\"\"\n mols = [smiles2mol(smi) for smi in smiles]\n\n calc = Calculator(descriptors, ignore_3D=True)\n calc.descriptors = [d for d in calc.descriptors if str(d) in descriptors_list]\n\n descriptors_df = calc.pandas(mols)\n nan_list = [\n pd.to_numeric(descriptors_df[col], errors=\"coerce\").isnull().values.any()\n for col in descriptors_df.columns\n ]\n if any(nan_list):\n raise ValueError(\n f\"Found NaN values in descriptors {list(descriptors_df.columns[nan_list])}\"\n )\n\n return descriptors_df.astype(float).values\n
"},{"location":"ref-domain-util/#bofire.utils.doe","title":"doe
","text":""},{"location":"ref-domain-util/#bofire.utils.doe.ff2n","title":"ff2n(n_factors)
","text":"Computes the full factorial design for a given number of factors.
Parameters:
Name Type Description Defaultn_factors
int
The number of factors.
requiredReturns:
Type Descriptionndarray
The full factorial design.
Source code inbofire/utils/doe.py
def ff2n(n_factors: int) -> np.ndarray:\n \"\"\"Computes the full factorial design for a given number of factors.\n\n Args:\n n_factors: The number of factors.\n\n Returns:\n The full factorial design.\n \"\"\"\n return np.array(list(itertools.product([-1, 1], repeat=n_factors)))\n
"},{"location":"ref-domain-util/#bofire.utils.doe.fracfact","title":"fracfact(gen)
","text":"Computes the fractional factorial design for a given generator.
Parameters:
Name Type Description Defaultgen
The generator.
requiredReturns:
Type Descriptionndarray
The fractional factorial design.
Source code inbofire/utils/doe.py
def fracfact(gen) -> np.ndarray:\n \"\"\"Computes the fractional factorial design for a given generator.\n\n Args:\n gen: The generator.\n\n Returns:\n The fractional factorial design.\n \"\"\"\n gen = validate_generator(n_factors=gen.count(\" \") + 1, generator=gen)\n\n generators = [item for item in re.split(r\"\\-|\\s|\\+\", gen) if item]\n lengthes = [len(i) for i in generators]\n\n # Indices of single letters (main factors)\n idx_main = [i for i, item in enumerate(lengthes) if item == 1]\n\n # Indices of letter combinations.\n idx_combi = [i for i, item in enumerate(generators) if item != 1]\n\n # Check if there are \"-\" operators in gen\n idx_negative = [\n i for i, item in enumerate(gen.split(\" \")) if item[0] == \"-\"\n ] # remove empty strings\n\n # Fill in design with two level factorial design\n H1 = ff2n(len(idx_main))\n H = np.zeros((H1.shape[0], len(lengthes)))\n H[:, idx_main] = H1\n\n # Recognize combinations and fill in the rest of matrix H2 with the proper\n # products\n for k in idx_combi:\n # For lowercase letters\n xx = np.array([ord(c) for c in generators[k]]) - 97\n\n H[:, k] = np.prod(H1[:, xx], axis=1)\n\n # Update design if gen includes \"-\" operator\n if len(idx_negative) > 0:\n H[:, idx_negative] *= -1\n\n # Return the fractional factorial design\n return H\n
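A quick usage sketch based on the source above; for the generator \"a b ab\" the third column is the product of the first two:

from bofire.utils.doe import fracfact

design = fracfact(\"a b ab\")
# array([[-1., -1.,  1.],
#        [-1.,  1., -1.],
#        [ 1., -1., -1.],
#        [ 1.,  1.,  1.]])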
"},{"location":"ref-domain-util/#bofire.utils.doe.get_alias_structure","title":"get_alias_structure(gen, order=4)
","text":"Computes the alias structure of the design matrix. Works only for generators with positive signs.
Parameters:
Name Type Description Defaultgen
str
The generator.
requiredorder
int
The order up to which the alias structure should be calculated. Defaults to 4.
4
Returns:
Type DescriptionList[str]
The alias structure of the design matrix.
Source code inbofire/utils/doe.py
def get_alias_structure(gen: str, order: int = 4) -> List[str]:\n \"\"\"Computes the alias structure of the design matrix. Works only for generators\n with positive signs.\n\n Args:\n gen: The generator.\n order: The order up to wich the alias structure should be calculated. Defaults to 4.\n\n Returns:\n The alias structure of the design matrix.\n \"\"\"\n design = fracfact(gen)\n\n n_experiments, n_factors = design.shape\n\n all_names = string.ascii_lowercase + \"I\"\n factors = range(n_factors)\n all_combinations = itertools.chain.from_iterable(\n (\n itertools.combinations(factors, n)\n for n in range(1, min(n_factors, order) + 1)\n )\n )\n aliases = {n_experiments * \"+\": [(26,)]} # 26 is mapped to I\n\n for combination in all_combinations:\n # positive sign\n contrast = np.prod(\n design[:, combination], axis=1\n ) # this is the product of the combination\n scontrast = \"\".join(np.where(contrast == 1, \"+\", \"-\").tolist())\n aliases[scontrast] = aliases.get(scontrast, [])\n aliases[scontrast].append(combination) # type: ignore\n\n aliases_list = []\n for alias in aliases.values():\n aliases_list.append(\n sorted(alias, key=lambda a: (len(a), a))\n ) # sort by length and then by the combination\n aliases_list = sorted(\n aliases_list, key=lambda list: ([len(a) for a in list], list)\n ) # sort by the length of the alias\n\n aliases_readable = []\n\n for alias in aliases_list:\n aliases_readable.append(\n \" = \".join([\"\".join([all_names[f] for f in a]) for a in alias])\n )\n\n return aliases_readable\n
"},{"location":"ref-domain-util/#bofire.utils.doe.get_confounding_matrix","title":"get_confounding_matrix(inputs, design, powers=None, interactions=None)
","text":"Analyzes the confounding of a design and returns the confounding matrix.
Only takes continuous features into account.
Parameters:
Name Type Description Defaultinputs
Inputs
Input features.
requireddesign
pd.DataFrame
Design matrix.
requiredpowers
List[int]
List of powers of the individual factors/features that should be considered. Integers have to be larger than 1. Defaults to [].
None
interactions
List[int]
List with interaction levels to be considered. Integers have to be larger than 1. Defaults to [2].
None
Returns:
Type Descriptionpd.DataFrame
The confounding matrix, i.e. the correlation matrix of the (scaled) model terms.
Source code inbofire/utils/doe.py
def get_confounding_matrix(\n inputs: Inputs,\n design: pd.DataFrame,\n powers: Optional[List[int]] = None,\n interactions: Optional[List[int]] = None,\n):\n \"\"\"Analyzes the confounding of a design and returns the confounding matrix.\n\n Only takes continuous features into account.\n\n Args:\n inputs (Inputs): Input features.\n design (pd.DataFrame): Design matrix.\n powers (List[int], optional): List of powers of the individual factors/features that should be considered.\n Integers has to be larger than 1. Defaults to [].\n interactions (List[int], optional): List with interaction levels to be considered.\n Integers has to be larger than 1. Defaults to [2].\n\n Returns:\n _type_: _description_\n \"\"\"\n from sklearn.preprocessing import MinMaxScaler\n\n if len(inputs.get(CategoricalInput)) > 0:\n warnings.warn(\"Categorical input features will be ignored.\")\n\n keys = inputs.get_keys(ContinuousInput)\n scaler = MinMaxScaler(feature_range=(-1, 1))\n scaled_design = pd.DataFrame(\n data=scaler.fit_transform(design[keys]),\n columns=keys,\n )\n\n # add powers\n if powers is not None:\n for p in powers:\n assert p > 1, \"Power has to be at least of degree two.\"\n for key in keys:\n scaled_design[f\"{key}**{p}\"] = scaled_design[key] ** p\n\n # add interactions\n if interactions is None:\n interactions = [2]\n\n for i in interactions:\n assert i > 1, \"Interaction has to be at least of degree two.\"\n assert i < len(keys) + 1, f\"Interaction has to be smaller than {len(keys)+1}.\"\n for combi in itertools.combinations(keys, i):\n scaled_design[\":\".join(combi)] = scaled_design[list(combi)].prod(axis=1)\n\n return scaled_design.corr()\n
"},{"location":"ref-domain-util/#bofire.utils.doe.get_generator","title":"get_generator(n_factors, n_generators)
","text":"Computes a generator for a given number of factors and generators.
Parameters:
Name Type Description Defaultn_factors
int
The number of factors.
requiredn_generators
int
The number of generators.
requiredReturns:
Type Descriptionstr
The generator.
Source code inbofire/utils/doe.py
def get_generator(n_factors: int, n_generators: int) -> str:\n \"\"\"Computes a generator for a given number of factors and generators.\n\n Args:\n n_factors: The number of factors.\n n_generators: The number of generators.\n\n Returns:\n The generator.\n \"\"\"\n if n_generators == 0:\n return \" \".join(list(string.ascii_lowercase[:n_factors]))\n n_base_factors = n_factors - n_generators\n if n_generators == 1:\n if n_base_factors == 1:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n return \" \".join(\n list(string.ascii_lowercase[:n_base_factors])\n + [string.ascii_lowercase[:n_base_factors]]\n )\n n_base_factors = n_factors - n_generators\n if n_base_factors - 1 < 2:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n generators = [\n \"\".join(i)\n for i in (\n itertools.combinations(\n string.ascii_lowercase[:n_base_factors], n_base_factors - 1\n )\n )\n ]\n if len(generators) > n_generators:\n generators = generators[:n_generators]\n elif (n_generators - len(generators) == 1) and (n_base_factors > 1):\n generators += [string.ascii_lowercase[:n_base_factors]]\n elif n_generators - len(generators) >= 1:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n return \" \".join(list(string.ascii_lowercase[:n_base_factors]) + generators)\n
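For example, with 6 factors and 2 generators the first four letters are kept as unconfounded main factors and two three-letter words are used as generators (derived from the source above):

from bofire.utils.doe import get_generator

get_generator(n_factors=6, n_generators=2)  # 'a b c d abc abd'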
"},{"location":"ref-domain-util/#bofire.utils.doe.validate_generator","title":"validate_generator(n_factors, generator)
","text":"Validates the generator and thows an error if it is not valid.
Source code inbofire/utils/doe.py
def validate_generator(n_factors: int, generator: str) -> str:\n \"\"\"Validates the generator and thows an error if it is not valid.\"\"\"\n\n if len(generator.split(\" \")) != n_factors:\n raise ValueError(\"Generator does not match the number of factors.\")\n # clean it and transform it into a list\n generators = [item for item in re.split(r\"\\-|\\s|\\+\", generator) if item]\n lengthes = [len(i) for i in generators]\n\n # Indices of single letters (main factors)\n idx_main = [i for i, item in enumerate(lengthes) if item == 1]\n\n if len(idx_main) == 0:\n raise ValueError(\"At least one unconfounded main factor is needed.\")\n\n # Check that single letters (main factors) are unique\n if len(idx_main) != len({generators[i] for i in idx_main}):\n raise ValueError(\"Main factors are confounded with each other.\")\n\n # Check that single letters (main factors) follow the alphabet\n if (\n \"\".join(sorted([generators[i] for i in idx_main]))\n != string.ascii_lowercase[: len(idx_main)]\n ):\n raise ValueError(\n f'Use the letters `{\" \".join(string.ascii_lowercase[: len(idx_main)])}` for the main factors.'\n )\n\n # Indices of letter combinations.\n idx_combi = [i for i, item in enumerate(generators) if item != 1]\n\n # check that main factors come before combinations\n if min(idx_combi) > max(idx_main):\n raise ValueError(\"Main factors have to come before combinations.\")\n\n # Check that letter combinations are unique\n if len(idx_combi) != len({generators[i] for i in idx_combi}):\n raise ValueError(\"Generators are not unique.\")\n\n # Check that only letters are used in the combinations that are also single letters (main factors)\n if not all(\n set(item).issubset({generators[i] for i in idx_main})\n for item in [generators[i] for i in idx_combi]\n ):\n raise ValueError(\"Generators are not valid.\")\n\n return generator\n
"},{"location":"ref-domain-util/#bofire.utils.multiobjective","title":"multiobjective
","text":""},{"location":"ref-domain-util/#bofire.utils.multiobjective.get_ref_point_mask","title":"get_ref_point_mask(domain, output_feature_keys=None)
","text":"Method to get a mask for the reference points taking into account if we want to maximize or minimize an objective. In case it is maximize the value in the mask is 1, in case we want to minimize it is -1.
Parameters:
Name Type Description Defaultdomain
Domain
Domain for which the mask should be generated.
requiredoutput_feature_keys
Optional[list]
Name of output feature keys that should be considered in the mask. Defaults to None.
None
Returns:
Type Descriptionnp.ndarray
The mask as a numpy array with one entry per considered output feature: +1 for maximization, -1 otherwise (minimization and close-to-target objectives).
Source code inbofire/utils/multiobjective.py
def get_ref_point_mask(\n domain: Domain, output_feature_keys: Optional[list] = None\n) -> np.ndarray:\n \"\"\"Method to get a mask for the reference points taking into account if we\n want to maximize or minimize an objective. In case it is maximize the value\n in the mask is 1, in case we want to minimize it is -1.\n\n Args:\n domain (Domain): Domain for which the mask should be generated.\n output_feature_keys (Optional[list], optional): Name of output feature keys\n that should be considered in the mask. Defaults to None.\n\n Returns:\n np.ndarray: _description_\n \"\"\"\n if output_feature_keys is None:\n output_feature_keys = domain.outputs.get_keys_by_objective(\n includes=[MaximizeObjective, MinimizeObjective, CloseToTargetObjective]\n )\n if len(output_feature_keys) < 2:\n raise ValueError(\"At least two output features have to be provided.\")\n mask = []\n for key in output_feature_keys:\n feat = domain.outputs.get_by_key(key)\n if isinstance(feat.objective, MaximizeObjective): # type: ignore\n mask.append(1.0)\n elif isinstance(feat.objective, MinimizeObjective): # type: ignore\n mask.append(-1.0)\n elif isinstance(feat.objective, CloseToTargetObjective): # type: ignore\n mask.append(-1.0)\n else:\n raise ValueError(\n \"Only `MaximizeObjective` and `MinimizeObjective` supported\"\n )\n return np.array(mask)\n
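A usage sketch (attaching objectives to outputs via the objective keyword is an assumption about the current API):

from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective
from bofire.utils.multiobjective import get_ref_point_mask

domain = Domain(
    inputs=[ContinuousInput(key=\"x\", bounds=(0, 1))],
    outputs=[
        ContinuousOutput(key=\"y1\", objective=MaximizeObjective(w=1.0)),
        ContinuousOutput(key=\"y2\", objective=MinimizeObjective(w=1.0)),
    ],
)
get_ref_point_mask(domain)  # array([ 1., -1.])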
"},{"location":"ref-domain-util/#bofire.utils.naming_conventions","title":"naming_conventions
","text":""},{"location":"ref-domain-util/#bofire.utils.naming_conventions.get_column_names","title":"get_column_names(outputs)
","text":"Specifies column names for given Outputs type.
Parameters:
Name Type Description Defaultoutputs
Outputs
The Outputs object containing the individual outputs.
requiredReturns:
Type DescriptionTuple[List[str], List[str]]
A tuple containing the prediction column names and the standard deviation column names
Source code inbofire/utils/naming_conventions.py
def get_column_names(outputs: Outputs) -> Tuple[List[str], List[str]]:\n \"\"\"\n Specifies column names for given Outputs type.\n\n Args:\n outputs (Outputs): The Outputs object containing the individual outputs.\n\n Returns:\n Tuple[List[str], List[str]]: A tuple containing the prediction column names and the standard deviation column names\n \"\"\"\n pred_cols, sd_cols = [], []\n for featkey in outputs.get_keys(CategoricalOutput): # type: ignore\n pred_cols = pred_cols + [\n f\"{featkey}_{cat}_prob\"\n for cat in outputs.get_by_key(featkey).categories # type: ignore\n ]\n sd_cols = sd_cols + [\n f\"{featkey}_{cat}_sd\"\n for cat in outputs.get_by_key(featkey).categories # type: ignore\n ]\n for featkey in outputs.get_keys(ContinuousOutput): # type: ignore\n pred_cols = pred_cols + [f\"{featkey}_pred\"]\n sd_cols = sd_cols + [f\"{featkey}_sd\"]\n\n return pred_cols, sd_cols\n
"},{"location":"ref-domain-util/#bofire.utils.naming_conventions.postprocess_categorical_predictions","title":"postprocess_categorical_predictions(predictions, outputs)
","text":"Postprocess categorical predictions by finding the maximum probability location
Parameters:
Name Type Description Defaultpredictions
pd.DataFrame
The dataframe containing the predictions.
requiredoutputs
Outputs
The Outputs object containing the individual outputs.
requiredReturns:
Type Descriptionpredictions (pd.DataFrame)
The (potentially modified) original dataframe with categorical predictions added
Source code inbofire/utils/naming_conventions.py
def postprocess_categorical_predictions(predictions: pd.DataFrame, outputs: Outputs) -> pd.DataFrame: # type: ignore\n \"\"\"\n Postprocess categorical predictions by finding the maximum probability location\n\n Args:\n predictions (pd.DataFrame): The dataframe containing the predictions.\n outputs (Outputs): The Outputs object containing the individual outputs.\n\n Returns:\n predictions (pd.DataFrame): The (potentially modified) original dataframe with categorical predictions added\n \"\"\"\n for feat in outputs.get():\n if isinstance(feat, CategoricalOutput): # type: ignore\n predictions.insert(\n loc=0,\n column=f\"{feat.key}_pred\",\n value=predictions.filter(regex=f\"{feat.key}(.*)_prob\")\n .idxmax(1)\n .str.replace(f\"{feat.key}_\", \"\")\n .str.replace(\"_prob\", \"\")\n .values,\n )\n predictions.insert(\n loc=1,\n column=f\"{feat.key}_sd\",\n value=0.0,\n )\n return predictions\n
"},{"location":"ref-domain-util/#bofire.utils.reduce","title":"reduce
","text":""},{"location":"ref-domain-util/#bofire.utils.reduce.AffineTransform","title":" AffineTransform
","text":"Class to switch back and forth from the reduced to the original domain.
Source code inbofire/utils/reduce.py
class AffineTransform:\n \"\"\"Class to switch back and forth from the reduced to the original domain.\"\"\"\n\n def __init__(self, equalities: List[Tuple[str, List[str], List[float]]]):\n \"\"\"Initializes a `AffineTransformation` object.\n\n Args:\n equalities (List[Tuple[str,List[str],List[float]]]): List of equalities. Every equality\n is defined as a tuple, in which the first entry is the key of the reduced feature, the second\n one is a list of feature keys that can be used to compute the feature and the third list of floats\n are the corresponding coefficients.\n \"\"\"\n self.equalities = equalities\n\n def augment_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Restore the eliminated features in a dataframe\n\n Args:\n data (pd.DataFrame): Dataframe that should be restored.\n\n Returns:\n pd.DataFrame: Restored dataframe\n \"\"\"\n if len(self.equalities) == 0:\n return data\n data = data.copy()\n for name_lhs, names_rhs, coeffs in self.equalities:\n data[name_lhs] = coeffs[-1]\n for i, name in enumerate(names_rhs):\n data[name_lhs] += coeffs[i] * data[name]\n return data\n\n def drop_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Drop eliminated features from a dataframe.\n\n Args:\n data (pd.DataFrame): Dataframe with features to be dropped.\n\n Returns:\n pd.DataFrame: Reduced dataframe.\n \"\"\"\n if len(self.equalities) == 0:\n return data\n drop = []\n for name_lhs, _, _ in self.equalities:\n if name_lhs in data.columns:\n drop.append(name_lhs)\n return data.drop(columns=drop)\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.AffineTransform.__init__","title":"__init__(self, equalities)
special
","text":"Initializes a AffineTransformation
object.
Parameters:
Name Type Description Defaultequalities
List[Tuple[str,List[str],List[float]]]
List of equalities. Every equality is defined as a tuple, in which the first entry is the key of the reduced feature, the second one is a list of feature keys that can be used to compute the feature, and the third one is a list of floats holding the corresponding coefficients.
required Source code inbofire/utils/reduce.py
def __init__(self, equalities: List[Tuple[str, List[str], List[float]]]):\n \"\"\"Initializes a `AffineTransformation` object.\n\n Args:\n equalities (List[Tuple[str,List[str],List[float]]]): List of equalities. Every equality\n is defined as a tuple, in which the first entry is the key of the reduced feature, the second\n one is a list of feature keys that can be used to compute the feature and the third list of floats\n are the corresponding coefficients.\n \"\"\"\n self.equalities = equalities\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.AffineTransform.augment_data","title":"augment_data(self, data)
","text":"Restore the eliminated features in a dataframe
Parameters:
Name Type Description Defaultdata
pd.DataFrame
Dataframe that should be restored.
requiredReturns:
Type Descriptionpd.DataFrame
Restored dataframe
Source code inbofire/utils/reduce.py
def augment_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Restore the eliminated features in a dataframe\n\n Args:\n data (pd.DataFrame): Dataframe that should be restored.\n\n Returns:\n pd.DataFrame: Restored dataframe\n \"\"\"\n if len(self.equalities) == 0:\n return data\n data = data.copy()\n for name_lhs, names_rhs, coeffs in self.equalities:\n data[name_lhs] = coeffs[-1]\n for i, name in enumerate(names_rhs):\n data[name_lhs] += coeffs[i] * data[name]\n return data\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.AffineTransform.drop_data","title":"drop_data(self, data)
","text":"Drop eliminated features from a dataframe.
Parameters:
Name Type Description Defaultdata
pd.DataFrame
Dataframe with features to be dropped.
requiredReturns:
Type Descriptionpd.DataFrame
Reduced dataframe.
Source code inbofire/utils/reduce.py
def drop_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Drop eliminated features from a dataframe.\n\n Args:\n data (pd.DataFrame): Dataframe with features to be dropped.\n\n Returns:\n pd.DataFrame: Reduced dataframe.\n \"\"\"\n if len(self.equalities) == 0:\n return data\n drop = []\n for name_lhs, _, _ in self.equalities:\n if name_lhs in data.columns:\n drop.append(name_lhs)\n return data.drop(columns=drop)\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.adjust_boundary","title":"adjust_boundary(feature, coef, rhs)
","text":"Adjusts the boundaries of a feature.
Parameters:
Name Type Description Defaultfeature
ContinuousInput
Feature to be adjusted.
requiredcoef
float
Coefficient.
requiredrhs
float
Right-hand-side of the constraint.
required Source code inbofire/utils/reduce.py
def adjust_boundary(feature: ContinuousInput, coef: float, rhs: float):\n \"\"\"Adjusts the boundaries of a feature.\n\n Args:\n feature (ContinuousInput): Feature to be adjusted.\n coef (float): Coefficient.\n rhs (float): Right-hand-side of the constraint.\n \"\"\"\n boundary = rhs / coef\n if coef > 0:\n if boundary > feature.lower_bound:\n feature.bounds = (boundary, feature.upper_bound)\n else:\n if boundary < feature.upper_bound:\n feature.bounds = (feature.lower_bound, boundary)\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.check_domain_for_reduction","title":"check_domain_for_reduction(domain)
","text":"Check if the reduction can be applied or if a trivial case is present.
Parameters:
Name Type Description Defaultdomain
Domain
Domain to be checked.
requiredReturns:
Type Descriptionbool
True if reducible, else False.
Source code inbofire/utils/reduce.py
def check_domain_for_reduction(domain: Domain) -> bool:\n \"\"\"Check if the reduction can be applied or if a trivial case is present.\n\n Args:\n domain (Domain): Domain to be checked.\n Returns:\n bool: True if reducable, else False.\n \"\"\"\n # are there any constraints?\n if len(domain.constraints) == 0:\n return False\n\n # are there any linear equality constraints?\n linear_equalities = domain.constraints.get(LinearEqualityConstraint)\n if len(linear_equalities) == 0:\n return False\n\n # are there no NChooseKConstraint constraints?\n if len(domain.constraints.get([NChooseKConstraint])) > 0:\n return False\n\n # are there continuous inputs\n continuous_inputs = domain.inputs.get(ContinuousInput)\n if len(continuous_inputs) == 0:\n return False\n\n # check that equality constraints only contain continuous inputs\n for c in linear_equalities:\n assert isinstance(c, LinearConstraint)\n for feat in c.features:\n if feat not in domain.inputs.get_keys(ContinuousInput):\n return False\n return True\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.check_existence_of_solution","title":"check_existence_of_solution(A_aug)
","text":"Given an augmented coefficient matrix this function determines the existence (and uniqueness) of solution using the rank theorem.
Source code inbofire/utils/reduce.py
def check_existence_of_solution(A_aug):\n \"\"\"Given an augmented coefficient matrix this function determines the existence (and uniqueness) of solution using the rank theorem.\"\"\"\n A = A_aug[:, :-1]\n b = A_aug[:, -1]\n len_inputs = np.shape(A)[1]\n\n # catch special cases\n rk_A_aug = np.linalg.matrix_rank(A_aug)\n rk_A = np.linalg.matrix_rank(A)\n\n if rk_A == rk_A_aug:\n if rk_A < len_inputs:\n return # all good\n else:\n x = np.linalg.solve(A, b)\n raise Exception(\n f\"There is a unique solution x for the linear equality constraints: x={x}\"\n )\n elif rk_A < rk_A_aug:\n raise Exception(\n \"There is no solution fulfilling the linear equality constraints.\"\n )\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.reduce_domain","title":"reduce_domain(domain)
","text":"Reduce a domain with linear equality constraints to a subdomain where linear equality constraints are eliminated.
Parameters:
Name Type Description Defaultdomain
Domain
Domain to be reduced.
requiredReturns:
Type DescriptionTuple[Domain, AffineTransform]
The reduced domain and the corresponding transformation to switch between the reduced and original domain.
Source code inbofire/utils/reduce.py
def reduce_domain(domain: Domain) -> Tuple[Domain, AffineTransform]:\n \"\"\"Reduce a domain with linear equality constraints to a subdomain where linear equality constraints are eliminated.\n\n Args:\n domain (Domain): Domain to be reduced.\n\n Returns:\n Tuple[Domain, AffineTransform]: reduced domain and the according transformation to switch between the\n reduced and orginal domain.\n \"\"\"\n # check if the domain can be reduced\n if not check_domain_for_reduction(domain):\n return domain, AffineTransform([])\n\n # find linear equality constraints\n linear_equalities = domain.constraints.get(LinearEqualityConstraint)\n other_constraints = domain.constraints.get(\n Constraint, excludes=[LinearEqualityConstraint]\n )\n\n # only consider continuous inputs\n continuous_inputs = [\n cast(ContinuousInput, f) for f in domain.inputs.get(ContinuousInput)\n ]\n other_inputs = domain.inputs.get(Input, excludes=[ContinuousInput])\n\n # assemble Matrix A from equality constraints\n N = len(linear_equalities)\n M = len(continuous_inputs) + 1\n names = np.concatenate(([feat.key for feat in continuous_inputs], [\"rhs\"]))\n\n A_aug = pd.DataFrame(data=np.zeros(shape=(N, M)), columns=names)\n\n for i in range(len(linear_equalities)):\n c = linear_equalities[i]\n assert isinstance(c, LinearEqualityConstraint)\n A_aug.loc[i, c.features] = c.coefficients # type: ignore\n A_aug.loc[i, \"rhs\"] = c.rhs\n A_aug = A_aug.values\n\n # catch special cases\n check_existence_of_solution(A_aug)\n\n # bring A_aug to reduced row-echelon form\n A_aug_rref, pivots = rref(A_aug)\n pivots = np.array(pivots)\n A_aug_rref = np.array(A_aug_rref).astype(np.float64)\n\n # formulate box bounds as linear inequality constraints in matrix form\n B = np.zeros(shape=(2 * (M - 1), M))\n B[: M - 1, : M - 1] = np.eye(M - 1)\n B[M - 1 :, : M - 1] = -np.eye(M - 1)\n\n B[: M - 1, -1] = np.array([feat.upper_bound for feat in continuous_inputs])\n B[M - 1 :, -1] = -1.0 * np.array([feat.lower_bound for feat in continuous_inputs])\n\n # eliminate columns with pivot element\n for i in range(len(pivots)):\n p = pivots[i]\n B[p, :] -= A_aug_rref[i, :]\n B[p + M - 1, :] += A_aug_rref[i, :]\n\n # build up reduced domain\n _domain = Domain.model_construct(\n # _fields_set = {\"inputs\", \"outputs\", \"constraints\"}\n inputs=deepcopy(other_inputs),\n outputs=deepcopy(domain.outputs),\n constraints=deepcopy(other_constraints),\n )\n new_inputs = [\n deepcopy(feat) for i, feat in enumerate(continuous_inputs) if i not in pivots\n ]\n all_inputs = _domain.inputs + new_inputs\n assert isinstance(all_inputs, Inputs)\n _domain.inputs.features = all_inputs.features\n\n constraints: List[AnyConstraint] = []\n for i in pivots:\n # reduce equation system of upper bounds\n ind = np.where(B[i, :-1] != 0)[0]\n if len(ind) > 0 and B[i, -1] < np.inf:\n if len(list(names[ind])) > 1:\n c = LinearInequalityConstraint.from_greater_equal(\n features=list(names[ind]),\n coefficients=(-1.0 * B[i, ind]).tolist(),\n rhs=B[i, -1] * -1.0,\n )\n constraints.append(c)\n else:\n key = names[ind][0]\n feat = cast(ContinuousInput, _domain.inputs.get_by_key(key))\n adjust_boundary(feat, (-1.0 * B[i, ind])[0], B[i, -1] * -1.0)\n else:\n if B[i, -1] < -1e-16:\n raise Exception(\"There is no solution that fulfills the constraints.\")\n\n # reduce equation system of lower bounds\n ind = np.where(B[i + M - 1, :-1] != 0)[0]\n if len(ind) > 0 and B[i + M - 1, -1] < np.inf:\n if len(list(names[ind])) > 1:\n c = LinearInequalityConstraint.from_greater_equal(\n features=list(names[ind]),\n 
coefficients=(-1.0 * B[i + M - 1, ind]).tolist(),\n rhs=B[i + M - 1, -1] * -1.0,\n )\n constraints.append(c)\n else:\n key = names[ind][0]\n feat = cast(ContinuousInput, _domain.inputs.get_by_key(key))\n adjust_boundary(\n feat,\n (-1.0 * B[i + M - 1, ind])[0],\n B[i + M - 1, -1] * -1.0,\n )\n else:\n if B[i + M - 1, -1] < -1e-16:\n raise Exception(\"There is no solution that fulfills the constraints.\")\n\n if len(constraints) > 0:\n _domain.constraints.constraints = _domain.constraints.constraints + constraints # type: ignore\n\n # assemble equalities\n _equalities = []\n for i in range(len(pivots)):\n name_lhs = names[pivots[i]]\n names_rhs = []\n coeffs = []\n\n for j in range(len(names) - 1):\n if A_aug_rref[i, j] != 0 and j != pivots[i]:\n coeffs.append(-A_aug_rref[i, j])\n names_rhs.append(names[j])\n\n coeffs.append(A_aug_rref[i, -1])\n\n _equalities.append((name_lhs, names_rhs, coeffs))\n\n trafo = AffineTransform(_equalities)\n # remove remaining dependencies of eliminated inputs from the problem\n _domain = remove_eliminated_inputs(_domain, trafo)\n return _domain, trafo\n
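For illustration, here is a minimal sketch of how reduce_domain can be applied to a mixture-type domain. The data-model import paths and the AffineTransform.augment_data helper are assumptions based on a current BoFire release:

import pandas as pd
from bofire.data_models.constraints.api import LinearEqualityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.reduce import reduce_domain

# three mixture components that must sum to one
domain = Domain.from_lists(
    inputs=[ContinuousInput(key=f"x{i}", bounds=(0, 1)) for i in (1, 2, 3)],
    outputs=[ContinuousOutput(key="y")],
    constraints=[
        LinearEqualityConstraint(
            features=["x1", "x2", "x3"], coefficients=[1.0, 1.0, 1.0], rhs=1.0
        )
    ],
)

reduced, transform = reduce_domain(domain)
# the pivot variable x1 has been eliminated; its box bounds survive as
# linear inequality constraints on the remaining inputs
print(reduced.inputs.get_keys())  # ["x2", "x3"]

# map data over the reduced inputs back to the full space
# (augment_data is assumed to re-add the eliminated column, here x1 = 1 - x2 - x3)
candidates = pd.DataFrame({"x2": [0.2], "x3": [0.3]})
print(transform.augment_data(candidates))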
"},{"location":"ref-domain-util/#bofire.utils.reduce.remove_eliminated_inputs","title":"remove_eliminated_inputs(domain, transform)
","text":"Eliminates remaining occurences of eliminated inputs in linear constraints.
Parameters:
Name Type Description Defaultdomain
Domain
Domain in which the linear constraints should be purged.
requiredtransform
AffineTransform
Affine transformation object that defines the obsolete features.
requiredExceptions:
Type DescriptionValueError
If a feature occurs in a constraint other than a linear one.
Returns:
Type DescriptionDomain
Purged domain.
Source code inbofire/utils/reduce.py
def remove_eliminated_inputs(domain: Domain, transform: AffineTransform) -> Domain:\n \"\"\"Eliminates remaining occurences of eliminated inputs in linear constraints.\n\n Args:\n domain (Domain): Domain in which the linear constraints should be purged.\n transform (AffineTransform): Affine transformation object that defines the obsolete features.\n\n Raises:\n ValueError: If feature occurs in a constraint different from a linear one.\n\n Returns:\n Domain: Purged domain.\n \"\"\"\n inputs_names = domain.inputs.get_keys()\n M = len(inputs_names)\n\n # write the equalities for the backtransformation into one matrix\n inputs_dict = {inputs_names[i]: i for i in range(M)}\n\n # build up dict from domain.equalities e.g. {\"xi1\": [coeff(xj1), ..., coeff(xjn)], ... \"xik\":...}\n coeffs_dict = {}\n for e in transform.equalities:\n coeffs = np.zeros(M + 1)\n for j, name in enumerate(e[1]):\n coeffs[inputs_dict[name]] = e[2][j]\n coeffs[-1] = e[2][-1]\n coeffs_dict[e[0]] = coeffs\n\n constraints = []\n for c in domain.constraints.get():\n # Nonlinear constraints not supported\n if not isinstance(c, LinearConstraint):\n raise ValueError(\n \"Elimination of variables is only supported for LinearEquality and LinearInequality constraints.\"\n )\n\n # no changes, if the constraint does not contain eliminated inputs\n elif all(name in inputs_names for name in c.features):\n constraints.append(c)\n\n # remove inputs from the constraint that were eliminated from the inputs before\n else:\n totally_removed = False\n _features = np.array(inputs_names)\n _rhs = c.rhs\n\n # create new lhs and rhs from the old one and knowledge from problem._equalities\n _coefficients = np.zeros(M)\n for j, name in enumerate(c.features):\n if name in inputs_names:\n _coefficients[inputs_dict[name]] += c.coefficients[j]\n else:\n _coefficients += c.coefficients[j] * coeffs_dict[name][:-1]\n _rhs -= c.coefficients[j] * coeffs_dict[name][-1]\n\n _features = _features[np.abs(_coefficients) > 1e-16]\n _coefficients = _coefficients[np.abs(_coefficients) > 1e-16]\n _c = None\n if isinstance(c, LinearEqualityConstraint):\n if len(_features) > 1:\n _c = LinearEqualityConstraint(\n features=_features.tolist(),\n coefficients=_coefficients.tolist(),\n rhs=_rhs,\n )\n elif len(_features) == 0:\n totally_removed = True\n else:\n feat: ContinuousInput = ContinuousInput(\n **domain.inputs.get_by_key(_features[0]).model_dump()\n )\n feat.bounds = (_coefficients[0], _coefficients[0])\n totally_removed = True\n else:\n if len(_features) > 1:\n _c = LinearInequalityConstraint(\n features=_features.tolist(),\n coefficients=_coefficients.tolist(),\n rhs=_rhs,\n )\n elif len(_features) == 0:\n totally_removed = True\n else:\n feat = cast(ContinuousInput, domain.inputs.get_by_key(_features[0]))\n adjust_boundary(feat, _coefficients[0], _rhs)\n totally_removed = True\n\n # check if constraint is always fulfilled/not fulfilled\n if not totally_removed:\n assert _c is not None\n if len(_c.features) == 0 and _c.rhs >= 0:\n pass\n elif len(_c.features) == 0 and _c.rhs < 0:\n raise Exception(\"Linear constraints cannot be fulfilled.\")\n elif np.isinf(_c.rhs):\n pass\n else:\n constraints.append(_c)\n domain.constraints = Constraints(constraints=constraints)\n return domain\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.rref","title":"rref(A, tol=1e-08)
","text":"Computes the reduced row echelon form of a Matrix
Parameters:
Name Type Description DefaultA
ndarray
2d array representing a matrix.
requiredtol
float
tolerance for rounding to 0. Defaults to 1e-8.
1e-08
Returns:
Type DescriptionTuple[numpy.ndarray, List[int]]
(A_rref, pivots), where A_rref is the reduced row echelon form of A and pivots is a list containing the pivot column indices of A_rref
Source code inbofire/utils/reduce.py
def rref(A: np.ndarray, tol: float = 1e-8) -> Tuple[np.ndarray, List[int]]:\n \"\"\"Computes the reduced row echelon form of a matrix.\n\n Args:\n A (ndarray): 2d array representing a matrix.\n tol (float, optional): tolerance for rounding to 0. Defaults to 1e-8.\n\n Returns:\n (A_rref, pivots), where A_rref is the reduced row echelon form of A and pivots\n is a list containing the pivot column indices of A_rref\n \"\"\"\n A = np.array(A, dtype=np.float64)\n n, m = np.shape(A)\n\n col = 0\n row = 0\n pivots = []\n\n for col in range(m):\n # does a pivot element exist?\n if all(np.abs(A[row:, col]) < tol):\n pass\n # if yes: start elimination\n else:\n pivots.append(col)\n max_row = np.argmax(np.abs(A[row:, col])) + row\n # switch to most stable row\n A[[row, max_row], :] = A[[max_row, row], :] # type: ignore\n # normalize row\n A[row, :] /= A[row, col]\n # eliminate other elements from column\n for r in range(n):\n if r != row:\n A[r, :] -= A[r, col] / A[row, col] * A[row, :]\n row += 1\n\n prec = int(-np.log10(tol))\n return np.round(A, prec), pivots\n
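A small worked example of rref on a 2x3 matrix (a sketch; only numpy and the function above are involved):

import numpy as np
from bofire.utils.reduce import rref

A = np.array([[2.0, 4.0, 8.0],
              [1.0, 2.0, 3.0]])
A_rref, pivots = rref(A)
# A_rref == [[1., 2., 0.],
#            [0., 0., 1.]]
# pivots == [0, 2]: columns 0 and 2 hold the pivot elements,
# column 1 corresponds to a free variable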
"},{"location":"ref-domain-util/#bofire.utils.subdomain","title":"subdomain
","text":""},{"location":"ref-domain-util/#bofire.utils.subdomain.get_subdomain","title":"get_subdomain(domain, feature_keys)
","text":"removes all features not defined as argument creating a subdomain of the provided domain
Parameters:
Name Type Description Defaultdomain
Domain
the original domain from which a subdomain should be created
requiredfeature_keys
List
List of features that shall be included in the subdomain
requiredExceptions:
Type DescriptionAssert
when fewer than 2 features are provided in total
ValueError
when a provided feature key is not present in the provided domain
Assert
when no output feature is provided
Assert
when no input feature is provided
ValueError
when a removed input feature is used in a constraint
Returns:
Type DescriptionDomain
A new domain containing only the specified features of the original domain
Source code inbofire/utils/subdomain.py
def get_subdomain(\n domain: Domain,\n feature_keys: List,\n) -> Domain:\n \"\"\"Removes all features not given in feature_keys, creating a subdomain of the provided domain.\n\n Args:\n domain (Domain): the original domain from which a subdomain should be created\n feature_keys (List): List of features that shall be included in the subdomain\n\n Raises:\n Assert: when fewer than 2 features are provided in total\n ValueError: when a provided feature key is not present in the provided domain\n Assert: when no output feature is provided\n Assert: when no input feature is provided\n ValueError: when a removed input feature is used in a constraint\n\n Returns:\n Domain: A new domain containing only the specified features of the original domain\n \"\"\"\n assert len(feature_keys) >= 2, \"At least two features have to be provided.\"\n outputs = []\n inputs = []\n for key in feature_keys:\n try:\n feat = (domain.inputs + domain.outputs).get_by_key(key)\n except KeyError:\n raise ValueError(f\"Feature {key} not present in domain.\")\n if isinstance(feat, Input):\n inputs.append(feat)\n else:\n outputs.append(feat)\n assert len(outputs) > 0, \"At least one output feature has to be provided.\"\n assert len(inputs) > 0, \"At least one input feature has to be provided.\"\n inputs = Inputs(features=inputs)\n outputs = Outputs(features=outputs)\n # loop over constraints and make sure that all features used in constraints are in the input_feature_keys\n for c in domain.constraints:\n for key in c.features: # type: ignore\n if key not in inputs.get_keys():\n raise ValueError(\n f\"Removed input feature {key} is used in a constraint.\"\n )\n subdomain = deepcopy(domain)\n subdomain.inputs = inputs\n subdomain.outputs = outputs\n return subdomain\n
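A minimal usage sketch (the data-model import paths are assumptions based on a current BoFire release):

from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.subdomain import get_subdomain

domain = Domain.from_lists(
    inputs=[ContinuousInput(key=k, bounds=(0, 1)) for k in ("x1", "x2", "x3")],
    outputs=[ContinuousOutput(key="y")],
)

# keep only x1, x2 and the output y; x3 is dropped
sub = get_subdomain(domain, feature_keys=["x1", "x2", "y"])
print(sub.inputs.get_keys())   # ["x1", "x2"]
print(sub.outputs.get_keys())  # ["y"]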
"},{"location":"ref-domain-util/#bofire.utils.torch_tools","title":"torch_tools
","text":""},{"location":"ref-domain-util/#bofire.utils.torch_tools.constrained_objective2botorch","title":"constrained_objective2botorch(idx, objective, eps=1e-08)
","text":"Create a callable that can be used by botorch.utils.objective.apply_constraints
to set up output-constrained optimizations.
Parameters:
Name Type Description Defaultidx
int
Index of the constraint objective in the list of outputs.
requiredobjective
ConstrainedObjective
The objective that should be transformed.
requiredReturns:
Type DescriptionTuple[List[Callable[[Tensor], Tensor]], List[float], int]
List of callables that can be used by botorch for setting up the constrained objective, list of the corresponding botorch eta values, final index used by the method (to track for categorical variables)
Source code inbofire/utils/torch_tools.py
def constrained_objective2botorch(\n idx: int, objective: ConstrainedObjective, eps: float = 1e-8\n) -> Tuple[List[Callable[[Tensor], Tensor]], List[float], int]:\n \"\"\"Create a callable that can be used by `botorch.utils.objective.apply_constraints`\n to setup ouput constrained optimizations.\n\n Args:\n idx (int): Index of the constraint objective in the list of outputs.\n objective (BotorchConstrainedObjective): The objective that should be transformed.\n\n Returns:\n Tuple[List[Callable[[Tensor], Tensor]], List[float], int]: List of callables that can be used by botorch for setting up the constrained objective,\n list of the corresponding botorch eta values, final index used by the method (to track for categorical variables)\n \"\"\"\n assert isinstance(\n objective, ConstrainedObjective\n ), \"Objective is not a `ConstrainedObjective`.\"\n if isinstance(objective, MaximizeSigmoidObjective):\n return (\n [lambda Z: (Z[..., idx] - objective.tp) * -1.0],\n [1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, MinimizeSigmoidObjective):\n return (\n [lambda Z: (Z[..., idx] - objective.tp)],\n [1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, TargetObjective):\n return (\n [\n lambda Z: (Z[..., idx] - (objective.target_value - objective.tolerance))\n * -1.0,\n lambda Z: (\n Z[..., idx] - (objective.target_value + objective.tolerance)\n ),\n ],\n [1.0 / objective.steepness, 1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, ConstrainedCategoricalObjective):\n # The output of a categorical objective has final dim `c` where `c` is number of classes\n # Pass in the expected acceptance probability and perform an inverse sigmoid to atain the original probabilities\n return (\n [\n lambda Z: torch.log(\n 1\n / torch.clamp(\n (\n Z[..., idx : idx + len(objective.desirability)]\n * torch.tensor(objective.desirability).to(**tkwargs)\n ).sum(-1),\n min=eps,\n max=1 - eps,\n )\n - 1,\n )\n ],\n [1.0],\n idx + len(objective.desirability),\n )\n else:\n raise ValueError(f\"Objective {objective.__class__.__name__} not known.\")\n
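A sketch of the resulting sign convention, using a MaximizeSigmoidObjective (the objective import path and its field names w, tp, and steepness are assumptions based on the BoFire data models):

import torch
from bofire.data_models.objectives.api import MaximizeSigmoidObjective
from bofire.utils.torch_tools import constrained_objective2botorch

# output 0 should lie above the turning point tp = 0.5
objective = MaximizeSigmoidObjective(w=1.0, tp=0.5, steepness=10.0)
callables, etas, next_idx = constrained_objective2botorch(0, objective=objective)

Z = torch.tensor([[0.7], [0.3]])  # samples, last dim indexes the outputs
print(callables[0](Z))  # tensor([-0.2000, 0.2000]); negative values are feasible
print(etas)             # [0.1] == 1 / steepness
print(next_idx)         # 1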
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_initial_conditions_generator","title":"get_initial_conditions_generator(strategy, transform_specs, ask_options=None, sequential=True)
","text":"Takes a strategy object and returns a callable which uses this strategy to return a generator callable which can be used in botorchs
gen_batch_initial_conditions` to generate samples.
Parameters:
Name Type Description Defaultstrategy
Strategy
Strategy that should be used to generate samples.
requiredtransform_specs
Dict
Dictionary indicating how the samples should be transformed.
requiredask_options
Dict
Dictionary of keyword arguments that are passed to the ask
method of the strategy. Defaults to {}.
None
sequential
bool
If True, samples for every q-batch are generated independently of each other. If False, the n x q
samples are generated at once.
True
Returns:
Type DescriptionCallable[[int, int, int], Tensor]
Callable that can be passed to batch_initial_conditions.
Source code inbofire/utils/torch_tools.py
def get_initial_conditions_generator(\n strategy: Strategy,\n transform_specs: Dict,\n ask_options: Optional[Dict] = None,\n sequential: bool = True,\n) -> Callable[[int, int, int], Tensor]:\n \"\"\"Takes a strategy object and returns a callable which uses this\n strategy to return a generator callable which can be used in botorch`s\n `gen_batch_initial_conditions` to generate samples.\n\n Args:\n strategy (Strategy): Strategy that should be used to generate samples.\n transform_specs (Dict): Dictionary indicating how the samples should be\n transformed.\n ask_options (Dict, optional): Dictionary of keyword arguments that are\n passed to the `ask` method of the strategy. Defaults to {}.\n sequential (bool, optional): If True, samples for every q-batch are\n generate indepenent from each other. If False, the `n x q` samples\n are generated at once.\n\n Returns:\n Callable[[int, int, int], Tensor]: Callable that can be passed to\n `batch_initial_conditions`.\n \"\"\"\n if ask_options is None:\n ask_options = {}\n\n def generator(n: int, q: int, seed: int) -> Tensor:\n if sequential:\n initial_conditions = []\n for _ in range(n):\n candidates = strategy.ask(q, **ask_options)\n # transform it\n transformed_candidates = strategy.domain.inputs.transform(\n candidates, transform_specs\n )\n # transform to tensor\n initial_conditions.append(\n torch.from_numpy(transformed_candidates.values).to(**tkwargs)\n )\n return torch.stack(initial_conditions, dim=0)\n else:\n candidates = strategy.ask(n * q, **ask_options)\n # transform it\n transformed_candidates = strategy.domain.inputs.transform(\n candidates, transform_specs\n )\n return (\n torch.from_numpy(transformed_candidates.values)\n .to(**tkwargs)\n .reshape(n, q, transformed_candidates.shape[1])\n )\n\n return generator\n
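A usage sketch; the RandomStrategy data model and the bofire.strategies.api.map factory are assumptions based on BoFire's strategy API:

import bofire.strategies.api as strategies
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.data_models.strategies.api import RandomStrategy
from bofire.utils.torch_tools import get_initial_conditions_generator

domain = Domain.from_lists(
    inputs=[ContinuousInput(key=k, bounds=(0, 1)) for k in ("x1", "x2")],
    outputs=[ContinuousOutput(key="y")],
)
strategy = strategies.map(RandomStrategy(domain=domain))
generator = get_initial_conditions_generator(strategy=strategy, transform_specs={})
batch = generator(4, 2, 42)  # n=4 q-batches of size q=2, seed 42
print(batch.shape)           # torch.Size([4, 2, 2])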
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_interpoint_constraints","title":"get_interpoint_constraints(domain, n_candidates)
","text":"Converts interpoint equality constraints to linear equality constraints, that can be processed by botorch. For more information, see the docstring of optimize_acqf
in botorch (https://github.com/pytorch/botorch/blob/main/botorch/optim/optimize.py).
Parameters:
Name Type Description Defaultdomain
Domain
Optimization problem definition.
requiredn_candidates
int
Number of candidates that should be requested.
requiredReturns:
Type DescriptionList[Tuple[Tensor, Tensor, float]]
List of tuples, each tuple consists of a tensor with the feature indices, coefficients and a float for the rhs.
Source code inbofire/utils/torch_tools.py
def get_interpoint_constraints(\n domain: Domain, n_candidates: int\n) -> List[Tuple[Tensor, Tensor, float]]:\n \"\"\"Converts interpoint equality constraints to linear equality constraints,\n that can be processed by botorch. For more information, see the docstring\n of `optimize_acqf` in botorch\n (https://github.com/pytorch/botorch/blob/main/botorch/optim/optimize.py).\n\n Args:\n domain (Domain): Optimization problem definition.\n n_candidates (int): Number of candidates that should be requested.\n\n Returns:\n List[Tuple[Tensor, Tensor, float]]: List of tuples, each tuple consists\n of a tensor with the feature indices, coefficients and a float for the rhs.\n \"\"\"\n constraints = []\n for constraint in domain.constraints.get(InterpointEqualityConstraint):\n assert isinstance(constraint, InterpointEqualityConstraint)\n coefficients = torch.tensor([1.0, -1.0]).to(**tkwargs)\n feat_idx = domain.inputs.get_keys(Input).index(constraint.feature)\n feat = domain.inputs.get_by_key(constraint.feature)\n assert isinstance(feat, ContinuousInput)\n if feat.is_fixed():\n continue\n multiplicity = constraint.multiplicity or n_candidates\n for i in range(math.ceil(n_candidates / multiplicity)):\n all_indices = torch.arange(\n i * multiplicity, min((i + 1) * multiplicity, n_candidates)\n )\n for k in range(len(all_indices) - 1):\n indices = torch.tensor(\n [[all_indices[0], feat_idx], [all_indices[k + 1], feat_idx]],\n dtype=torch.int64,\n )\n constraints.append((indices, coefficients, 0.0))\n return constraints\n
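A minimal sketch of the returned tuples (data-model import paths assumed per a current BoFire release):

from bofire.data_models.constraints.api import InterpointEqualityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.torch_tools import get_interpoint_constraints

domain = Domain.from_lists(
    inputs=[ContinuousInput(key=k, bounds=(0, 1)) for k in ("x1", "x2")],
    outputs=[ContinuousOutput(key="y")],
    constraints=[InterpointEqualityConstraint(feature="x1", multiplicity=3)],
)

# with 6 candidates, x1 must be equal within candidates {0,1,2} and {3,4,5}
for indices, coefficients, rhs in get_interpoint_constraints(domain, n_candidates=6):
    print(indices.tolist(), coefficients.tolist(), rhs)
# first tuple: [[0, 0], [1, 0]] [1.0, -1.0] 0.0, i.e. x1(cand 0) == x1(cand 1)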
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_linear_constraints","title":"get_linear_constraints(domain, constraint, unit_scaled=False)
","text":"Converts linear constraints to the form required by BoTorch.
Parameters:
Name Type Description Defaultdomain
Domain
Optimization problem definition.
requiredconstraint
Union[Type[bofire.data_models.constraints.linear.LinearEqualityConstraint], Type[bofire.data_models.constraints.linear.LinearInequalityConstraint]]
Type of constraint that should be converted.
requiredunit_scaled
bool
If True, transforms constraints by assuming that the bounds of the continuous features are [0,1]. Defaults to False.
False
Returns:
Type DescriptionList[Tuple[Tensor, Tensor, float]]
List of tuples, each tuple consists of a tensor with the feature indices, coefficients and a float for the rhs.
Source code inbofire/utils/torch_tools.py
def get_linear_constraints(\n domain: Domain,\n constraint: Union[Type[LinearEqualityConstraint], Type[LinearInequalityConstraint]],\n unit_scaled: bool = False,\n) -> List[Tuple[Tensor, Tensor, float]]:\n \"\"\"Converts linear constraints to the form required by BoTorch.\n\n Args:\n domain: Optimization problem definition.\n constraint: Type of constraint that should be converted.\n unit_scaled: If True, transforms constraints by assuming that the bound for the continuous features are [0,1]. Defaults to False.\n\n Returns:\n List[Tuple[Tensor, Tensor, float]]: List of tuples, each tuple consists of a tensor with the feature indices, coefficients and a float for the rhs.\n \"\"\"\n constraints = []\n for c in domain.constraints.get(constraint):\n indices = []\n coefficients = []\n lower = []\n upper = []\n rhs = 0.0\n for i, featkey in enumerate(c.features): # type: ignore\n idx = domain.inputs.get_keys(Input).index(featkey)\n feat = domain.inputs.get_by_key(featkey)\n if feat.is_fixed(): # type: ignore\n rhs -= feat.fixed_value()[0] * c.coefficients[i] # type: ignore\n else:\n lower.append(feat.lower_bound) # type: ignore\n upper.append(feat.upper_bound) # type: ignore\n indices.append(idx)\n coefficients.append(\n c.coefficients[i] # type: ignore\n ) # if unit_scaled == False else c_scaled.coefficients[i])\n if unit_scaled:\n lower = np.array(lower)\n upper = np.array(upper)\n s = upper - lower\n scaled_coefficients = s * np.array(coefficients)\n constraints.append(\n (\n torch.tensor(indices),\n -torch.tensor(scaled_coefficients).to(**tkwargs),\n -(rhs + c.rhs - np.sum(np.array(coefficients) * lower)), # type: ignore\n )\n )\n else:\n constraints.append(\n (\n torch.tensor(indices),\n -torch.tensor(coefficients).to(**tkwargs),\n -(rhs + c.rhs), # type: ignore\n )\n )\n return constraints\n
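A sketch showing the sign flip from BoFire's `<=` convention for LinearInequalityConstraint to botorch's `sum(coefficients * x) >= rhs` convention (data-model import paths are assumptions):

from bofire.data_models.constraints.api import LinearInequalityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.torch_tools import get_linear_constraints

domain = Domain.from_lists(
    inputs=[ContinuousInput(key=k, bounds=(0, 1)) for k in ("x1", "x2")],
    outputs=[ContinuousOutput(key="y")],
    constraints=[
        # x1 + 2 * x2 <= 0.8
        LinearInequalityConstraint(
            features=["x1", "x2"], coefficients=[1.0, 2.0], rhs=0.8
        )
    ],
)

((indices, coefficients, rhs),) = get_linear_constraints(
    domain, LinearInequalityConstraint
)
print(indices)       # tensor([0, 1])
print(coefficients)  # tensor([-1., -2.], dtype=torch.float64)
print(rhs)           # -0.8, i.e. -x1 - 2*x2 >= -0.8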
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_multiobjective_objective","title":"get_multiobjective_objective(outputs)
","text":"Returns
Parameters:
Name Type Description Defaultoutputs
Outputs
Output features whose objectives define the multi-objective callable.
requiredReturns:
Type DescriptionCallable[[Tensor], Tensor]
Callable that maps posterior samples to a tensor of stacked objective values.
Source code inbofire/utils/torch_tools.py
def get_multiobjective_objective(\n outputs: Outputs,\n) -> Callable[[Tensor, Optional[Tensor]], Tensor]:\n \"\"\"Returns a callable that evaluates all outputs whose objective is a\n MaximizeObjective, MinimizeObjective, or CloseToTargetObjective.\n\n Args:\n outputs (Outputs): Output features whose objectives define the multi-objective callable.\n\n Returns:\n Callable[[Tensor], Tensor]: Callable that maps posterior samples to a tensor of stacked objective values.\n \"\"\"\n callables = [\n get_objective_callable(idx=i, objective=feat.objective) # type: ignore\n for i, feat in enumerate(outputs.get())\n if feat.objective is not None # type: ignore\n and isinstance(\n feat.objective, # type: ignore\n (MaximizeObjective, MinimizeObjective, CloseToTargetObjective),\n )\n ]\n\n def objective(samples: Tensor, X: Optional[Tensor] = None) -> Tensor:\n return torch.stack([c(samples, None) for c in callables], dim=-1)\n\n return objective\n
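A short usage sketch (data-model import paths assumed per a current BoFire release):

import torch
from bofire.data_models.domain.api import Outputs
from bofire.data_models.features.api import ContinuousOutput
from bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective
from bofire.utils.torch_tools import get_multiobjective_objective

outputs = Outputs(
    features=[
        ContinuousOutput(key="y1", objective=MaximizeObjective(w=1.0)),
        ContinuousOutput(key="y2", objective=MinimizeObjective(w=1.0)),
    ]
)
objective = get_multiobjective_objective(outputs)

samples = torch.rand(32, 2)  # posterior samples, last dim = outputs
values = objective(samples)  # one column of objective values per output
print(values.shape)          # torch.Size([32, 2])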
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_nchoosek_constraints","title":"get_nchoosek_constraints(domain)
","text":"Transforms NChooseK constraints into a list of non-linear inequality constraint callables that can be parsed by pydantic. For this purpose the NChooseK constraint is continuously relaxed by countig the number of zeros in a candidate by a sum of narrow gaussians centered at zero.
Parameters:
Name Type Description Defaultdomain
Domain
Optimization problem definition.
requiredReturns:
Type DescriptionList[Callable[[Tensor], float]]
List of callables that can be used as nonlinear inequality constraints in botorch.
Source code inbofire/utils/torch_tools.py
def get_nchoosek_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"Transforms NChooseK constraints into a list of non-linear inequality constraint callables\n that can be processed by botorch. For this purpose the NChooseK constraint is continuously\n relaxed by counting the number of zeros in a candidate with a sum of narrow Gaussians centered\n at zero.\n\n Args:\n domain (Domain): Optimization problem definition.\n\n Returns:\n List[Callable[[Tensor], float]]: List of callables that can be used\n as nonlinear inequality constraints in botorch.\n \"\"\"\n\n def narrow_gaussian(x, ell=1e-3):\n return torch.exp(-0.5 * (x / ell) ** 2)\n\n def max_constraint(indices: Tensor, num_features: int, max_count: int):\n return lambda x: narrow_gaussian(x=x[..., indices]).sum(dim=-1) - (\n num_features - max_count\n )\n\n def min_constraint(indices: Tensor, num_features: int, min_count: int):\n return lambda x: -narrow_gaussian(x=x[..., indices]).sum(dim=-1) + (\n num_features - min_count\n )\n\n constraints = []\n # ignore none also valid for the start\n for c in domain.constraints.get(NChooseKConstraint):\n assert isinstance(c, NChooseKConstraint)\n indices = torch.tensor(\n [domain.inputs.get_keys(ContinuousInput).index(key) for key in c.features],\n dtype=torch.int64,\n )\n if c.max_count != len(c.features):\n constraints.append(\n max_constraint(\n indices=indices, num_features=len(c.features), max_count=c.max_count\n )\n )\n if c.min_count > 0:\n constraints.append(\n min_constraint(\n indices=indices, num_features=len(c.features), min_count=c.min_count\n )\n )\n return constraints\n
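A sketch of the relaxation in action (data-model import paths assumed per a current BoFire release):

import torch
from bofire.data_models.constraints.api import NChooseKConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.torch_tools import get_nchoosek_constraints

domain = Domain.from_lists(
    inputs=[ContinuousInput(key=f"x{i}", bounds=(0, 1)) for i in (1, 2, 3, 4)],
    outputs=[ContinuousOutput(key="y")],
    constraints=[
        NChooseKConstraint(
            features=["x1", "x2", "x3", "x4"],
            min_count=0,
            max_count=2,
            none_also_valid=True,
        )
    ],
)

(constraint,) = get_nchoosek_constraints(domain)
# botorch convention: a value >= 0 means the candidate is feasible
print(constraint(torch.tensor([0.5, 0.5, 0.0, 0.0])))  # ~0.0: exactly 2 non-zeros
print(constraint(torch.tensor([0.5, 0.5, 0.5, 0.0])))  # ~-1.0: one non-zero too many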
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_nonlinear_constraints","title":"get_nonlinear_constraints(domain)
","text":"Returns a list of callable functions that represent the nonlinear constraints for the given domain that can be processed by botorch.
Parameters:
Name Type Description Defaultdomain
Domain
The domain for which to generate the nonlinear constraints.
requiredReturns:
Type DescriptionList[Callable[[Tensor], float]]
A list of callable functions that take a tensor as input and return a float value representing the constraint evaluation.
Source code inbofire/utils/torch_tools.py
def get_nonlinear_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"\n Returns a list of callable functions that represent the nonlinear constraints\n for the given domain that can be processed by botorch.\n\n Parameters:\n domain (Domain): The domain for which to generate the nonlinear constraints.\n\n Returns:\n List[Callable[[Tensor], float]]: A list of callable functions that take a tensor\n as input and return a float value representing the constraint evaluation.\n \"\"\"\n return get_nchoosek_constraints(domain) + get_product_constraints(domain)\n
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_output_constraints","title":"get_output_constraints(outputs)
","text":"Method to translate output constraint objectives into a list of callables and list of etas for use in botorch.
Parameters:
Name Type Description Defaultoutputs
Outputs
Output feature object that should be processed.
requiredReturns:
Type DescriptionTuple[List[Callable[[Tensor], Tensor]], List[float]]
List of constraint callables, list of associated etas.
Source code inbofire/utils/torch_tools.py
def get_output_constraints(\n outputs: Outputs,\n) -> Tuple[List[Callable[[Tensor], Tensor]], List[float]]:\n \"\"\"Method to translate output constraint objectives into a list of\n callables and list of etas for use in botorch.\n\n Args:\n outputs (Outputs): Output feature object that should\n be processed.\n\n Returns:\n Tuple[List[Callable[[Tensor], Tensor]], List[float]]: List of constraint callables,\n list of associated etas.\n \"\"\"\n constraints = []\n etas = []\n idx = 0\n for feat in outputs.get():\n if isinstance(feat.objective, ConstrainedObjective): # type: ignore\n iconstraints, ietas, idx = constrained_objective2botorch(\n idx,\n objective=feat.objective, # type: ignore\n )\n constraints += iconstraints\n etas += ietas\n else:\n idx += 1\n return constraints, etas\n
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_product_constraints","title":"get_product_constraints(domain)
","text":"Returns a list of nonlinear constraint functions that can be processed by botorch based on the given domain.
Parameters:
Name Type Description Defaultdomain
Domain
The domain object containing the constraints.
requiredReturns:
Type DescriptionList[Callable[[Tensor], float]]
A list of product constraint functions.
Source code inbofire/utils/torch_tools.py
def get_product_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"\n Returns a list of nonlinear constraint functions that can be processed by botorch\n based on the given domain.\n\n Args:\n domain (Domain): The domain object containing the constraints.\n\n Returns:\n List[Callable[[Tensor], float]]: A list of product constraint functions.\n\n \"\"\"\n\n def product_constraint(indices: Tensor, exponents: Tensor, rhs: float, sign: int):\n return lambda x: -1.0 * sign * (x[..., indices] ** exponents).prod(dim=-1) + rhs\n\n constraints = []\n for c in domain.constraints.get(ProductInequalityConstraint):\n assert isinstance(c, ProductInequalityConstraint)\n indices = torch.tensor(\n [domain.inputs.get_keys(ContinuousInput).index(key) for key in c.features],\n dtype=torch.int64,\n )\n constraints.append(\n product_constraint(indices, torch.tensor(c.exponents), c.rhs, c.sign)\n )\n return constraints\n
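A minimal sketch of the returned callable (data-model import paths assumed per a current BoFire release):

import torch
from bofire.data_models.constraints.api import ProductInequalityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.torch_tools import get_product_constraints

domain = Domain.from_lists(
    inputs=[ContinuousInput(key=k, bounds=(0, 10)) for k in ("x1", "x2")],
    outputs=[ContinuousOutput(key="y")],
    constraints=[
        ProductInequalityConstraint(
            features=["x1", "x2"], exponents=[1, 1], rhs=2.0, sign=1
        )
    ],
)

(constraint,) = get_product_constraints(domain)
# per the source above, the callable returns rhs - sign * prod(x ** exponents),
# so a value >= 0 corresponds to sign * prod(x ** exponents) <= rhs
print(constraint(torch.tensor([1.0, 1.0])))  # tensor(1.): feasible
print(constraint(torch.tensor([2.0, 3.0])))  # tensor(-4.): infeasible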
"},{"location":"ref-domain/","title":"Domain","text":""},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain","title":" Domain (BaseModel)
","text":"Source code in bofire/data_models/domain/domain.py
class Domain(BaseModel):\n type: Literal[\"Domain\"] = \"Domain\"\n\n inputs: Inputs = Field(default_factory=lambda: Inputs())\n outputs: Outputs = Field(default_factory=lambda: Outputs())\n constraints: Constraints = Field(default_factory=lambda: Constraints())\n\n \"\"\"Representation of the optimization problem/domain\n\n Attributes:\n inputs (List[Input], optional): List of input features. Defaults to [].\n outputs (List[Output], optional): List of output features. Defaults to [].\n constraints (List[Constraint], optional): List of constraints. Defaults to [].\n \"\"\"\n\n @classmethod\n def from_lists(\n cls,\n inputs: Optional[Sequence[AnyInput]] = None,\n outputs: Optional[Sequence[AnyOutput]] = None,\n constraints: Optional[Sequence[AnyConstraint]] = None,\n ):\n inputs = [] if inputs is None else inputs\n outputs = [] if outputs is None else outputs\n constraints = [] if constraints is None else constraints\n return cls(\n inputs=Inputs(features=inputs),\n outputs=Outputs(features=outputs),\n constraints=Constraints(constraints=constraints),\n )\n\n @field_validator(\"inputs\", mode=\"before\")\n @classmethod\n def validate_inputs_list(cls, v):\n if isinstance(v, collections.abc.Sequence):\n v = Inputs(features=v)\n return v\n if isinstance_or_union(v, AnyInput):\n return Inputs(features=[v])\n else:\n return v\n\n @field_validator(\"outputs\", mode=\"before\")\n @classmethod\n def validate_outputs_list(cls, v):\n if isinstance(v, collections.abc.Sequence):\n return Outputs(features=v)\n if isinstance_or_union(v, AnyOutput):\n return Outputs(features=[v])\n else:\n return v\n\n @field_validator(\"constraints\", mode=\"before\")\n @classmethod\n def validate_constraints_list(cls, v):\n if isinstance(v, list):\n return Constraints(constraints=v)\n if isinstance_or_union(v, AnyConstraint):\n return Constraints(constraints=[v])\n else:\n return v\n\n @model_validator(mode=\"after\")\n def validate_unique_feature_keys(self):\n \"\"\"Validates if provided input and output feature keys are unique\n\n Args:\n v (Outputs): List of all output features of the domain.\n value (Dict[str, Inputs]): Dict containing a list of input features as single entry.\n\n Raises:\n ValueError: Feature keys are not unique.\n\n Returns:\n Outputs: Keeps output features as given.\n \"\"\"\n\n keys = self.outputs.get_keys() + self.inputs.get_keys()\n if len(set(keys)) != len(keys):\n raise ValueError(\"Feature keys are not unique\")\n return self\n\n @model_validator(mode=\"after\")\n def validate_constraints(self):\n \"\"\"Validate if all features included in the constraints are also defined as features for the domain.\n\n Args:\n v (List[Constraint]): List of constraints or empty if no constraints are defined\n values (List[Input]): List of input features of the domain\n\n Raises:\n ValueError: Feature key in constraint is unknown.\n\n Returns:\n List[Constraint]: List of constraints defined for the domain\n \"\"\"\n\n keys = self.inputs.get_keys()\n for c in self.constraints.get(\n [LinearConstraint, NChooseKConstraint, ProductConstraint]\n ):\n for f in c.features: # type: ignore\n if f not in keys:\n raise ValueError(f\"feature {f} in constraint unknown ({keys})\")\n return self\n\n @model_validator(mode=\"after\")\n def validate_linear_constraints_and_nchoosek(self):\n \"\"\"Validate if all features included in linear constraints are continuous ones.\n\n Args:\n v (List[Constraint]): List of constraints or empty if no constraints are defined\n values (List[Input]): List of input features of the domain\n\n 
Raises:\n ValueError: _description_\n\n\n Returns:\n List[Constraint]: List of constraints defined for the domain\n \"\"\"\n keys = self.inputs.get_keys(ContinuousInput)\n\n # check if non continuous input features appear in linear constraints\n for c in self.constraints.get(includes=[LinearConstraint, NChooseKConstraint]):\n for f in c.features: # type: ignore\n assert f in keys, f\"{f} must be continuous.\"\n return self\n\n # TODO: tidy this up\n def get_nchoosek_combinations(self, exhaustive: bool = False): # noqa: C901\n \"\"\"get all possible NChooseK combinations\n\n Args:\n exhaustive (bool, optional): if True all combinations are returned. Defaults to False.\n\n Returns:\n Tuple(used_features_list, unused_features_list): used_features_list is a list of lists containing features used in each NChooseK combination.\n unused_features_list is a list of lists containing features unused in each NChooseK combination.\n \"\"\"\n\n if len(self.constraints.get(NChooseKConstraint)) == 0:\n used_continuous_features = self.inputs.get_keys(ContinuousInput)\n return used_continuous_features, []\n\n used_features_list_all = []\n\n # loops through each NChooseK constraint\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n used_features_list = []\n\n if exhaustive:\n for n in range(con.min_count, con.max_count + 1):\n used_features_list.extend(itertools.combinations(con.features, n))\n\n if con.none_also_valid:\n used_features_list.append(())\n else:\n used_features_list.extend(\n itertools.combinations(con.features, con.max_count)\n )\n\n used_features_list_all.append(used_features_list)\n\n used_features_list_all = list(\n itertools.product(*used_features_list_all)\n ) # product between NChooseK constraints\n\n # format into a list of used features\n used_features_list_formatted = []\n for used_features_list in used_features_list_all:\n used_features_list_flattened = [\n item for sublist in used_features_list for item in sublist\n ]\n used_features_list_formatted.append(list(set(used_features_list_flattened)))\n\n # sort lists\n used_features_list_sorted = []\n for used_features in used_features_list_formatted:\n used_features_list_sorted.append(sorted(used_features))\n\n # drop duplicates\n used_features_list_no_dup = []\n for used_features in used_features_list_sorted:\n if used_features not in used_features_list_no_dup:\n used_features_list_no_dup.append(used_features)\n\n # print(f\"duplicates dropped: {len(used_features_list_sorted)-len(used_features_list_no_dup)}\")\n\n # remove combinations not fulfilling constraints\n used_features_list_final = []\n for combo in used_features_list_no_dup:\n fulfil_constraints = (\n []\n ) # list of bools tracking if constraints are fulfilled\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n count = 0 # count of features in combo that are in con.features\n for f in combo:\n if f in con.features:\n count += 1\n if count >= con.min_count and count <= con.max_count:\n fulfil_constraints.append(True)\n elif count == 0 and con.none_also_valid:\n fulfil_constraints.append(True)\n else:\n fulfil_constraints.append(False)\n if np.all(fulfil_constraints):\n used_features_list_final.append(combo)\n\n # print(f\"violators dropped: {len(used_features_list_no_dup)-len(used_features_list_final)}\")\n\n # features unused\n features_in_cc = []\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n 
features_in_cc.extend(con.features)\n features_in_cc = list(set(features_in_cc))\n features_in_cc.sort()\n unused_features_list = []\n for used_features in used_features_list_final:\n unused_features_list.append(\n [f_key for f_key in features_in_cc if f_key not in used_features]\n )\n\n # postprocess\n # used_features_list_final2 = []\n # unused_features_list2 = []\n # for used, unused in zip(used_features_list_final,unused_features_list):\n # if len(used) == 3:\n # used_features_list_final2.append(used), unused_features_list2.append(unused)\n\n return used_features_list_final, unused_features_list\n\n def coerce_invalids(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Coerces all invalid output measurements to np.nan\n\n Args:\n experiments (pd.DataFrame): Dataframe containing experimental data\n\n Returns:\n pd.DataFrame: coerced dataframe\n \"\"\"\n # coerce invalid to nan\n for feat in self.outputs.get_keys(Output):\n experiments.loc[experiments[f\"valid_{feat}\"] == 0, feat] = np.nan\n return experiments\n\n def aggregate_by_duplicates(\n self,\n experiments: pd.DataFrame,\n prec: int,\n delimiter: str = \"-\",\n method: Literal[\"mean\", \"median\"] = \"mean\",\n ) -> Tuple[pd.DataFrame, list]:\n \"\"\"Aggregate the dataframe by duplicate experiments\n\n Duplicates are identified based on the experiments with the same input features. Continuous input features\n are rounded before identifying the duplicates. Aggregation is performed by taking the average of the\n involved output features.\n\n Args:\n experiments (pd.DataFrame): Dataframe containing experimental data\n prec (int): Precision of the rounding of the continuous input features\n delimiter (str, optional): Delimiter used when combining the orig. labcodes to a new one. Defaults to \"-\".\n\n Returns:\n Tuple[pd.DataFrame, list]: Dataframe holding the aggregated experiments, list of lists holding the labcodes of the duplicates\n \"\"\"\n # prepare the parent frame\n if method not in [\"mean\", \"median\"]:\n raise ValueError(f\"Unknown aggregation type provided: {method}\")\n\n preprocessed = self.outputs.preprocess_experiments_any_valid_output(experiments)\n assert preprocessed is not None\n experiments = preprocessed.copy()\n if \"labcode\" not in experiments.columns:\n experiments[\"labcode\"] = [\n str(i + 1).zfill(int(np.ceil(np.log10(experiments.shape[0]))))\n for i in range(experiments.shape[0])\n ]\n\n # round it if continuous inputs are present\n if len(self.inputs.get(ContinuousInput)) > 0:\n experiments[self.inputs.get_keys(ContinuousInput)] = experiments[\n self.inputs.get_keys(ContinuousInput)\n ].round(prec)\n\n # coerce invalid to nan\n experiments = self.coerce_invalids(experiments)\n\n # group and aggregate\n agg: Dict[str, Any] = {\n feat: method for feat in self.outputs.get_keys(ContinuousOutput)\n }\n agg[\"labcode\"] = lambda x: delimiter.join(sorted(x.tolist()))\n for feat in self.outputs.get_keys(Output):\n agg[f\"valid_{feat}\"] = lambda x: 1\n\n grouped = experiments.groupby(self.inputs.get_keys(Input))\n duplicated_labcodes = [\n sorted(group.labcode.to_numpy().tolist())\n for _, group in grouped\n if group.shape[0] > 1\n ]\n\n experiments = grouped.aggregate(agg).reset_index(drop=False)\n for feat in self.outputs.get_keys(Output):\n experiments.loc[experiments[feat].isna(), f\"valid_{feat}\"] = 0\n\n experiments = experiments.sort_values(by=\"labcode\")\n experiments = experiments.reset_index(drop=True)\n return experiments, sorted(duplicated_labcodes)\n\n def validate_experiments(\n 
self,\n experiments: pd.DataFrame,\n strict: bool = False,\n ) -> pd.DataFrame:\n \"\"\"checks the experimental data on validity\n\n Args:\n experiments (pd.DataFrame): Dataframe with experimental data\n\n Raises:\n ValueError: empty dataframe\n ValueError: the column for a specific feature is missing the provided data\n ValueError: there are labcodes with null value\n ValueError: there are labcodes with nan value\n ValueError: labcodes are not unique\n ValueError: the provided columns do no match to the defined domain\n ValueError: the provided columns do no match to the defined domain\n ValueError: Input with null values\n ValueError: Input with nan values\n\n Returns:\n pd.DataFrame: The provided dataframe with experimental data\n \"\"\"\n\n if len(experiments) == 0:\n raise ValueError(\"no experiments provided (empty dataframe)\")\n # we allow here for a column named labcode used to identify experiments\n if \"labcode\" in experiments.columns:\n # test that labcodes are not na\n if experiments.labcode.isnull().to_numpy().any():\n raise ValueError(\"there are labcodes with null value\")\n if experiments.labcode.isna().to_numpy().any():\n raise ValueError(\"there are labcodes with nan value\")\n # test that labcodes are distinct\n if (\n len(set(experiments.labcode.to_numpy().tolist()))\n != experiments.shape[0]\n ):\n raise ValueError(\"labcodes are not unique\")\n # run the individual validators\n experiments = self.inputs.validate_experiments(\n experiments=experiments, strict=strict\n )\n experiments = self.outputs.validate_experiments(experiments=experiments)\n return experiments\n\n def describe_experiments(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Method to get a tabular overview of how many measurements and how many valid entries are included in the input data for each output feature\n\n Args:\n experiments (pd.DataFrame): Dataframe with experimental data\n\n Returns:\n pd.DataFrame: Dataframe with counts how many measurements and how many valid entries are included in the input data for each output feature\n \"\"\"\n data = {}\n for feat in self.outputs.get_keys(Output):\n data[feat] = [\n experiments.loc[experiments[feat].notna()].shape[0],\n experiments.loc[experiments[feat].notna(), \"valid_%s\" % feat].sum(),\n ]\n preprocessed = self.outputs.preprocess_experiments_all_valid_outputs(\n experiments\n )\n assert preprocessed is not None\n data[\"all\"] = [\n experiments.shape[0],\n preprocessed.shape[0],\n ]\n return pd.DataFrame.from_dict(\n data, orient=\"index\", columns=[\"measured\", \"valid\"]\n )\n\n def validate_candidates(\n self,\n candidates: pd.DataFrame,\n only_inputs: bool = False,\n tol: float = 1e-5,\n raise_validation_error: bool = True,\n ) -> pd.DataFrame:\n \"\"\"Method to check the validty of porposed candidates\n\n Args:\n candidates (pd.DataFrame): Dataframe with suggested new experiments (candidates)\n only_inputs (bool,optional): If True, only the input columns are validated. Defaults to False.\n tol (float,optional): tolerance parameter for constraints. A constraint is considered as not fulfilled if the violation\n is larger than tol. Defaults to 1e-6.\n raise_validation_error (bool, optional): If true an error will be raised if candidates violate constraints,\n otherwise only a warning will be displayed. 
Defaults to True.\n\n Raises:\n ValueError: when a column is missing for a defined input feature\n ValueError: when a column is missing for a defined output feature\n ValueError: when a non-numerical value is proposed\n ValueError: when an additional column is found\n ConstraintNotFulfilledError: when the constraints are not fulfilled and `raise_validation_error = True`\n\n Returns:\n pd.DataFrame: dataframe with suggested experiments (candidates)\n \"\"\"\n # check that each input feature has a col and is valid in itself\n assert isinstance(self.inputs, Inputs)\n candidates = self.inputs.validate_candidates(candidates)\n # check if all constraints are fulfilled\n if not self.constraints.is_fulfilled(candidates, tol=tol).all():\n if raise_validation_error:\n raise ConstraintNotFulfilledError(\n f\"Constraints not fulfilled: {candidates}\"\n )\n warnings.warn(\"Not all constraints are fulfilled.\")\n # for each continuous output feature with an attached objective object\n if not only_inputs:\n assert isinstance(self.outputs, Outputs)\n candidates = self.outputs.validate_candidates(candidates=candidates)\n return candidates\n\n @property\n def experiment_column_names(self):\n \"\"\"the columns in the experimental dataframe\n\n Returns:\n List[str]: List of columns in the experiment dataframe (output feature keys + valid_output feature keys)\n \"\"\"\n return (self.inputs + self.outputs).get_keys() + [\n f\"valid_{output_feature_key}\"\n for output_feature_key in self.outputs.get_keys(Output)\n ]\n\n @property\n def candidate_column_names(self):\n \"\"\"the columns in the candidate dataframe\n\n Returns:\n List[str]: List of columns in the candidate dataframe (input feature keys + input feature keys_pred, input feature keys_sd, input feature keys_des)\n \"\"\"\n assert isinstance(self.outputs, Outputs)\n return (\n self.inputs.get_keys(Input)\n + [\n f\"{output_feature_key}_pred\"\n for output_feature_key in self.outputs.get_keys_by_objective(Objective)\n ]\n + [\n f\"{output_feature_key}_sd\"\n for output_feature_key in self.outputs.get_keys_by_objective(Objective)\n ]\n + [\n f\"{output_feature_key}_des\"\n for output_feature_key in self.outputs.get_keys_by_objective(Objective)\n ]\n )\n
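A short construction sketch; as the field validators above show, plain lists are accepted and wrapped into the Inputs/Outputs/Constraints containers (data-model import paths assumed per a current BoFire release):

from bofire.data_models.constraints.api import LinearEqualityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput

domain = Domain(
    inputs=[ContinuousInput(key=f"x{i}", bounds=(0, 1)) for i in (1, 2)],
    outputs=[ContinuousOutput(key="y")],
    constraints=[
        LinearEqualityConstraint(
            features=["x1", "x2"], coefficients=[1.0, 1.0], rhs=1.0
        )
    ],
)
print(domain.experiment_column_names)  # ['x1', 'x2', 'y', 'valid_y']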
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.candidate_column_names","title":"candidate_column_names
property
readonly
","text":"the columns in the candidate dataframe
Returns:
Type DescriptionList[str]
List of columns in the candidate dataframe (input feature keys + output feature keys suffixed with _pred, _sd, and _des)
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.experiment_column_names","title":"experiment_column_names
property
readonly
","text":"the columns in the experimental dataframe
Returns:
Type DescriptionList[str]
List of columns in the experiment dataframe (output feature keys + valid_output feature keys)
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.aggregate_by_duplicates","title":"aggregate_by_duplicates(self, experiments, prec, delimiter='-', method='mean')
","text":"Aggregate the dataframe by duplicate experiments
Duplicates are identified as experiments with the same input features. Continuous input features are rounded before identifying the duplicates. Aggregation is performed by taking the mean or median (depending on the method argument) of the involved output features.
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
Dataframe containing experimental data
requiredprec
int
Precision of the rounding of the continuous input features
requireddelimiter
str
Delimiter used when combining the orig. labcodes to a new one. Defaults to \"-\".
'-'
Returns:
Type DescriptionTuple[pd.DataFrame, list]
Dataframe holding the aggregated experiments, list of lists holding the labcodes of the duplicates
Source code inbofire/data_models/domain/domain.py
def aggregate_by_duplicates(\n self,\n experiments: pd.DataFrame,\n prec: int,\n delimiter: str = \"-\",\n method: Literal[\"mean\", \"median\"] = \"mean\",\n) -> Tuple[pd.DataFrame, list]:\n \"\"\"Aggregate the dataframe by duplicate experiments\n\n Duplicates are identified based on the experiments with the same input features. Continuous input features\n are rounded before identifying the duplicates. Aggregation is performed by taking the average of the\n involved output features.\n\n Args:\n experiments (pd.DataFrame): Dataframe containing experimental data\n prec (int): Precision of the rounding of the continuous input features\n delimiter (str, optional): Delimiter used when combining the orig. labcodes to a new one. Defaults to \"-\".\n\n Returns:\n Tuple[pd.DataFrame, list]: Dataframe holding the aggregated experiments, list of lists holding the labcodes of the duplicates\n \"\"\"\n # prepare the parent frame\n if method not in [\"mean\", \"median\"]:\n raise ValueError(f\"Unknown aggregation type provided: {method}\")\n\n preprocessed = self.outputs.preprocess_experiments_any_valid_output(experiments)\n assert preprocessed is not None\n experiments = preprocessed.copy()\n if \"labcode\" not in experiments.columns:\n experiments[\"labcode\"] = [\n str(i + 1).zfill(int(np.ceil(np.log10(experiments.shape[0]))))\n for i in range(experiments.shape[0])\n ]\n\n # round it if continuous inputs are present\n if len(self.inputs.get(ContinuousInput)) > 0:\n experiments[self.inputs.get_keys(ContinuousInput)] = experiments[\n self.inputs.get_keys(ContinuousInput)\n ].round(prec)\n\n # coerce invalid to nan\n experiments = self.coerce_invalids(experiments)\n\n # group and aggregate\n agg: Dict[str, Any] = {\n feat: method for feat in self.outputs.get_keys(ContinuousOutput)\n }\n agg[\"labcode\"] = lambda x: delimiter.join(sorted(x.tolist()))\n for feat in self.outputs.get_keys(Output):\n agg[f\"valid_{feat}\"] = lambda x: 1\n\n grouped = experiments.groupby(self.inputs.get_keys(Input))\n duplicated_labcodes = [\n sorted(group.labcode.to_numpy().tolist())\n for _, group in grouped\n if group.shape[0] > 1\n ]\n\n experiments = grouped.aggregate(agg).reset_index(drop=False)\n for feat in self.outputs.get_keys(Output):\n experiments.loc[experiments[feat].isna(), f\"valid_{feat}\"] = 0\n\n experiments = experiments.sort_values(by=\"labcode\")\n experiments = experiments.reset_index(drop=True)\n return experiments, sorted(duplicated_labcodes)\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.coerce_invalids","title":"coerce_invalids(self, experiments)
","text":"Coerces all invalid output measurements to np.nan
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
Dataframe containing experimental data
requiredReturns:
Type Descriptionpd.DataFrame
coerced dataframe
Source code inbofire/data_models/domain/domain.py
def coerce_invalids(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Coerces all invalid output measurements to np.nan\n\n Args:\n experiments (pd.DataFrame): Dataframe containing experimental data\n\n Returns:\n pd.DataFrame: coerced dataframe\n \"\"\"\n # coerce invalid to nan\n for feat in self.outputs.get_keys(Output):\n experiments.loc[experiments[f\"valid_{feat}\"] == 0, feat] = np.nan\n return experiments\n
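A small sketch of the coercion, grounded in the method above (data-model import paths assumed per a current BoFire release):

import pandas as pd
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput

domain = Domain.from_lists(
    inputs=[ContinuousInput(key="x1", bounds=(0, 1))],
    outputs=[ContinuousOutput(key="y")],
)
experiments = pd.DataFrame(
    {"x1": [0.1, 0.2], "y": [1.0, 2.0], "valid_y": [1, 0]}
)
experiments = domain.coerce_invalids(experiments)
print(experiments["y"].tolist())  # [1.0, nan]; the invalid measurement is masked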
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.describe_experiments","title":"describe_experiments(self, experiments)
","text":"Method to get a tabular overview of how many measurements and how many valid entries are included in the input data for each output feature
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
Dataframe with experimental data
requiredReturns:
Type Descriptionpd.DataFrame
Dataframe with counts of how many measurements and how many valid entries are included in the input data for each output feature
Source code inbofire/data_models/domain/domain.py
def describe_experiments(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Method to get a tabular overview of how many measurements and how many valid entries are included in the input data for each output feature\n\n Args:\n experiments (pd.DataFrame): Dataframe with experimental data\n\n Returns:\n pd.DataFrame: Dataframe with counts how many measurements and how many valid entries are included in the input data for each output feature\n \"\"\"\n data = {}\n for feat in self.outputs.get_keys(Output):\n data[feat] = [\n experiments.loc[experiments[feat].notna()].shape[0],\n experiments.loc[experiments[feat].notna(), \"valid_%s\" % feat].sum(),\n ]\n preprocessed = self.outputs.preprocess_experiments_all_valid_outputs(\n experiments\n )\n assert preprocessed is not None\n data[\"all\"] = [\n experiments.shape[0],\n preprocessed.shape[0],\n ]\n return pd.DataFrame.from_dict(\n data, orient=\"index\", columns=[\"measured\", \"valid\"]\n )\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.get_nchoosek_combinations","title":"get_nchoosek_combinations(self, exhaustive=False)
","text":"get all possible NChooseK combinations
Parameters:
Name Type Description Defaultexhaustive
bool
if True all combinations are returned. Defaults to False.
False
Returns:
Type DescriptionTuple(used_features_list, unused_features_list)
used_features_list is a list of lists containing features used in each NChooseK combination. unused_features_list is a list of lists containing features unused in each NChooseK combination.
Source code inbofire/data_models/domain/domain.py
def get_nchoosek_combinations(self, exhaustive: bool = False): # noqa: C901\n \"\"\"get all possible NChooseK combinations\n\n Args:\n exhaustive (bool, optional): if True all combinations are returned. Defaults to False.\n\n Returns:\n Tuple(used_features_list, unused_features_list): used_features_list is a list of lists containing features used in each NChooseK combination.\n unused_features_list is a list of lists containing features unused in each NChooseK combination.\n \"\"\"\n\n if len(self.constraints.get(NChooseKConstraint)) == 0:\n used_continuous_features = self.inputs.get_keys(ContinuousInput)\n return used_continuous_features, []\n\n used_features_list_all = []\n\n # loops through each NChooseK constraint\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n used_features_list = []\n\n if exhaustive:\n for n in range(con.min_count, con.max_count + 1):\n used_features_list.extend(itertools.combinations(con.features, n))\n\n if con.none_also_valid:\n used_features_list.append(())\n else:\n used_features_list.extend(\n itertools.combinations(con.features, con.max_count)\n )\n\n used_features_list_all.append(used_features_list)\n\n used_features_list_all = list(\n itertools.product(*used_features_list_all)\n ) # product between NChooseK constraints\n\n # format into a list of used features\n used_features_list_formatted = []\n for used_features_list in used_features_list_all:\n used_features_list_flattened = [\n item for sublist in used_features_list for item in sublist\n ]\n used_features_list_formatted.append(list(set(used_features_list_flattened)))\n\n # sort lists\n used_features_list_sorted = []\n for used_features in used_features_list_formatted:\n used_features_list_sorted.append(sorted(used_features))\n\n # drop duplicates\n used_features_list_no_dup = []\n for used_features in used_features_list_sorted:\n if used_features not in used_features_list_no_dup:\n used_features_list_no_dup.append(used_features)\n\n # print(f\"duplicates dropped: {len(used_features_list_sorted)-len(used_features_list_no_dup)}\")\n\n # remove combinations not fulfilling constraints\n used_features_list_final = []\n for combo in used_features_list_no_dup:\n fulfil_constraints = (\n []\n ) # list of bools tracking if constraints are fulfilled\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n count = 0 # count of features in combo that are in con.features\n for f in combo:\n if f in con.features:\n count += 1\n if count >= con.min_count and count <= con.max_count:\n fulfil_constraints.append(True)\n elif count == 0 and con.none_also_valid:\n fulfil_constraints.append(True)\n else:\n fulfil_constraints.append(False)\n if np.all(fulfil_constraints):\n used_features_list_final.append(combo)\n\n # print(f\"violators dropped: {len(used_features_list_no_dup)-len(used_features_list_final)}\")\n\n # features unused\n features_in_cc = []\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n features_in_cc.extend(con.features)\n features_in_cc = list(set(features_in_cc))\n features_in_cc.sort()\n unused_features_list = []\n for used_features in used_features_list_final:\n unused_features_list.append(\n [f_key for f_key in features_in_cc if f_key not in used_features]\n )\n\n # postprocess\n # used_features_list_final2 = []\n # unused_features_list2 = []\n # for used, unused in zip(used_features_list_final,unused_features_list):\n # if len(used) == 3:\n # 
used_features_list_final2.append(used), unused_features_list2.append(unused)\n\n return used_features_list_final, unused_features_list\n
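A sketch of the exhaustive enumeration (data-model import paths assumed per a current BoFire release):

from bofire.data_models.constraints.api import NChooseKConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput

domain = Domain.from_lists(
    inputs=[ContinuousInput(key=f"x{i}", bounds=(0, 1)) for i in (1, 2, 3)],
    outputs=[ContinuousOutput(key="y")],
    constraints=[
        NChooseKConstraint(
            features=["x1", "x2", "x3"],
            min_count=1,
            max_count=2,
            none_also_valid=False,
        )
    ],
)

used, unused = domain.get_nchoosek_combinations(exhaustive=True)
# all singletons and pairs of {x1, x2, x3} are valid combinations, e.g.
# used contains ['x1'], ['x2'], ['x3'], ['x1', 'x2'], ['x1', 'x3'], ['x2', 'x3'];
# unused holds the complementary feature lists
print(used, unused)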
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.validate_candidates","title":"validate_candidates(self, candidates, only_inputs=False, tol=1e-05, raise_validation_error=True)
","text":"Method to check the validty of porposed candidates
Parameters:
Name Type Description Defaultcandidates
pd.DataFrame
Dataframe with suggested new experiments (candidates)
requiredonly_inputs
bool,optional
If True, only the input columns are validated. Defaults to False.
False
tol
float,optional
tolerance parameter for constraints. A constraint is considered as not fulfilled if the violation is larger than tol. Defaults to 1e-6.
1e-05
raise_validation_error
bool
If true an error will be raised if candidates violate constraints, otherwise only a warning will be displayed. Defaults to True.
True
Exceptions:
Type DescriptionValueError
when a column is missing for a defined input feature
ValueError
when a column is missing for a defined output feature
ValueError
when a non-numerical value is proposed
ValueError
when an additional column is found
ConstraintNotFulfilledError
when the constraints are not fulfilled and raise_validation_error = True
Returns:
Type Descriptionpd.DataFrame
dataframe with suggested experiments (candidates)
Source code inbofire/data_models/domain/domain.py
def validate_candidates(\n self,\n candidates: pd.DataFrame,\n only_inputs: bool = False,\n tol: float = 1e-5,\n raise_validation_error: bool = True,\n) -> pd.DataFrame:\n \"\"\"Method to check the validty of porposed candidates\n\n Args:\n candidates (pd.DataFrame): Dataframe with suggested new experiments (candidates)\n only_inputs (bool,optional): If True, only the input columns are validated. Defaults to False.\n tol (float,optional): tolerance parameter for constraints. A constraint is considered as not fulfilled if the violation\n is larger than tol. Defaults to 1e-6.\n raise_validation_error (bool, optional): If true an error will be raised if candidates violate constraints,\n otherwise only a warning will be displayed. Defaults to True.\n\n Raises:\n ValueError: when a column is missing for a defined input feature\n ValueError: when a column is missing for a defined output feature\n ValueError: when a non-numerical value is proposed\n ValueError: when an additional column is found\n ConstraintNotFulfilledError: when the constraints are not fulfilled and `raise_validation_error = True`\n\n Returns:\n pd.DataFrame: dataframe with suggested experiments (candidates)\n \"\"\"\n # check that each input feature has a col and is valid in itself\n assert isinstance(self.inputs, Inputs)\n candidates = self.inputs.validate_candidates(candidates)\n # check if all constraints are fulfilled\n if not self.constraints.is_fulfilled(candidates, tol=tol).all():\n if raise_validation_error:\n raise ConstraintNotFulfilledError(\n f\"Constraints not fulfilled: {candidates}\"\n )\n warnings.warn(\"Not all constraints are fulfilled.\")\n # for each continuous output feature with an attached objective object\n if not only_inputs:\n assert isinstance(self.outputs, Outputs)\n candidates = self.outputs.validate_candidates(candidates=candidates)\n return candidates\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.validate_constraints","title":"validate_constraints(self)
","text":"Validate if all features included in the constraints are also defined as features for the domain.
Parameters:
v (List[Constraint]): List of constraints or empty if no constraints are defined. Required.
values (List[Input]): List of input features of the domain. Required.
Exceptions:
ValueError: Feature key in constraint is unknown.
Returns:
List[Constraint]: List of constraints defined for the domain
Source code in bofire/data_models/domain/domain.py
@model_validator(mode=\"after\")\ndef validate_constraints(self):\n \"\"\"Validate if all features included in the constraints are also defined as features for the domain.\n\n Args:\n v (List[Constraint]): List of constraints or empty if no constraints are defined\n values (List[Input]): List of input features of the domain\n\n Raises:\n ValueError: Feature key in constraint is unknown.\n\n Returns:\n List[Constraint]: List of constraints defined for the domain\n \"\"\"\n\n keys = self.inputs.get_keys()\n for c in self.constraints.get(\n [LinearConstraint, NChooseKConstraint, ProductConstraint]\n ):\n for f in c.features: # type: ignore\n if f not in keys:\n raise ValueError(f\"feature {f} in constraint unknown ({keys})\")\n return self\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.validate_experiments","title":"validate_experiments(self, experiments, strict=False)
","text":"checks the experimental data on validity
Parameters:
experiments (pd.DataFrame): Dataframe with experimental data. Required.
strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.
Exceptions:
ValueError: empty dataframe
ValueError: the column for a specific feature is missing in the provided data
ValueError: there are labcodes with null value
ValueError: there are labcodes with nan value
ValueError: labcodes are not unique
ValueError: the provided columns do not match the defined domain
ValueError: Input with null values
ValueError: Input with nan values
Returns:
pd.DataFrame: The provided dataframe with experimental data
Source code in bofire/data_models/domain/domain.py
def validate_experiments(\n    self,\n    experiments: pd.DataFrame,\n    strict: bool = False,\n) -> pd.DataFrame:\n    \"\"\"Checks the experimental data for validity\n\n    Args:\n        experiments (pd.DataFrame): Dataframe with experimental data\n        strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.\n\n    Raises:\n        ValueError: empty dataframe\n        ValueError: the column for a specific feature is missing in the provided data\n        ValueError: there are labcodes with null value\n        ValueError: there are labcodes with nan value\n        ValueError: labcodes are not unique\n        ValueError: the provided columns do not match the defined domain\n        ValueError: Input with null values\n        ValueError: Input with nan values\n\n    Returns:\n        pd.DataFrame: The provided dataframe with experimental data\n    \"\"\"\n\n    if len(experiments) == 0:\n        raise ValueError(\"no experiments provided (empty dataframe)\")\n    # we allow here for a column named labcode used to identify experiments\n    if \"labcode\" in experiments.columns:\n        # test that labcodes are not na\n        if experiments.labcode.isnull().to_numpy().any():\n            raise ValueError(\"there are labcodes with null value\")\n        if experiments.labcode.isna().to_numpy().any():\n            raise ValueError(\"there are labcodes with nan value\")\n        # test that labcodes are distinct\n        if (\n            len(set(experiments.labcode.to_numpy().tolist()))\n            != experiments.shape[0]\n        ):\n            raise ValueError(\"labcodes are not unique\")\n    # run the individual validators\n    experiments = self.inputs.validate_experiments(\n        experiments=experiments, strict=strict\n    )\n    experiments = self.outputs.validate_experiments(experiments=experiments)\n    return experiments\n
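A minimal usage sketch (import paths assumed, not taken from the listing above); the optional labcode column must hold unique, non-null identifiers before the input and output validators run:

```python
# Sketch only: import paths are assumed.
import pandas as pd

from bofire.data_models.domain.api import Domain, Inputs, Outputs
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput

domain = Domain(
    inputs=Inputs(features=[ContinuousInput(key="x1", bounds=(0, 1))]),
    outputs=Outputs(features=[ContinuousOutput(key="y")]),
)

experiments = pd.DataFrame(
    {"labcode": ["exp-1", "exp-2"], "x1": [0.1, 0.7], "y": [3.2, 5.9]}
)
# Returns the validated dataframe; raises ValueError on missing columns,
# non-unique labcodes, or invalid feature values.
validated = domain.validate_experiments(experiments)
```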
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.validate_linear_constraints_and_nchoosek","title":"validate_linear_constraints_and_nchoosek(self)
","text":"Validate if all features included in linear constraints are continuous ones.
Parameters:
v (List[Constraint]): List of constraints or empty if no constraints are defined. Required.
values (List[Input]): List of input features of the domain. Required.
Exceptions:
ValueError: when a feature in a linear or NChooseK constraint is not a continuous input
Returns:
List[Constraint]: List of constraints defined for the domain
Source code in bofire/data_models/domain/domain.py
@model_validator(mode=\"after\")\ndef validate_linear_constraints_and_nchoosek(self):\n \"\"\"Validate if all features included in linear constraints are continuous ones.\n\n Args:\n v (List[Constraint]): List of constraints or empty if no constraints are defined\n values (List[Input]): List of input features of the domain\n\n Raises:\n ValueError: _description_\n\n\n Returns:\n List[Constraint]: List of constraints defined for the domain\n \"\"\"\n keys = self.inputs.get_keys(ContinuousInput)\n\n # check if non continuous input features appear in linear constraints\n for c in self.constraints.get(includes=[LinearConstraint, NChooseKConstraint]):\n for f in c.features: # type: ignore\n assert f in keys, f\"{f} must be continuous.\"\n return self\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.validate_unique_feature_keys","title":"validate_unique_feature_keys(self)
","text":"Validates if provided input and output feature keys are unique
Parameters:
v (Outputs): List of all output features of the domain. Required.
value (Dict[str, Inputs]): Dict containing a list of input features as single entry. Required.
Exceptions:
ValueError: Feature keys are not unique.
Returns:
Outputs: Keeps output features as given.
Source code in bofire/data_models/domain/domain.py
@model_validator(mode=\"after\")\ndef validate_unique_feature_keys(self):\n \"\"\"Validates if provided input and output feature keys are unique\n\n Args:\n v (Outputs): List of all output features of the domain.\n value (Dict[str, Inputs]): Dict containing a list of input features as single entry.\n\n Raises:\n ValueError: Feature keys are not unique.\n\n Returns:\n Outputs: Keeps output features as given.\n \"\"\"\n\n keys = self.outputs.get_keys() + self.inputs.get_keys()\n if len(set(keys)) != len(keys):\n raise ValueError(\"Feature keys are not unique\")\n return self\n
"},{"location":"ref-features/","title":"Domain","text":""},{"location":"ref-features/#bofire.data_models.features.categorical","title":"categorical
","text":""},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput","title":" CategoricalInput (Input)
","text":"Base class for all categorical input features.
Attributes:
categories (List[str]): Names of the categories.
allowed (List[bool]): List of bools indicating if a category is allowed within the optimization.
Source code in bofire/data_models/features/categorical.py
class CategoricalInput(Input):\n \"\"\"Base class for all categorical input features.\n\n Attributes:\n categories (List[str]): Names of the categories.\n allowed (List[bool]): List of bools indicating if a category is allowed within the optimization.\n \"\"\"\n\n type: Literal[\"CategoricalInput\"] = \"CategoricalInput\"\n # order_id: ClassVar[int] = 5\n order_id: ClassVar[int] = 7\n\n categories: CategoryVals\n allowed: Optional[Annotated[List[bool], Field(min_length=2)]] = Field(\n default=None, validate_default=True\n )\n\n @field_validator(\"allowed\")\n @classmethod\n def generate_allowed(cls, allowed, info):\n \"\"\"Generates the list of allowed categories if not provided.\"\"\"\n if allowed is None and \"categories\" in info.data.keys():\n return [True for _ in range(len(info.data[\"categories\"]))]\n return allowed\n\n @model_validator(mode=\"after\")\n def validate_categories_fitting_allowed(self):\n if len(self.allowed) != len(self.categories): # type: ignore\n raise ValueError(\"allowed must have same length as categories\")\n if sum(self.allowed) == 0: # type: ignore\n raise ValueError(\"no category is allowed\")\n return self\n\n @staticmethod\n def valid_transform_types() -> List[CategoricalEncodingEnum]:\n return [\n CategoricalEncodingEnum.ONE_HOT,\n CategoricalEncodingEnum.DUMMY,\n CategoricalEncodingEnum.ORDINAL,\n ]\n\n def is_fixed(self) -> bool:\n \"\"\"Returns True if there is only one allowed category.\n\n Returns:\n [bool]: True if there is only one allowed category\n \"\"\"\n if self.allowed is None:\n return False\n return sum(self.allowed) == 1\n\n def fixed_value(\n self, transform_type: Optional[TTransform] = None\n ) -> Union[List[str], List[float], None]:\n \"\"\"Returns the categories to which the feature is fixed, None if the feature is not fixed\n\n Returns:\n List[str]: List of categories or None\n \"\"\"\n if self.is_fixed():\n val = self.get_allowed_categories()[0]\n if transform_type is None:\n return [val]\n elif transform_type == CategoricalEncodingEnum.ONE_HOT:\n return self.to_onehot_encoding(pd.Series([val])).values[0].tolist()\n elif transform_type == CategoricalEncodingEnum.DUMMY:\n return self.to_dummy_encoding(pd.Series([val])).values[0].tolist()\n elif transform_type == CategoricalEncodingEnum.ORDINAL:\n return self.to_ordinal_encoding(pd.Series([val])).tolist()\n else:\n raise ValueError(\n f\"Unkwon transform type {transform_type} for categorical input {self.key}\"\n )\n else:\n return None\n\n def get_allowed_categories(self):\n \"\"\"Returns the allowed categories.\n\n Returns:\n list of str: The allowed categories\n \"\"\"\n if self.allowed is None:\n return []\n return [c for c, a in zip(self.categories, self.allowed) if a]\n\n def validate_experimental(\n self, values: pd.Series, strict: bool = False\n ) -> pd.Series:\n \"\"\"Method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict (bool, optional): Boolean to distinguish if the occurence of fixed features in the dataset should be considered or not. 
Defaults to False.\n\n Raises:\n ValueError: when an entry is not in the list of allowed categories\n ValueError: when there is no variation in a feature provided by the experimental data\n\n Returns:\n pd.Series: A dataFrame with experiments\n \"\"\"\n values = values.map(str)\n if sum(values.isin(self.categories)) != len(values):\n raise ValueError(\n f\"invalid values for `{self.key}`, allowed are: `{self.categories}`\"\n )\n if strict:\n possible_categories = self.get_possible_categories(values)\n if len(possible_categories) != len(self.categories):\n raise ValueError(\n f\"Categories {list(set(self.categories)-set(possible_categories))} of feature {self.key} not used. Remove them.\"\n )\n return values\n\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the suggested candidates\n\n Args:\n values (pd.Series): A dataFrame with candidates\n\n Raises:\n ValueError: when not all values for a feature are one of the allowed categories\n\n Returns:\n pd.Series: The passed dataFrame with candidates\n \"\"\"\n values = values.map(str)\n if sum(values.isin(self.get_allowed_categories())) != len(values):\n raise ValueError(\n f\"not all values of input feature `{self.key}` are a valid allowed category from {self.get_allowed_categories()}\"\n )\n return values\n\n def get_forbidden_categories(self):\n \"\"\"Returns the non-allowed categories\n\n Returns:\n List[str]: List of the non-allowed categories\n \"\"\"\n return list(set(self.categories) - set(self.get_allowed_categories()))\n\n def get_possible_categories(self, values: pd.Series) -> list:\n \"\"\"Return the superset of categories that have been used in the experimental dataset and\n that can be used in the optimization\n\n Args:\n values (pd.Series): Series with the values for this feature\n\n Returns:\n list: list of possible categories\n \"\"\"\n return sorted(set(list(set(values.tolist())) + self.get_allowed_categories()))\n\n def to_onehot_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to a one-hot encoding.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.DataFrame: One-hot transformed data frame.\n \"\"\"\n return pd.DataFrame(\n {get_encoded_name(self.key, c): values == c for c in self.categories},\n dtype=float,\n index=values.index,\n )\n\n def from_onehot_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Converts values back from one-hot encoding.\n\n Args:\n values (pd.DataFrame): One-hot encoded values.\n\n Raises:\n ValueError: If one-hot columns not present in `values`.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, c) for c in self.categories]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = values[cat_cols].idxmax(1).str[(len(self.key) + 1) :]\n s.name = self.key\n return s\n\n def to_dummy_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to a dummy-hot encoding, dropping the first categorical level.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.DataFrame: Dummy-hot transformed data frame.\n \"\"\"\n return pd.DataFrame(\n {get_encoded_name(self.key, c): values == c for c in self.categories[1:]},\n dtype=float,\n index=values.index,\n )\n\n def 
from_dummy_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Convert points back from dummy encoding.\n\n Args:\n values (pd.DataFrame): Dummy-hot encoded values.\n\n Raises:\n ValueError: If one-hot columns not present in `values`.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, c) for c in self.categories]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols[1:]]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols[1:]}.\"\n )\n values = values.copy()\n values[cat_cols[0]] = 1 - values[cat_cols[1:]].sum(axis=1)\n s = values[cat_cols].idxmax(1).str[(len(self.key) + 1) :]\n s.name = self.key\n return s\n\n def to_ordinal_encoding(self, values: pd.Series) -> pd.Series:\n \"\"\"Converts values to an ordinal integer based encoding.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.Series: Ordinal encoded values.\n \"\"\"\n enc = pd.Series(range(len(self.categories)), index=list(self.categories))\n s = enc[values]\n s.index = values.index\n s.name = self.key\n return s\n\n def from_ordinal_encoding(self, values: pd.Series) -> pd.Series:\n \"\"\"Convertes values back from ordinal encoding.\n\n Args:\n values (pd.Series): Ordinal encoded series.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n enc = np.array(self.categories)\n return pd.Series(enc[values], index=values.index, name=self.key)\n\n def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key,\n data=np.random.default_rng(seed=seed).choice(\n self.get_allowed_categories(), n\n ),\n )\n\n def get_bounds(\n self,\n transform_type: TTransform,\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n ) -> Tuple[List[float], List[float]]:\n assert isinstance(transform_type, CategoricalEncodingEnum)\n if transform_type == CategoricalEncodingEnum.ORDINAL:\n return [0], [len(self.categories) - 1]\n if transform_type == CategoricalEncodingEnum.ONE_HOT:\n # in the case that values are None, we return the bounds\n # based on the optimization bounds, else we return the true\n # bounds as this is for model fitting.\n if values is None:\n lower = [0.0 for _ in self.categories]\n upper = [\n 1.0 if self.allowed[i] is True else 0.0 # type: ignore\n for i, _ in enumerate(self.categories)\n ]\n else:\n lower = [0.0 for _ in self.categories]\n upper = [1.0 for _ in self.categories]\n return lower, upper\n if transform_type == CategoricalEncodingEnum.DUMMY:\n lower = [0.0 for _ in range(len(self.categories) - 1)]\n upper = [1.0 for _ in range(len(self.categories) - 1)]\n return lower, upper\n if transform_type == CategoricalEncodingEnum.DESCRIPTOR:\n raise ValueError(\n f\"Invalid descriptor transform for categorical {self.key}.\"\n )\n else:\n raise ValueError(\n f\"Invalid transform_type {transform_type} provided for categorical {self.key}.\"\n )\n\n def __str__(self) -> str:\n \"\"\"Returns the number of categories as str\n\n Returns:\n str: Number of categories\n \"\"\"\n return f\"{len(self.categories)} categories\"\n
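A minimal sketch of working with the class above (the import path is an assumption): a categorical input where one category is excluded from the optimization.

```python
# Sketch only: import path is assumed.
from bofire.data_models.features.api import CategoricalInput

solvent = CategoricalInput(
    key="solvent",
    categories=["water", "ethanol", "thf"],
    allowed=[True, True, False],
)
solvent.get_allowed_categories()    # ["water", "ethanol"]
solvent.get_forbidden_categories()  # ["thf"]
solvent.is_fixed()                  # False, two categories are still allowed
solvent.sample(3, seed=42)          # pd.Series drawn from the allowed categories
```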
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.__str__","title":"__str__(self)
special
","text":"Returns the number of categories as str
Returns:
str: Number of categories
Source code in bofire/data_models/features/categorical.py
def __str__(self) -> str:\n \"\"\"Returns the number of categories as str\n\n Returns:\n str: Number of categories\n \"\"\"\n return f\"{len(self.categories)} categories\"\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.fixed_value","title":"fixed_value(self, transform_type=None)
","text":"Returns the categories to which the feature is fixed, None if the feature is not fixed
Returns:
List[str]: List of categories or None
Source code in bofire/data_models/features/categorical.py
def fixed_value(\n    self, transform_type: Optional[TTransform] = None\n) -> Union[List[str], List[float], None]:\n    \"\"\"Returns the categories to which the feature is fixed, None if the feature is not fixed\n\n    Returns:\n        List[str]: List of categories or None\n    \"\"\"\n    if self.is_fixed():\n        val = self.get_allowed_categories()[0]\n        if transform_type is None:\n            return [val]\n        elif transform_type == CategoricalEncodingEnum.ONE_HOT:\n            return self.to_onehot_encoding(pd.Series([val])).values[0].tolist()\n        elif transform_type == CategoricalEncodingEnum.DUMMY:\n            return self.to_dummy_encoding(pd.Series([val])).values[0].tolist()\n        elif transform_type == CategoricalEncodingEnum.ORDINAL:\n            return self.to_ordinal_encoding(pd.Series([val])).tolist()\n        else:\n            raise ValueError(\n                f\"Unknown transform type {transform_type} for categorical input {self.key}\"\n            )\n    else:\n        return None\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.from_dummy_encoding","title":"from_dummy_encoding(self, values)
","text":"Convert points back from dummy encoding.
Parameters:
values (pd.DataFrame): Dummy-hot encoded values. Required.
Exceptions:
ValueError: If one-hot columns not present in values.
Returns:
pd.Series: Series with categorical values.
Source code in bofire/data_models/features/categorical.py
def from_dummy_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Convert points back from dummy encoding.\n\n Args:\n values (pd.DataFrame): Dummy-hot encoded values.\n\n Raises:\n ValueError: If one-hot columns not present in `values`.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, c) for c in self.categories]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols[1:]]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols[1:]}.\"\n )\n values = values.copy()\n values[cat_cols[0]] = 1 - values[cat_cols[1:]].sum(axis=1)\n s = values[cat_cols].idxmax(1).str[(len(self.key) + 1) :]\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.from_onehot_encoding","title":"from_onehot_encoding(self, values)
","text":"Converts values back from one-hot encoding.
Parameters:
values (pd.DataFrame): One-hot encoded values. Required.
Exceptions:
ValueError: If one-hot columns not present in values.
Returns:
pd.Series: Series with categorical values.
Source code in bofire/data_models/features/categorical.py
def from_onehot_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Converts values back from one-hot encoding.\n\n Args:\n values (pd.DataFrame): One-hot encoded values.\n\n Raises:\n ValueError: If one-hot columns not present in `values`.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, c) for c in self.categories]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = values[cat_cols].idxmax(1).str[(len(self.key) + 1) :]\n s.name = self.key\n return s\n
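A minimal round-trip sketch (import path assumed); judging from the decoding above, encoded columns are named "<key>_<category>", e.g. "solvent_water":

```python
# Sketch only: import path and column-name pattern are assumed.
import pandas as pd

from bofire.data_models.features.api import CategoricalInput

solvent = CategoricalInput(key="solvent", categories=["water", "ethanol"])
encoded = solvent.to_onehot_encoding(pd.Series(["water", "ethanol", "water"]))
decoded = solvent.from_onehot_encoding(encoded)
# `decoded` equals the original series and carries the feature key as its name.
```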
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.from_ordinal_encoding","title":"from_ordinal_encoding(self, values)
","text":"Convertes values back from ordinal encoding.
Parameters:
values (pd.Series): Ordinal encoded series. Required.
Returns:
pd.Series: Series with categorical values.
Source code in bofire/data_models/features/categorical.py
def from_ordinal_encoding(self, values: pd.Series) -> pd.Series:\n    \"\"\"Converts values back from ordinal encoding.\n\n    Args:\n        values (pd.Series): Ordinal encoded series.\n\n    Returns:\n        pd.Series: Series with categorical values.\n    \"\"\"\n    enc = np.array(self.categories)\n    return pd.Series(enc[values], index=values.index, name=self.key)\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.generate_allowed","title":"generate_allowed(allowed, info)
classmethod
","text":"Generates the list of allowed categories if not provided.
Source code in bofire/data_models/features/categorical.py
@field_validator(\"allowed\")\n@classmethod\ndef generate_allowed(cls, allowed, info):\n \"\"\"Generates the list of allowed categories if not provided.\"\"\"\n if allowed is None and \"categories\" in info.data.keys():\n return [True for _ in range(len(info.data[\"categories\"]))]\n return allowed\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.get_allowed_categories","title":"get_allowed_categories(self)
","text":"Returns the allowed categories.
Returns:
list of str: The allowed categories
Source code in bofire/data_models/features/categorical.py
def get_allowed_categories(self):\n \"\"\"Returns the allowed categories.\n\n Returns:\n list of str: The allowed categories\n \"\"\"\n if self.allowed is None:\n return []\n return [c for c, a in zip(self.categories, self.allowed) if a]\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.get_bounds","title":"get_bounds(self, transform_type, values=None, reference_value=None)
","text":"Returns the bounds of an input feature depending on the requested transform type.
Parameters:
transform_type (TTransform): The requested transform type. Required.
values (Optional[pd.Series]): If values are provided the bounds are returned taking the most extreme values for the feature into account. Defaults to None.
reference_value (Optional[str]): If a reference value is provided, then the local bounds based on a local search region are provided. Currently only supported for continuous inputs. For more details, see https://www.merl.com/publications/docs/TR2023-057.pdf. Defaults to None.
Returns:
Tuple[List[float], List[float]]: List of lower bound values, list of upper bound values.
Source code in bofire/data_models/features/categorical.py
def get_bounds(\n self,\n transform_type: TTransform,\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n) -> Tuple[List[float], List[float]]:\n assert isinstance(transform_type, CategoricalEncodingEnum)\n if transform_type == CategoricalEncodingEnum.ORDINAL:\n return [0], [len(self.categories) - 1]\n if transform_type == CategoricalEncodingEnum.ONE_HOT:\n # in the case that values are None, we return the bounds\n # based on the optimization bounds, else we return the true\n # bounds as this is for model fitting.\n if values is None:\n lower = [0.0 for _ in self.categories]\n upper = [\n 1.0 if self.allowed[i] is True else 0.0 # type: ignore\n for i, _ in enumerate(self.categories)\n ]\n else:\n lower = [0.0 for _ in self.categories]\n upper = [1.0 for _ in self.categories]\n return lower, upper\n if transform_type == CategoricalEncodingEnum.DUMMY:\n lower = [0.0 for _ in range(len(self.categories) - 1)]\n upper = [1.0 for _ in range(len(self.categories) - 1)]\n return lower, upper\n if transform_type == CategoricalEncodingEnum.DESCRIPTOR:\n raise ValueError(\n f\"Invalid descriptor transform for categorical {self.key}.\"\n )\n else:\n raise ValueError(\n f\"Invalid transform_type {transform_type} provided for categorical {self.key}.\"\n )\n
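A minimal sketch of the distinction implemented above (import paths assumed): for one-hot encodings the optimization bounds (values=None) close forbidden categories via an upper bound of 0.0, while passing values yields the full [0, 1] fitting bounds.

```python
# Sketch only: import paths are assumed.
from bofire.data_models.enum import CategoricalEncodingEnum
from bofire.data_models.features.api import CategoricalInput

solvent = CategoricalInput(
    key="solvent",
    categories=["water", "ethanol", "thf"],
    allowed=[True, True, False],
)
solvent.get_bounds(CategoricalEncodingEnum.ONE_HOT)
# -> ([0.0, 0.0, 0.0], [1.0, 1.0, 0.0]), i.e. "thf" is fixed to zero
```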
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.get_forbidden_categories","title":"get_forbidden_categories(self)
","text":"Returns the non-allowed categories
Returns:
List[str]: List of the non-allowed categories
Source code in bofire/data_models/features/categorical.py
def get_forbidden_categories(self):\n \"\"\"Returns the non-allowed categories\n\n Returns:\n List[str]: List of the non-allowed categories\n \"\"\"\n return list(set(self.categories) - set(self.get_allowed_categories()))\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.get_possible_categories","title":"get_possible_categories(self, values)
","text":"Return the superset of categories that have been used in the experimental dataset and that can be used in the optimization
Parameters:
values (pd.Series): Series with the values for this feature. Required.
Returns:
list: list of possible categories
Source code in bofire/data_models/features/categorical.py
def get_possible_categories(self, values: pd.Series) -> list:\n \"\"\"Return the superset of categories that have been used in the experimental dataset and\n that can be used in the optimization\n\n Args:\n values (pd.Series): Series with the values for this feature\n\n Returns:\n list: list of possible categories\n \"\"\"\n return sorted(set(list(set(values.tolist())) + self.get_allowed_categories()))\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.is_fixed","title":"is_fixed(self)
","text":"Returns True if there is only one allowed category.
Returns:
bool: True if there is only one allowed category
Source code in bofire/data_models/features/categorical.py
def is_fixed(self) -> bool:\n \"\"\"Returns True if there is only one allowed category.\n\n Returns:\n [bool]: True if there is only one allowed category\n \"\"\"\n if self.allowed is None:\n return False\n return sum(self.allowed) == 1\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.sample","title":"sample(self, n, seed=None)
","text":"Draw random samples from the feature.
Parameters:
n (int): number of samples. Required.
seed (Optional[int]): seed for the random number generator. Defaults to None.
Returns:
pd.Series: drawn samples.
Source code in bofire/data_models/features/categorical.py
def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key,\n data=np.random.default_rng(seed=seed).choice(\n self.get_allowed_categories(), n\n ),\n )\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.to_dummy_encoding","title":"to_dummy_encoding(self, values)
","text":"Converts values to a dummy-hot encoding, dropping the first categorical level.
Parameters:
Name Type Description Defaultvalues
pd.Series
Series to be transformed.
requiredReturns:
Type Descriptionpd.DataFrame
Dummy-hot transformed data frame.
Source code inbofire/data_models/features/categorical.py
def to_dummy_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to a dummy-hot encoding, dropping the first categorical level.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.DataFrame: Dummy-hot transformed data frame.\n \"\"\"\n return pd.DataFrame(\n {get_encoded_name(self.key, c): values == c for c in self.categories[1:]},\n dtype=float,\n index=values.index,\n )\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.to_onehot_encoding","title":"to_onehot_encoding(self, values)
","text":"Converts values to a one-hot encoding.
Parameters:
Name Type Description Defaultvalues
pd.Series
Series to be transformed.
requiredReturns:
Type Descriptionpd.DataFrame
One-hot transformed data frame.
Source code inbofire/data_models/features/categorical.py
def to_onehot_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to a one-hot encoding.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.DataFrame: One-hot transformed data frame.\n \"\"\"\n return pd.DataFrame(\n {get_encoded_name(self.key, c): values == c for c in self.categories},\n dtype=float,\n index=values.index,\n )\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.to_ordinal_encoding","title":"to_ordinal_encoding(self, values)
","text":"Converts values to an ordinal integer based encoding.
Parameters:
Name Type Description Defaultvalues
pd.Series
Series to be transformed.
requiredReturns:
Type Descriptionpd.Series
Ordinal encoded values.
Source code inbofire/data_models/features/categorical.py
def to_ordinal_encoding(self, values: pd.Series) -> pd.Series:\n \"\"\"Converts values to an ordinal integer based encoding.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.Series: Ordinal encoded values.\n \"\"\"\n enc = pd.Series(range(len(self.categories)), index=list(self.categories))\n s = enc[values]\n s.index = values.index\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.validate_candidental","title":"validate_candidental(self, values)
","text":"Method to validate the suggested candidates
Parameters:
values (pd.Series): A dataFrame with candidates. Required.
Exceptions:
ValueError: when not all values for a feature are one of the allowed categories
Returns:
pd.Series: The passed dataFrame with candidates
Source code in bofire/data_models/features/categorical.py
def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the suggested candidates\n\n Args:\n values (pd.Series): A dataFrame with candidates\n\n Raises:\n ValueError: when not all values for a feature are one of the allowed categories\n\n Returns:\n pd.Series: The passed dataFrame with candidates\n \"\"\"\n values = values.map(str)\n if sum(values.isin(self.get_allowed_categories())) != len(values):\n raise ValueError(\n f\"not all values of input feature `{self.key}` are a valid allowed category from {self.get_allowed_categories()}\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.validate_experimental","title":"validate_experimental(self, values, strict=False)
","text":"Method to validate the experimental dataFrame
Parameters:
values (pd.Series): A dataFrame with experiments. Required.
strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.
Exceptions:
ValueError: when an entry is not in the list of allowed categories
ValueError: when there is no variation in a feature provided by the experimental data
Returns:
pd.Series: A dataFrame with experiments
Source code in bofire/data_models/features/categorical.py
def validate_experimental(\n    self, values: pd.Series, strict: bool = False\n) -> pd.Series:\n    \"\"\"Method to validate the experimental dataFrame\n\n    Args:\n        values (pd.Series): A dataFrame with experiments\n        strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.\n\n    Raises:\n        ValueError: when an entry is not in the list of allowed categories\n        ValueError: when there is no variation in a feature provided by the experimental data\n\n    Returns:\n        pd.Series: A dataFrame with experiments\n    \"\"\"\n    values = values.map(str)\n    if sum(values.isin(self.categories)) != len(values):\n        raise ValueError(\n            f\"invalid values for `{self.key}`, allowed are: `{self.categories}`\"\n        )\n    if strict:\n        possible_categories = self.get_possible_categories(values)\n        if len(possible_categories) != len(self.categories):\n            raise ValueError(\n                f\"Categories {list(set(self.categories)-set(possible_categories))} of feature {self.key} not used. Remove them.\"\n            )\n    return values\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalOutput","title":" CategoricalOutput (Output)
","text":"Source code in bofire/data_models/features/categorical.py
class CategoricalOutput(Output):\n type: Literal[\"CategoricalOutput\"] = \"CategoricalOutput\"\n order_id: ClassVar[int] = 10\n\n categories: CategoryVals\n objective: AnyCategoricalObjective\n\n @model_validator(mode=\"after\")\n def validate_objective_categories(self):\n \"\"\"validates that objective categories match the output categories\n\n Raises:\n ValueError: when categories do not match objective categories\n\n Returns:\n self\n \"\"\"\n if self.objective.categories != self.categories: # type: ignore\n raise ValueError(\"categories must match to objective categories\")\n return self\n\n def __call__(self, values: pd.Series) -> pd.Series:\n if self.objective is None:\n return pd.Series(\n data=[np.nan for _ in range(len(values))],\n index=values.index,\n name=values.name,\n )\n return self.objective(values) # type: ignore\n\n def validate_experimental(self, values: pd.Series) -> pd.Series:\n values = values.map(str)\n if sum(values.isin(self.categories)) != len(values):\n raise ValueError(\n f\"invalid values for `{self.key}`, allowed are: `{self.categories}`\"\n )\n return values\n\n def __str__(self) -> str:\n return \"CategoricalOutputFeature\"\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalOutput.validate_experimental","title":"validate_experimental(self, values)
","text":"Abstract method to validate the experimental Series
Parameters:
values (pd.Series): A dataFrame with values for the outcome. Required.
Returns:
pd.Series: The passed dataFrame with experiments
Source code in bofire/data_models/features/categorical.py
def validate_experimental(self, values: pd.Series) -> pd.Series:\n values = values.map(str)\n if sum(values.isin(self.categories)) != len(values):\n raise ValueError(\n f\"invalid values for `{self.key}`, allowed are: `{self.categories}`\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalOutput.validate_objective_categories","title":"validate_objective_categories(self)
","text":"validates that objective categories match the output categories
Exceptions:
ValueError: when categories do not match objective categories
Returns:
self
Source code in bofire/data_models/features/categorical.py
@model_validator(mode=\"after\")\ndef validate_objective_categories(self):\n \"\"\"validates that objective categories match the output categories\n\n Raises:\n ValueError: when categories do not match objective categories\n\n Returns:\n self\n \"\"\"\n if self.objective.categories != self.categories: # type: ignore\n raise ValueError(\"categories must match to objective categories\")\n return self\n
"},{"location":"ref-features/#bofire.data_models.features.continuous","title":"continuous
","text":""},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput","title":" ContinuousInput (NumericalInput)
","text":"Base class for all continuous input features.
Attributes:
bounds (Tuple[float, float]): A tuple that stores the lower and upper bound of the feature.
stepsize (float, optional): Float indicating the allowed stepsize between lower and upper. Defaults to None.
local_relative_bounds (Tuple[float, float], optional): A tuple that stores the lower and upper bounds relative to a reference value. Defaults to None.
Source code in bofire/data_models/features/continuous.py
class ContinuousInput(NumericalInput):\n \"\"\"Base class for all continuous input features.\n\n Attributes:\n bounds (Tuple[float, float]): A tuple that stores the lower and upper bound of the feature.\n stepsize (float, optional): Float indicating the allowed stepsize between lower and upper. Defaults to None.\n local_relative_bounds (Tuple[float, float], optional): A tuple that stores the lower and upper bounds relative to a reference value.\n Defaults to None.\n \"\"\"\n\n type: Literal[\"ContinuousInput\"] = \"ContinuousInput\"\n order_id: ClassVar[int] = 1\n\n bounds: Tuple[float, float]\n local_relative_bounds: Optional[\n Tuple[Annotated[float, Field(gt=0)], Annotated[float, Field(gt=0)]]\n ] = None\n stepsize: Optional[float] = None\n\n @property\n def lower_bound(self) -> float:\n return self.bounds[0]\n\n @property\n def upper_bound(self) -> float:\n return self.bounds[1]\n\n @model_validator(mode=\"after\")\n def validate_step_size(self):\n if self.stepsize is None:\n return self\n lower, upper = self.bounds\n if lower == upper and self.stepsize is not None:\n raise ValueError(\n \"Stepsize cannot be provided for a fixed continuous input.\"\n )\n range = upper - lower\n if np.arange(lower, upper + self.stepsize, self.stepsize)[-1] != upper:\n raise ValueError(\n f\"Stepsize of {self.stepsize} does not match the provided interval [{lower},{upper}].\"\n )\n if range // self.stepsize == 1:\n raise ValueError(\"Stepsize is too big, only one value allowed.\")\n return self\n\n def round(self, values: pd.Series) -> pd.Series:\n \"\"\"Round values to the stepsize of the feature. If no stepsize is provided return the\n provided values.\n\n Args:\n values (pd.Series): The values that should be rounded.\n\n Returns:\n pd.Series: The rounded values\n \"\"\"\n if self.stepsize is None:\n return values\n self.validate_candidental(values=values)\n allowed_values = np.arange(\n self.lower_bound, self.upper_bound + self.stepsize, self.stepsize\n )\n idx = abs(values.values.reshape([len(values), 1]) - allowed_values).argmin( # type: ignore\n axis=1\n )\n return pd.Series(\n data=self.lower_bound + idx * self.stepsize, index=values.index\n )\n\n @field_validator(\"bounds\")\n @classmethod\n def validate_lower_upper(cls, bounds):\n \"\"\"Validates that the lower bound is lower than the upper bound\n\n Args:\n values (Dict): Dictionary with attributes key, lower and upper bound\n\n Raises:\n ValueError: when the lower bound is higher than the upper bound\n\n Returns:\n Dict: The attributes as dictionary\n \"\"\"\n if bounds[0] > bounds[1]:\n raise ValueError(\n f\"lower bound must be <= upper bound, got {bounds[0]} > {bounds[1]}\"\n )\n return bounds\n\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the suggested candidates\n\n Args:\n values (pd.Series): A dataFrame with candidates\n\n Raises:\n ValueError: when non numerical values are passed\n ValueError: when values are larger than the upper bound of the feature\n ValueError: when values are lower than the lower bound of the feature\n\n Returns:\n pd.Series: The passed dataFrame with candidates\n \"\"\"\n\n noise = 10e-6\n values = super().validate_candidental(values)\n if (values < self.lower_bound - noise).any():\n raise ValueError(\n f\"not all values of input feature `{self.key}`are larger than lower bound `{self.lower_bound}` \"\n )\n if (values > self.upper_bound + noise).any():\n raise ValueError(\n f\"not all values of input feature `{self.key}`are smaller than upper bound 
`{self.upper_bound}` \"\n )\n return values\n\n def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key,\n data=np.random.default_rng(seed=seed).uniform(\n self.lower_bound, self.upper_bound, n\n ),\n )\n\n def get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[float] = None,\n ) -> Tuple[List[float], List[float]]:\n assert transform_type is None\n if reference_value is not None and values is not None:\n raise ValueError(\"Only one can be used, `local_value` or `values`.\")\n if values is None:\n if reference_value is None or self.is_fixed():\n return [self.lower_bound], [self.upper_bound]\n else:\n local_relative_bounds = self.local_relative_bounds or (\n math.inf,\n math.inf,\n )\n return [\n max(\n reference_value - local_relative_bounds[0],\n self.lower_bound,\n )\n ], [\n min(\n reference_value + local_relative_bounds[1],\n self.upper_bound,\n )\n ]\n lower = min(self.lower_bound, values.min()) # type: ignore\n upper = max(self.upper_bound, values.max()) # type: ignore\n return [lower], [upper]\n\n def __str__(self) -> str:\n \"\"\"Method to return a string of lower and upper bound\n\n Returns:\n str: String of a list with lower and upper bound\n \"\"\"\n return f\"[{self.lower_bound},{self.upper_bound}]\"\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.__str__","title":"__str__(self)
special
","text":"Method to return a string of lower and upper bound
Returns:
str: String of a list with lower and upper bound
Source code in bofire/data_models/features/continuous.py
def __str__(self) -> str:\n \"\"\"Method to return a string of lower and upper bound\n\n Returns:\n str: String of a list with lower and upper bound\n \"\"\"\n return f\"[{self.lower_bound},{self.upper_bound}]\"\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.get_bounds","title":"get_bounds(self, transform_type=None, values=None, reference_value=None)
","text":"Returns the bounds of an input feature depending on the requested transform type.
Parameters:
transform_type (Optional[TTransform]): The requested transform type. Defaults to None.
values (Optional[pd.Series]): If values are provided the bounds are returned taking the most extreme values for the feature into account. Defaults to None.
reference_value (Optional[float]): If a reference value is provided, then the local bounds based on a local search region are provided. Currently only supported for continuous inputs. For more details, see https://www.merl.com/publications/docs/TR2023-057.pdf. Defaults to None.
Returns:
Tuple[List[float], List[float]]: List of lower bound values, list of upper bound values.
Source code in bofire/data_models/features/continuous.py
def get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[float] = None,\n) -> Tuple[List[float], List[float]]:\n assert transform_type is None\n if reference_value is not None and values is not None:\n raise ValueError(\"Only one can be used, `local_value` or `values`.\")\n if values is None:\n if reference_value is None or self.is_fixed():\n return [self.lower_bound], [self.upper_bound]\n else:\n local_relative_bounds = self.local_relative_bounds or (\n math.inf,\n math.inf,\n )\n return [\n max(\n reference_value - local_relative_bounds[0],\n self.lower_bound,\n )\n ], [\n min(\n reference_value + local_relative_bounds[1],\n self.upper_bound,\n )\n ]\n lower = min(self.lower_bound, values.min()) # type: ignore\n upper = max(self.upper_bound, values.max()) # type: ignore\n return [lower], [upper]\n
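A minimal sketch of the local search-region logic above (import path assumed): with local_relative_bounds set, passing a reference value yields an interval around it, clipped to the global bounds.

```python
# Sketch only: import path is assumed.
from bofire.data_models.features.api import ContinuousInput

x = ContinuousInput(key="x", bounds=(0.0, 10.0), local_relative_bounds=(1.0, 2.0))
x.get_bounds(reference_value=3.0)  # -> ([2.0], [5.0])
x.get_bounds(reference_value=0.5)  # -> ([0.0], [2.5]), clipped at the global lower bound
```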
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.round","title":"round(self, values)
","text":"Round values to the stepsize of the feature. If no stepsize is provided return the provided values.
Parameters:
values (pd.Series): The values that should be rounded. Required.
Returns:
pd.Series: The rounded values
Source code in bofire/data_models/features/continuous.py
def round(self, values: pd.Series) -> pd.Series:\n \"\"\"Round values to the stepsize of the feature. If no stepsize is provided return the\n provided values.\n\n Args:\n values (pd.Series): The values that should be rounded.\n\n Returns:\n pd.Series: The rounded values\n \"\"\"\n if self.stepsize is None:\n return values\n self.validate_candidental(values=values)\n allowed_values = np.arange(\n self.lower_bound, self.upper_bound + self.stepsize, self.stepsize\n )\n idx = abs(values.values.reshape([len(values), 1]) - allowed_values).argmin( # type: ignore\n axis=1\n )\n return pd.Series(\n data=self.lower_bound + idx * self.stepsize, index=values.index\n )\n
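A minimal sketch (import path assumed): values snap to the nearest point of the stepsize grid.

```python
# Sketch only: import path is assumed.
import pandas as pd

from bofire.data_models.features.api import ContinuousInput

x = ContinuousInput(key="x", bounds=(0.0, 1.0), stepsize=0.25)
# Grid is {0.0, 0.25, 0.5, 0.75, 1.0}; each value maps to its nearest point.
x.round(pd.Series([0.1, 0.3, 0.6]))  # -> 0.0, 0.25, 0.5
```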
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.sample","title":"sample(self, n, seed=None)
","text":"Draw random samples from the feature.
Parameters:
n (int): number of samples. Required.
seed (Optional[int]): seed for the random number generator. Defaults to None.
Returns:
pd.Series: drawn samples.
Source code in bofire/data_models/features/continuous.py
def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key,\n data=np.random.default_rng(seed=seed).uniform(\n self.lower_bound, self.upper_bound, n\n ),\n )\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.validate_candidental","title":"validate_candidental(self, values)
","text":"Method to validate the suggested candidates
Parameters:
values (pd.Series): A dataFrame with candidates. Required.
Exceptions:
ValueError: when non-numerical values are passed
ValueError: when values are larger than the upper bound of the feature
ValueError: when values are lower than the lower bound of the feature
Returns:
pd.Series: The passed dataFrame with candidates
Source code in bofire/data_models/features/continuous.py
def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the suggested candidates\n\n Args:\n values (pd.Series): A dataFrame with candidates\n\n Raises:\n ValueError: when non numerical values are passed\n ValueError: when values are larger than the upper bound of the feature\n ValueError: when values are lower than the lower bound of the feature\n\n Returns:\n pd.Series: The passed dataFrame with candidates\n \"\"\"\n\n noise = 10e-6\n values = super().validate_candidental(values)\n if (values < self.lower_bound - noise).any():\n raise ValueError(\n f\"not all values of input feature `{self.key}`are larger than lower bound `{self.lower_bound}` \"\n )\n if (values > self.upper_bound + noise).any():\n raise ValueError(\n f\"not all values of input feature `{self.key}`are smaller than upper bound `{self.upper_bound}` \"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.validate_lower_upper","title":"validate_lower_upper(bounds)
classmethod
","text":"Validates that the lower bound is lower than the upper bound
Parameters:
bounds (Tuple[float, float]): Tuple with the lower and upper bound. Required.
Exceptions:
ValueError: when the lower bound is higher than the upper bound
Returns:
Tuple[float, float]: The validated bounds
Source code in bofire/data_models/features/continuous.py
@field_validator(\"bounds\")\n@classmethod\ndef validate_lower_upper(cls, bounds):\n \"\"\"Validates that the lower bound is lower than the upper bound\n\n Args:\n values (Dict): Dictionary with attributes key, lower and upper bound\n\n Raises:\n ValueError: when the lower bound is higher than the upper bound\n\n Returns:\n Dict: The attributes as dictionary\n \"\"\"\n if bounds[0] > bounds[1]:\n raise ValueError(\n f\"lower bound must be <= upper bound, got {bounds[0]} > {bounds[1]}\"\n )\n return bounds\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousOutput","title":" ContinuousOutput (Output)
","text":"The base class for a continuous output feature
Attributes:
objective (objective, optional): objective of the feature indicating in which direction it should be optimized. Defaults to MaximizeObjective.
Source code in bofire/data_models/features/continuous.py
class ContinuousOutput(Output):\n    \"\"\"The base class for a continuous output feature\n\n    Attributes:\n        objective (objective, optional): objective of the feature indicating in which direction it should be optimized. Defaults to `MaximizeObjective`.\n    \"\"\"\n\n    type: Literal[\"ContinuousOutput\"] = \"ContinuousOutput\"\n    order_id: ClassVar[int] = 9\n    unit: Optional[str] = None\n\n    objective: Optional[AnyObjective] = Field(\n        default_factory=lambda: MaximizeObjective(w=1.0)\n    )\n\n    def __call__(self, values: pd.Series) -> pd.Series:\n        if self.objective is None:\n            return pd.Series(\n                data=[np.nan for _ in range(len(values))],\n                index=values.index,\n                name=values.name,\n            )\n        return self.objective(values)  # type: ignore\n\n    def validate_experimental(self, values: pd.Series) -> pd.Series:\n        try:\n            values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n        except ValueError:\n            raise ValueError(\n                f\"not all values of input feature `{self.key}` are numerical\"\n            )\n        return values\n\n    def __str__(self) -> str:\n        return \"ContinuousOutputFeature\"\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousOutput.validate_experimental","title":"validate_experimental(self, values)
","text":"Abstract method to validate the experimental Series
Parameters:
values (pd.Series): A dataFrame with values for the outcome. Required.
Returns:
pd.Series: The passed dataFrame with experiments
Source code in bofire/data_models/features/continuous.py
def validate_experimental(self, values: pd.Series) -> pd.Series:\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n return values\n
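A minimal sketch (import paths assumed): numeric strings are coerced to float64, and calling the feature applies its objective to the measured values.

```python
# Sketch only: import paths are assumed.
import pandas as pd

from bofire.data_models.features.api import ContinuousOutput
from bofire.data_models.objectives.api import MinimizeObjective

y = ContinuousOutput(key="y", objective=MinimizeObjective(w=1.0))
y.validate_experimental(pd.Series(["1.0", "2.5"]))  # coerced to a float64 series
y(pd.Series([1.0, 2.5]))  # objective values of the measurements
```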
"},{"location":"ref-features/#bofire.data_models.features.descriptor","title":"descriptor
","text":""},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput","title":" CategoricalDescriptorInput (CategoricalInput)
","text":"Class for categorical input features with descriptors
Attributes:
categories (List[str]): Names of the categories.
allowed (List[bool]): List of bools indicating if a category is allowed within the optimization.
descriptors (List[str]): List of strings representing the names of the descriptors.
values (List[List[float]]): List of lists representing the descriptor values.
Source code in bofire/data_models/features/descriptor.py
class CategoricalDescriptorInput(CategoricalInput):\n \"\"\"Class for categorical input features with descriptors\n\n Attributes:\n categories (List[str]): Names of the categories.\n allowed (List[bool]): List of bools indicating if a category is allowed within the optimization.\n descriptors (List[str]): List of strings representing the names of the descriptors.\n values (List[List[float]]): List of lists representing the descriptor values.\n \"\"\"\n\n type: Literal[\"CategoricalDescriptorInput\"] = \"CategoricalDescriptorInput\"\n order_id: ClassVar[int] = 6\n\n descriptors: Descriptors\n values: Annotated[\n List[List[float]],\n Field(min_length=1),\n ]\n\n @field_validator(\"values\")\n @classmethod\n def validate_values(cls, v, info):\n \"\"\"validates the compatability of passed values for the descriptors and the defined categories\n\n Args:\n v (List[List[float]]): Nested list with descriptor values\n values (Dict): Dictionary with attributes\n\n Raises:\n ValueError: when values have different length than categories\n ValueError: when rows in values have different length than descriptors\n ValueError: when a descriptor shows no variance in the data\n\n Returns:\n List[List[float]]: Nested list with descriptor values\n \"\"\"\n if len(v) != len(info.data[\"categories\"]):\n raise ValueError(\"values must have same length as categories\")\n for row in v:\n if len(row) != len(info.data[\"descriptors\"]):\n raise ValueError(\"rows in values must have same length as descriptors\")\n a = np.array(v)\n for i, d in enumerate(info.data[\"descriptors\"]):\n if len(set(a[:, i])) == 1:\n raise ValueError(f\"No variation for descriptor {d}.\")\n return v\n\n @staticmethod\n def valid_transform_types() -> List[CategoricalEncodingEnum]:\n return [\n CategoricalEncodingEnum.ONE_HOT,\n CategoricalEncodingEnum.DUMMY,\n CategoricalEncodingEnum.ORDINAL,\n CategoricalEncodingEnum.DESCRIPTOR,\n ]\n\n def to_df(self):\n \"\"\"tabular overview of the feature as DataFrame\n\n Returns:\n pd.DataFrame: tabular overview of the feature as DataFrame\n \"\"\"\n data = dict(zip(self.categories, self.values))\n return pd.DataFrame.from_dict(data, orient=\"index\", columns=self.descriptors)\n\n def fixed_value(\n self, transform_type: Optional[TTransform] = None\n ) -> Union[List[str], List[float], None]:\n \"\"\"Returns the categories to which the feature is fixed, None if the feature is not fixed\n\n Returns:\n List[str]: List of categories or None\n \"\"\"\n if transform_type != CategoricalEncodingEnum.DESCRIPTOR:\n return super().fixed_value(transform_type)\n else:\n val = self.get_allowed_categories()[0]\n return self.to_descriptor_encoding(pd.Series([val])).values[0].tolist()\n\n def get_bounds(\n self,\n transform_type: TTransform,\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n ) -> Tuple[List[float], List[float]]:\n if transform_type != CategoricalEncodingEnum.DESCRIPTOR:\n return super().get_bounds(transform_type, values)\n else:\n # in case that values is None, we return the optimization bounds\n # else we return the complete bounds\n if values is None:\n df = self.to_df().loc[self.get_allowed_categories()]\n else:\n df = self.to_df()\n lower = df.min().values.tolist() # type: ignore\n upper = df.max().values.tolist() # type: ignore\n return lower, upper\n\n def validate_experimental(\n self, values: pd.Series, strict: bool = False\n ) -> pd.Series:\n \"\"\"Method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict 
(bool, optional): Boolean to distinguish if the occurence of fixed features in the dataset should be considered or not. Defaults to False.\n\n Raises:\n ValueError: when an entry is not in the list of allowed categories\n ValueError: when there is no variation in a feature provided by the experimental data\n ValueError: when no variation is present or planed for a given descriptor\n\n Returns:\n pd.Series: A dataFrame with experiments\n \"\"\"\n values = super().validate_experimental(values, strict)\n if strict:\n lower, upper = self.get_bounds(\n transform_type=CategoricalEncodingEnum.DESCRIPTOR, values=values\n )\n for i, desc in enumerate(self.descriptors):\n if lower[i] == upper[i]:\n raise ValueError(\n f\"No variation present or planned for descriptor {desc} for feature {self.key}. Remove the descriptor.\"\n )\n return values\n\n @classmethod\n def from_df(cls, key: str, df: pd.DataFrame):\n \"\"\"Creates a feature from a dataframe\n\n Args:\n key (str): The name of the feature\n df (pd.DataFrame): Categories as rows and descriptors as columns\n\n Returns:\n _type_: _description_\n \"\"\"\n return cls(\n key=key,\n categories=list(df.index),\n allowed=[True for _ in range(len(df))],\n descriptors=list(df.columns),\n values=df.values.tolist(),\n )\n\n def to_descriptor_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to descriptor encoding.\n\n Args:\n values (pd.Series): Values to transform.\n\n Returns:\n pd.DataFrame: Descriptor encoded dataframe.\n \"\"\"\n return pd.DataFrame(\n data=values.map(dict(zip(self.categories, self.values))).values.tolist(), # type: ignore\n columns=[get_encoded_name(self.key, d) for d in self.descriptors],\n index=values.index,\n )\n\n def from_descriptor_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Converts values back from descriptor encoding.\n\n Args:\n values (pd.DataFrame): Descriptor encoded dataframe.\n\n Raises:\n ValueError: If descriptor columns not found in the dataframe.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, d) for d in self.descriptors]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = pd.DataFrame(\n data=np.sqrt(\n np.sum(\n (\n values[cat_cols].to_numpy()[:, np.newaxis, :]\n - self.to_df().iloc[self.allowed].to_numpy()\n )\n ** 2,\n axis=2,\n )\n ),\n columns=self.get_allowed_categories(),\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.fixed_value","title":"fixed_value(self, transform_type=None)
","text":"Returns the categories to which the feature is fixed, None if the feature is not fixed
Returns:
Type DescriptionList[str]
List of categories or None
Source code inbofire/data_models/features/descriptor.py
def fixed_value(\n self, transform_type: Optional[TTransform] = None\n) -> Union[List[str], List[float], None]:\n \"\"\"Returns the categories to which the feature is fixed, None if the feature is not fixed\n\n Returns:\n List[str]: List of categories or None\n \"\"\"\n if transform_type != CategoricalEncodingEnum.DESCRIPTOR:\n return super().fixed_value(transform_type)\n else:\n val = self.get_allowed_categories()[0]\n return self.to_descriptor_encoding(pd.Series([val])).values[0].tolist()\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.from_descriptor_encoding","title":"from_descriptor_encoding(self, values)
","text":"Converts values back from descriptor encoding.
Parameters:
Name Type Description Defaultvalues
pd.DataFrame
Descriptor encoded dataframe.
requiredExceptions:
Type DescriptionValueError
If descriptor columns not found in the dataframe.
Returns:
Type Descriptionpd.Series
Series with categorical values.
Source code inbofire/data_models/features/descriptor.py
def from_descriptor_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Converts values back from descriptor encoding.\n\n Args:\n values (pd.DataFrame): Descriptor encoded dataframe.\n\n Raises:\n ValueError: If descriptor columns not found in the dataframe.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, d) for d in self.descriptors]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = pd.DataFrame(\n data=np.sqrt(\n np.sum(\n (\n values[cat_cols].to_numpy()[:, np.newaxis, :]\n - self.to_df().iloc[self.allowed].to_numpy()\n )\n ** 2,\n axis=2,\n )\n ),\n columns=self.get_allowed_categories(),\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n
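Example: a round trip through the descriptor encoding, continuing the hypothetical feat from the sketch above; a slightly perturbed encoding is mapped back to the closest category in descriptor space:
enc = feat.to_descriptor_encoding(pd.Series([\"water\"]))\nenc[\"solvent_polarity\"] += 0.05 # small perturbation\nprint(feat.from_descriptor_encoding(enc)) # snaps back to \"water\"\n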
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.from_df","title":"from_df(key, df)
classmethod
","text":"Creates a feature from a dataframe
Parameters:
Name Type Description Defaultkey
str
The name of the feature
requireddf
pd.DataFrame
Categories as rows and descriptors as columns
requiredReturns:
Type DescriptionCategoricalDescriptorInput
The new feature
Source code inbofire/data_models/features/descriptor.py
@classmethod\ndef from_df(cls, key: str, df: pd.DataFrame):\n \"\"\"Creates a feature from a dataframe\n\n Args:\n key (str): The name of the feature\n df (pd.DataFrame): Categories as rows and descriptors as columns\n\n Returns:\n CategoricalDescriptorInput: The new feature\n \"\"\"\n return cls(\n key=key,\n categories=list(df.index),\n allowed=[True for _ in range(len(df))],\n descriptors=list(df.columns),\n values=df.values.tolist(),\n )\n
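Example: the same hypothetical feature built from a dataframe with categories as rows and descriptors as columns:
import pandas as pd\nfrom bofire.data_models.features.descriptor import CategoricalDescriptorInput\n\ndf = pd.DataFrame(\n {\"polarity\": [1.0, 0.65], \"viscosity\": [0.89, 1.07]},\n index=[\"water\", \"ethanol\"],\n)\nfeat = CategoricalDescriptorInput.from_df(\"solvent\", df)\n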
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.get_bounds","title":"get_bounds(self, transform_type, values=None, reference_value=None)
","text":"Returns the bounds of an input feature depending on the requested transform type.
Parameters:
Name Type Description Defaulttransform_type
Optional[TTransform]
The requested transform type. Defaults to None.
requiredvalues
Optional[pd.Series]
If values are provided the bounds are returned taking the most extreme values for the feature into account. Defaults to None.
None
reference_value
Optional[float]
If a reference value is provided, then the local bounds based on a local search region are provided. Currently only supported for continuous inputs. For more details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
List of lower bound values, list of upper bound values.
Source code inbofire/data_models/features/descriptor.py
def get_bounds(\n self,\n transform_type: TTransform,\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n) -> Tuple[List[float], List[float]]:\n if transform_type != CategoricalEncodingEnum.DESCRIPTOR:\n return super().get_bounds(transform_type, values)\n else:\n # in case that values is None, we return the optimization bounds\n # else we return the complete bounds\n if values is None:\n df = self.to_df().loc[self.get_allowed_categories()]\n else:\n df = self.to_df()\n lower = df.min().values.tolist() # type: ignore\n upper = df.max().values.tolist() # type: ignore\n return lower, upper\n
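Example: bounds under the descriptor encoding are the per-descriptor minima and maxima over the categories (continuing the sketch above; the import path of CategoricalEncodingEnum is assumed):
from bofire.data_models.enum import CategoricalEncodingEnum # import path assumed\n\nlower, upper = feat.get_bounds(transform_type=CategoricalEncodingEnum.DESCRIPTOR)\n# for the sketch above: lower == [0.65, 0.89], upper == [1.0, 1.07]\n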
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.to_descriptor_encoding","title":"to_descriptor_encoding(self, values)
","text":"Converts values to descriptor encoding.
Parameters:
Name Type Description Defaultvalues
pd.Series
Values to transform.
requiredReturns:
Type Descriptionpd.DataFrame
Descriptor encoded dataframe.
Source code inbofire/data_models/features/descriptor.py
def to_descriptor_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to descriptor encoding.\n\n Args:\n values (pd.Series): Values to transform.\n\n Returns:\n pd.DataFrame: Descriptor encoded dataframe.\n \"\"\"\n return pd.DataFrame(\n data=values.map(dict(zip(self.categories, self.values))).values.tolist(), # type: ignore\n columns=[get_encoded_name(self.key, d) for d in self.descriptors],\n index=values.index,\n )\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.to_df","title":"to_df(self)
","text":"tabular overview of the feature as DataFrame
Returns:
Type Descriptionpd.DataFrame
tabular overview of the feature as DataFrame
Source code inbofire/data_models/features/descriptor.py
def to_df(self):\n \"\"\"tabular overview of the feature as DataFrame\n\n Returns:\n pd.DataFrame: tabular overview of the feature as DataFrame\n \"\"\"\n data = dict(zip(self.categories, self.values))\n return pd.DataFrame.from_dict(data, orient=\"index\", columns=self.descriptors)\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.validate_experimental","title":"validate_experimental(self, values, strict=False)
","text":"Method to validate the experimental dataFrame
Parameters:
Name Type Description Defaultvalues
pd.Series
A series with experiments
requiredstrict
bool
Boolean to distinguish whether the occurrence of fixed features in the dataset should be considered or not. Defaults to False.
False
Exceptions:
Type DescriptionValueError
when an entry is not in the list of allowed categories
ValueError
when there is no variation in a feature provided by the experimental data
ValueError
when no variation is present or planned for a given descriptor
Returns:
Type Descriptionpd.Series
A series with experiments
Source code inbofire/data_models/features/descriptor.py
def validate_experimental(\n self, values: pd.Series, strict: bool = False\n) -> pd.Series:\n \"\"\"Method to validate the experimental data\n\n Args:\n values (pd.Series): A series with experiments\n strict (bool, optional): Boolean to distinguish whether the occurrence of fixed features in the dataset should be considered or not. Defaults to False.\n\n Raises:\n ValueError: when an entry is not in the list of allowed categories\n ValueError: when there is no variation in a feature provided by the experimental data\n ValueError: when no variation is present or planned for a given descriptor\n\n Returns:\n pd.Series: A series with experiments\n \"\"\"\n values = super().validate_experimental(values, strict)\n if strict:\n lower, upper = self.get_bounds(\n transform_type=CategoricalEncodingEnum.DESCRIPTOR, values=values\n )\n for i, desc in enumerate(self.descriptors):\n if lower[i] == upper[i]:\n raise ValueError(\n f\"No variation present or planned for descriptor {desc} for feature {self.key}. Remove the descriptor.\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.validate_values","title":"validate_values(v, info)
classmethod
","text":"validates the compatability of passed values for the descriptors and the defined categories
Parameters:
Name Type Description Defaultv
List[List[float]]
Nested list with descriptor values
requiredinfo
ValidationInfo
Validation context holding the already validated attributes
requiredExceptions:
Type DescriptionValueError
when values and categories have different lengths
ValueError
when a row in values and descriptors have different lengths
ValueError
when a descriptor shows no variance in the data
Returns:
Type DescriptionList[List[float]]
Nested list with descriptor values
Source code inbofire/data_models/features/descriptor.py
@field_validator(\"values\")\n@classmethod\ndef validate_values(cls, v, info):\n \"\"\"validates the compatability of passed values for the descriptors and the defined categories\n\n Args:\n v (List[List[float]]): Nested list with descriptor values\n values (Dict): Dictionary with attributes\n\n Raises:\n ValueError: when values have different length than categories\n ValueError: when rows in values have different length than descriptors\n ValueError: when a descriptor shows no variance in the data\n\n Returns:\n List[List[float]]: Nested list with descriptor values\n \"\"\"\n if len(v) != len(info.data[\"categories\"]):\n raise ValueError(\"values must have same length as categories\")\n for row in v:\n if len(row) != len(info.data[\"descriptors\"]):\n raise ValueError(\"rows in values must have same length as descriptors\")\n a = np.array(v)\n for i, d in enumerate(info.data[\"descriptors\"]):\n if len(set(a[:, i])) == 1:\n raise ValueError(f\"No variation for descriptor {d}.\")\n return v\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.ContinuousDescriptorInput","title":" ContinuousDescriptorInput (ContinuousInput)
","text":"Class for continuous input features with descriptors
Attributes:
Name Type Descriptionlower_bound
float
Lower bound of the feature in the optimization.
upper_bound
float
Upper bound of the feature in the optimization.
descriptors
List[str]
Names of the descriptors.
values
List[float]
Values of the descriptors.
Source code inbofire/data_models/features/descriptor.py
class ContinuousDescriptorInput(ContinuousInput):\n \"\"\"Class for continuous input features with descriptors\n\n Attributes:\n lower_bound (float): Lower bound of the feature in the optimization.\n upper_bound (float): Upper bound of the feature in the optimization.\n descriptors (List[str]): Names of the descriptors.\n values (List[float]): Values of the descriptors.\n \"\"\"\n\n type: Literal[\"ContinuousDescriptorInput\"] = \"ContinuousDescriptorInput\"\n order_id: ClassVar[int] = 2\n\n descriptors: Descriptors\n values: DiscreteVals\n\n @model_validator(mode=\"after\")\n def validate_list_lengths(self):\n \"\"\"Compares the length of the defined descriptors list with the provided values\n\n Raises:\n ValueError: when the number of descriptors does not match the number of provided values\n\n Returns:\n ContinuousDescriptorInput: The validated feature\n \"\"\"\n if len(self.descriptors) != len(self.values):\n raise ValueError(\n f\"must provide the same number of descriptors and values, got {len(self.descriptors)} != {len(self.values)}\"\n )\n return self\n\n def to_df(self) -> pd.DataFrame:\n \"\"\"tabular overview of the feature as DataFrame\n\n Returns:\n pd.DataFrame: tabular overview of the feature as DataFrame\n \"\"\"\n return pd.DataFrame(\n data=[self.values], index=[self.key], columns=self.descriptors\n )\n
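Example: a minimal sketch of a continuous feature carrying one descriptor (the key, bounds, and values are made up; the bounds constructor argument of ContinuousInput is assumed here):
from bofire.data_models.features.descriptor import ContinuousDescriptorInput\n\ntemp = ContinuousDescriptorInput(\n key=\"temperature\",\n bounds=(20, 80), # assumed ContinuousInput constructor argument\n descriptors=[\"normalized_t\"],\n values=[0.5],\n)\nprint(temp.to_df()) # one row (the key), descriptors as columns\n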
"},{"location":"ref-features/#bofire.data_models.features.descriptor.ContinuousDescriptorInput.to_df","title":"to_df(self)
","text":"tabular overview of the feature as DataFrame
Returns:
Type Descriptionpd.DataFrame
tabular overview of the feature as DataFrame
Source code inbofire/data_models/features/descriptor.py
def to_df(self) -> pd.DataFrame:\n \"\"\"tabular overview of the feature as DataFrame\n\n Returns:\n pd.DataFrame: tabular overview of the feature as DataFrame\n \"\"\"\n return pd.DataFrame(\n data=[self.values], index=[self.key], columns=self.descriptors\n )\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.ContinuousDescriptorInput.validate_list_lengths","title":"validate_list_lengths(self)
","text":"compares the length of the defined descriptors list with the provided values
Exceptions:
Type DescriptionValueError
when the number of descriptors does not match the number of provided values
Returns:
Type DescriptionContinuousDescriptorInput
The validated feature
Source code inbofire/data_models/features/descriptor.py
@model_validator(mode=\"after\")\ndef validate_list_lengths(self):\n \"\"\"compares the length of the defined descriptors list with the provided values\n\n Args:\n values (Dict): Dictionary with all attribues\n\n Raises:\n ValueError: when the number of descriptors does not math the number of provided values\n\n Returns:\n Dict: Dict with the attributes\n \"\"\"\n if len(self.descriptors) != len(self.values):\n raise ValueError(\n 'must provide same number of descriptors and values, got {len(values[\"descriptors\"])} != {len(values[\"values\"])}'\n )\n return self\n
"},{"location":"ref-features/#bofire.data_models.features.discrete","title":"discrete
","text":""},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput","title":" DiscreteInput (NumericalInput)
","text":"Feature with discretized ordinal values allowed in the optimization.
Attributes:
Name Type Descriptionkey(str)
key of the feature.
values(List[float])
the discretized allowed values during the optimization.
Source code inbofire/data_models/features/discrete.py
class DiscreteInput(NumericalInput):\n \"\"\"Feature with discretized ordinal values allowed in the optimization.\n\n Attributes:\n key(str): key of the feature.\n values(List[float]): the discretized allowed values during the optimization.\n \"\"\"\n\n type: Literal[\"DiscreteInput\"] = \"DiscreteInput\"\n order_id: ClassVar[int] = 3\n\n values: DiscreteVals\n\n @field_validator(\"values\")\n @classmethod\n def validate_values_unique(cls, values):\n \"\"\"Validates that provided values are unique.\n\n Args:\n values (List[float]): List of values\n\n Raises:\n ValueError: when values are non-unique.\n ValueError: when values contains only one entry.\n ValueError: when values is empty.\n\n Returns:\n List[float]: Sorted list of values\n \"\"\"\n if len(values) != len(set(values)):\n raise ValueError(\"Discrete values must be unique\")\n if len(values) == 1:\n raise ValueError(\n \"Fixed discrete inputs are not supported. Please use a fixed continuous input.\"\n )\n if len(values) == 0:\n raise ValueError(\"No values defined.\")\n return sorted(values)\n\n @property\n def lower_bound(self) -> float:\n \"\"\"Lower bound of the set of allowed values\"\"\"\n return min(self.values)\n\n @property\n def upper_bound(self) -> float:\n \"\"\"Upper bound of the set of allowed values\"\"\"\n return max(self.values)\n\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the provided candidates.\n\n Args:\n values (pd.Series): suggested candidates for the feature\n\n Raises:\n ValueError: Raises error when one of the provided values is not contained in the list of allowed values.\n\n Returns:\n pd.Series: suggested candidates for the feature\n \"\"\"\n values = super().validate_candidental(values)\n if not np.isin(values.to_numpy(), np.array(self.values)).all():\n raise ValueError(\n f\"Not allowed values in candidates for feature {self.key}.\"\n )\n return values\n\n def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key, data=np.random.default_rng(seed=seed).choice(self.values, n)\n )\n\n def from_continuous(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Rounds continuous values to the closest discrete ones.\n\n Args:\n values (pd.DataFrame): Dataframe with continuous entries.\n\n Returns:\n pd.Series: Series with discrete values.\n \"\"\"\n\n s = pd.DataFrame(\n data=np.abs(\n (values[self.key].to_numpy()[:, np.newaxis] - np.array(self.values))\n ),\n columns=self.values,\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n\n def get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[float] = None,\n ) -> Tuple[List[float], List[float]]:\n assert transform_type is None\n if values is None:\n return [self.lower_bound], [self.upper_bound] # type: ignore\n lower = min(self.lower_bound, values.min()) # type: ignore\n upper = max(self.upper_bound, values.max()) # type: ignore\n return [lower], [upper] # type: ignore\n
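Example: a minimal sketch of a discrete feature, drawing random samples and rounding continuous candidates to the closest allowed values (the key and values are made up):
import pandas as pd\nfrom bofire.data_models.features.discrete import DiscreteInput\n\nstages = DiscreteInput(key=\"n_stages\", values=[1.0, 2.0, 4.0, 8.0])\nprint(stages.sample(3, seed=42)) # three random draws from the allowed values\nprint(stages.from_continuous(pd.DataFrame({\"n_stages\": [1.8, 6.9]}))) # -> 2.0, 8.0\n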
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.lower_bound","title":"lower_bound: float
property
readonly
","text":"Lower bound of the set of allowed values
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.upper_bound","title":"upper_bound: float
property
readonly
","text":"Upper bound of the set of allowed values
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.from_continuous","title":"from_continuous(self, values)
","text":"Rounds continuous values to the closest discrete ones.
Parameters:
Name Type Description Defaultvalues
pd.DataFrame
Dataframe with continuous entries.
requiredReturns:
Type Descriptionpd.Series
Series with discrete values.
Source code inbofire/data_models/features/discrete.py
def from_continuous(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Rounds continuous values to the closest discrete ones.\n\n Args:\n values (pd.DataFrame): Dataframe with continuous entries.\n\n Returns:\n pd.Series: Series with discrete values.\n \"\"\"\n\n s = pd.DataFrame(\n data=np.abs(\n (values[self.key].to_numpy()[:, np.newaxis] - np.array(self.values))\n ),\n columns=self.values,\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.get_bounds","title":"get_bounds(self, transform_type=None, values=None, reference_value=None)
","text":"Returns the bounds of an input feature depending on the requested transform type.
Parameters:
Name Type Description Defaulttransform_type
Optional[TTransform]
The requested transform type. Defaults to None.
None
values
Optional[pd.Series]
If values are provided the bounds are returned taking the most extreme values for the feature into account. Defaults to None.
None
reference_value
Optional[float]
If a reference value is provided, then the local bounds based on a local search region are provided. Currently only supported for continuous inputs. For more details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
List of lower bound values, list of upper bound values.
Source code inbofire/data_models/features/discrete.py
def get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[float] = None,\n) -> Tuple[List[float], List[float]]:\n assert transform_type is None\n if values is None:\n return [self.lower_bound], [self.upper_bound] # type: ignore\n lower = min(self.lower_bound, values.min()) # type: ignore\n upper = max(self.upper_bound, values.max()) # type: ignore\n return [lower], [upper] # type: ignore\n
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.sample","title":"sample(self, n, seed=None)
","text":"Draw random samples from the feature.
Parameters:
Name Type Description Defaultn
int
number of samples.
requiredReturns:
Type Descriptionpd.Series
drawn samples.
Source code inbofire/data_models/features/discrete.py
def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key, data=np.random.default_rng(seed=seed).choice(self.values, n)\n )\n
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.validate_candidental","title":"validate_candidental(self, values)
","text":"Method to validate the provided candidates.
Parameters:
Name Type Description Defaultvalues
pd.Series
suggested candidates for the feature
requiredExceptions:
Type DescriptionValueError
Raises error when one of the provided values is not contained in the list of allowed values.
Returns:
Type Descriptionpd.Series
suggested candidates for the feature
Source code inbofire/data_models/features/discrete.py
def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the provided candidates.\n\n Args:\n values (pd.Series): suggested candidates for the feature\n\n Raises:\n ValueError: Raises error when one of the provided values is not contained in the list of allowed values.\n\n Returns:\n pd.Series: suggested candidates for the feature\n \"\"\"\n values = super().validate_candidental(values)\n if not np.isin(values.to_numpy(), np.array(self.values)).all():\n raise ValueError(\n f\"Not allowed values in candidates for feature {self.key}.\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.validate_values_unique","title":"validate_values_unique(values)
classmethod
","text":"Validates that provided values are unique.
Parameters:
Name Type Description Defaultvalues
List[float]
List of values
requiredExceptions:
Type DescriptionValueError
when values are non-unique.
ValueError
when values contains only one entry.
ValueError
when values is empty.
Returns:
Type DescriptionList[float]
Sorted list of values
Source code inbofire/data_models/features/discrete.py
@field_validator(\"values\")\n@classmethod\ndef validate_values_unique(cls, values):\n \"\"\"Validates that provided values are unique.\n\n Args:\n values (List[float]): List of values\n\n Raises:\n ValueError: when values are non-unique.\n ValueError: when values contains only one entry.\n ValueError: when values is empty.\n\n Returns:\n List[values]: Sorted list of values\n \"\"\"\n if len(values) != len(set(values)):\n raise ValueError(\"Discrete values must be unique\")\n if len(values) == 1:\n raise ValueError(\n \"Fixed discrete inputs are not supported. Please use a fixed continuous input.\"\n )\n if len(values) == 0:\n raise ValueError(\"No values defined.\")\n return sorted(values)\n
"},{"location":"ref-features/#bofire.data_models.features.feature","title":"feature
","text":""},{"location":"ref-features/#bofire.data_models.features.feature.Feature","title":" Feature (BaseModel)
","text":"The base class for all features.
Source code inbofire/data_models/features/feature.py
class Feature(BaseModel):\n \"\"\"The base class for all features.\"\"\"\n\n type: str\n key: str\n order_id: ClassVar[int] = -1\n\n def __lt__(self, other) -> bool:\n \"\"\"\n Method to compare two models to get them in the desired order.\n Return True if other is larger than self, else False. (see FEATURE_ORDER)\n\n Args:\n other: The other class to compare to self\n\n Returns:\n bool: True if the other class is larger than self, else False\n \"\"\"\n order_self = self.order_id\n order_other = other.order_id\n if order_self == order_other:\n return self.key < other.key\n else:\n return order_self < order_other\n
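Example: because __lt__ compares order_id first and the key second, plain sorted() arranges features in the canonical order; continuing the hypothetical stages (order_id 3) and feat (order_id 6) from the sketches above:
print([f.key for f in sorted([feat, stages])]) # -> ['n_stages', 'solvent']\n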
"},{"location":"ref-features/#bofire.data_models.features.feature.Feature.__lt__","title":"__lt__(self, other)
special
","text":"Method to compare two models to get them in the desired order. Return True if other is larger than self, else False. (see FEATURE_ORDER)
Parameters:
Name Type Description Defaultother
The other class to compare to self
requiredReturns:
Type Descriptionbool
True if the other class is larger than self, else False
Source code inbofire/data_models/features/feature.py
def __lt__(self, other) -> bool:\n \"\"\"\n Method to compare two models to get them in the desired order.\n Return True if other is larger than self, else False. (see FEATURE_ORDER)\n\n Args:\n other: The other class to compare to self\n\n Returns:\n bool: True if the other class is larger than self, else False\n \"\"\"\n order_self = self.order_id\n order_other = other.order_id\n if order_self == order_other:\n return self.key < other.key\n else:\n return order_self < order_other\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input","title":" Input (Feature)
","text":"Base class for all input features.
Source code inbofire/data_models/features/feature.py
class Input(Feature):\n \"\"\"Base class for all input features.\"\"\"\n\n @staticmethod\n @abstractmethod\n def valid_transform_types() -> List[Union[CategoricalEncodingEnum, AnyMolFeatures]]:\n pass\n\n @abstractmethod\n def is_fixed(self) -> bool:\n \"\"\"Indicates if a variable is set to a fixed value.\n\n Returns:\n bool: True if fixed, else False.\n \"\"\"\n pass\n\n @abstractmethod\n def fixed_value(\n self, transform_type: Optional[TTransform] = None\n ) -> Union[None, List[str], List[float]]:\n \"\"\"Method to return the fixed value in case of a fixed feature.\n\n Returns:\n Union[None, List[str], List[float]]: None in case the feature is not fixed, else the fixed value.\n \"\"\"\n pass\n\n @abstractmethod\n def validate_experimental(\n self, values: pd.Series, strict: bool = False\n ) -> pd.Series:\n \"\"\"Abstract method to validate the experimental data\n\n Args:\n values (pd.Series): A series with experiments\n strict (bool, optional): Boolean to distinguish whether the occurrence of fixed features in the dataset should be considered or not. Defaults to False.\n\n Returns:\n pd.Series: The passed series with experiments\n \"\"\"\n pass\n\n @abstractmethod\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Abstract method to validate the suggested candidates\n\n Args:\n values (pd.Series): A series with candidates\n\n Returns:\n pd.Series: The passed series with candidates\n \"\"\"\n pass\n\n @abstractmethod\n def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Sample a series of allowed values.\n\n Args:\n n (int): Number of samples\n\n Returns:\n pd.Series: Sampled values.\n \"\"\"\n pass\n\n @abstractmethod\n def get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[Union[float, str]] = None,\n ) -> Tuple[List[float], List[float]]:\n \"\"\"Returns the bounds of an input feature depending on the requested transform type.\n\n Args:\n transform_type (Optional[TTransform], optional): The requested transform type. Defaults to None.\n values (Optional[pd.Series], optional): If values are provided the bounds are returned taking\n the most extreme values for the feature into account. Defaults to None.\n reference_value (Optional[float], optional): If a reference value is provided, then the local bounds based\n on a local search region are provided. Currently only supported for continuous inputs. For more\n details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf.\n Returns:\n Tuple[List[float], List[float]]: List of lower bound values, list of upper bound values.\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.fixed_value","title":"fixed_value(self, transform_type=None)
","text":"Method to return the fixed value in case of a fixed feature.
Returns:
Type DescriptionUnion[None, List[str], List[float]]
None in case the feature is not fixed, else the fixed value.
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef fixed_value(\n self, transform_type: Optional[TTransform] = None\n) -> Union[None, List[str], List[float]]:\n \"\"\"Method to return the fixed value in case of a fixed feature.\n\n Returns:\n Union[None, List[str], List[float]]: None in case the feature is not fixed, else the fixed value.\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.get_bounds","title":"get_bounds(self, transform_type=None, values=None, reference_value=None)
","text":"Returns the bounds of an input feature depending on the requested transform type.
Parameters:
Name Type Description Defaulttransform_type
Optional[TTransform]
The requested transform type. Defaults to None.
None
values
Optional[pd.Series]
If values are provided the bounds are returned taking the most extreme values for the feature into account. Defaults to None.
None
reference_value
Optional[float]
If a reference value is provided, then the local bounds based on a local search region are provided. Currently only supported for continuous inputs. For more details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
List of lower bound values, list of upper bound values.
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[Union[float, str]] = None,\n) -> Tuple[List[float], List[float]]:\n \"\"\"Returns the bounds of an input feature depending on the requested transform type.\n\n Args:\n transform_type (Optional[TTransform], optional): The requested transform type. Defaults to None.\n values (Optional[pd.Series], optional): If values are provided the bounds are returned taking\n the most extreme values for the feature into account. Defaults to None.\n reference_value (Optional[float], optional): If a reference value is provided, then the local bounds based\n on a local search region are provided. Currently only supported for continuous inputs. For more\n details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf.\n Returns:\n Tuple[List[float], List[float]]: List of lower bound values, list of upper bound values.\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.is_fixed","title":"is_fixed(self)
","text":"Indicates if a variable is set to a fixed value.
Returns:
Type Descriptionbool
True if fixed, else False.
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef is_fixed(self) -> bool:\n \"\"\"Indicates if a variable is set to a fixed value.\n\n Returns:\n bool: True if fixed, else False.\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.sample","title":"sample(self, n, seed=None)
","text":"Sample a series of allowed values.
Parameters:
Name Type Description Defaultn
int
Number of samples
requiredReturns:
Type Descriptionpd.Series
Sampled values.
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Sample a series of allowed values.\n\n Args:\n n (int): Number of samples\n\n Returns:\n pd.Series: Sampled values.\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.validate_candidental","title":"validate_candidental(self, values)
","text":"Abstract method to validate the suggested candidates
Parameters:
Name Type Description Defaultvalues
pd.Series
A series with candidates
requiredReturns:
Type Descriptionpd.Series
The passed series with candidates
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Abstract method to validate the suggested candidates\n\n Args:\n values (pd.Series): A series with candidates\n\n Returns:\n pd.Series: The passed series with candidates\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.validate_experimental","title":"validate_experimental(self, values, strict=False)
","text":"Abstract method to validate the experimental dataFrame
Parameters:
Name Type Description Defaultvalues
pd.Series
A series with experiments
requiredstrict
bool
Boolean to distinguish whether the occurrence of fixed features in the dataset should be considered or not. Defaults to False.
False
Returns:
Type Descriptionpd.Series
The passed series with experiments
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef validate_experimental(\n self, values: pd.Series, strict: bool = False\n) -> pd.Series:\n \"\"\"Abstract method to validate the experimental data\n\n Args:\n values (pd.Series): A series with experiments\n strict (bool, optional): Boolean to distinguish whether the occurrence of fixed features in the dataset should be considered or not. Defaults to False.\n\n Returns:\n pd.Series: The passed series with experiments\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Output","title":" Output (Feature)
","text":"Base class for all output features.
Attributes:
Name Type Descriptionkey(str)
Key of the Feature.
Source code inbofire/data_models/features/feature.py
class Output(Feature):\n \"\"\"Base class for all output features.\n\n Attributes:\n key(str): Key of the Feature.\n \"\"\"\n\n @abstractmethod\n def __call__(self, values: pd.Series) -> pd.Series:\n pass\n\n @abstractmethod\n def validate_experimental(self, values: pd.Series) -> pd.Series:\n \"\"\"Abstract method to validate the experimental Series\n\n Args:\n values (pd.Series): A series with values for the outcome\n\n Returns:\n pd.Series: The passed series with experiments\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Output.validate_experimental","title":"validate_experimental(self, values)
","text":"Abstract method to validate the experimental Series
Parameters:
Name Type Description Defaultvalues
pd.Series
A series with values for the outcome
requiredReturns:
Type Descriptionpd.Series
The passed series with experiments
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef validate_experimental(self, values: pd.Series) -> pd.Series:\n \"\"\"Abstract method to validate the experimental Series\n\n Args:\n values (pd.Series): A series with values for the outcome\n\n Returns:\n pd.Series: The passed series with experiments\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.get_encoded_name","title":"get_encoded_name(feature_key, option_name)
","text":"Get the name of the encoded column. Option could be the category or the descriptor name.
Source code inbofire/data_models/features/feature.py
def get_encoded_name(feature_key: str, option_name: str) -> str:\n \"\"\"Get the name of the encoded column. Option could be the category or the descriptor name.\"\"\"\n return f\"{feature_key}_{option_name}\"\n
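Example: the encoded column name is simply the feature key and the option name joined by an underscore:
from bofire.data_models.features.feature import get_encoded_name\n\nget_encoded_name(\"solvent\", \"polarity\") # -> \"solvent_polarity\"\n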
"},{"location":"ref-features/#bofire.data_models.features.molecular","title":"molecular
","text":""},{"location":"ref-features/#bofire.data_models.features.molecular.CategoricalMolecularInput","title":" CategoricalMolecularInput (CategoricalInput, MolecularInput)
","text":"Source code in bofire/data_models/features/molecular.py
class CategoricalMolecularInput(CategoricalInput, MolecularInput):\n type: Literal[\"CategoricalMolecularInput\"] = \"CategoricalMolecularInput\"\n # order_id: ClassVar[int] = 7\n order_id: ClassVar[int] = 5\n\n @field_validator(\"categories\")\n @classmethod\n def validate_smiles(cls, categories: Sequence[str]):\n \"\"\"Validates that the categories are valid SMILES. Note that this check can only\n be executed when rdkit is available.\n\n Args:\n categories (List[str]): List of SMILES\n\n Raises:\n ValueError: when a string is not a valid SMILES\n\n Returns:\n List[str]: List of the SMILES\n \"\"\"\n # check on rdkit availability:\n try:\n smiles2mol(categories[0])\n except NameError:\n warnings.warn(\"rdkit not installed, categories cannot be validated.\")\n return categories\n\n for cat in categories:\n smiles2mol(cat)\n return categories\n\n @staticmethod\n def valid_transform_types() -> List[Union[AnyMolFeatures, CategoricalEncodingEnum]]:\n return CategoricalInput.valid_transform_types() + [\n Fingerprints,\n FingerprintsFragments,\n Fragments,\n MordredDescriptors, # type: ignore\n ]\n\n def get_bounds(\n self,\n transform_type: Union[CategoricalEncodingEnum, AnyMolFeatures],\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n ) -> Tuple[List[float], List[float]]:\n if isinstance(transform_type, CategoricalEncodingEnum):\n # we are just using the standard categorical transformations\n return super().get_bounds(\n transform_type=transform_type,\n values=values,\n reference_value=reference_value,\n )\n else:\n # in case that values is None, we return the optimization bounds\n # else we return the complete bounds\n data = self.to_descriptor_encoding(\n transform_type=transform_type,\n values=(\n pd.Series(self.get_allowed_categories())\n if values is None\n else pd.Series(self.categories)\n ),\n )\n lower = data.min(axis=0).values.tolist()\n upper = data.max(axis=0).values.tolist()\n return lower, upper\n\n def from_descriptor_encoding(\n self, transform_type: AnyMolFeatures, values: pd.DataFrame\n ) -> pd.Series:\n \"\"\"Converts values back from descriptor encoding.\n\n Args:\n values (pd.DataFrame): Descriptor encoded dataframe.\n\n Raises:\n ValueError: If descriptor columns not found in the dataframe.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n\n # This method is modified based on the categorical descriptor feature\n # TODO: move it to more central place\n cat_cols = [\n get_encoded_name(self.key, d) for d in transform_type.get_descriptor_names()\n ]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = pd.DataFrame(\n data=np.sqrt(\n np.sum(\n (\n values[cat_cols].to_numpy()[:, np.newaxis, :]\n - self.to_descriptor_encoding(\n transform_type=transform_type,\n values=pd.Series(self.get_allowed_categories()),\n ).to_numpy()\n )\n ** 2,\n axis=2,\n )\n ),\n columns=self.get_allowed_categories(),\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n
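Example: a minimal sketch of a categorical feature over molecules, with SMILES strings as categories (rdkit is required for the validation; without it only a warning is emitted):
from bofire.data_models.features.molecular import CategoricalMolecularInput\n\nsolvent = CategoricalMolecularInput(\n key=\"solvent\",\n categories=[\"O\", \"CCO\", \"CC(=O)O\"], # SMILES for water, ethanol, acetic acid\n)\n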
"},{"location":"ref-features/#bofire.data_models.features.molecular.CategoricalMolecularInput.from_descriptor_encoding","title":"from_descriptor_encoding(self, transform_type, values)
","text":"Converts values back from descriptor encoding.
Parameters:
Name Type Description Defaultvalues
pd.DataFrame
Descriptor encoded dataframe.
requiredExceptions:
Type DescriptionValueError
If descriptor columns not found in the dataframe.
Returns:
Type Descriptionpd.Series
Series with categorical values.
Source code inbofire/data_models/features/molecular.py
def from_descriptor_encoding(\n self, transform_type: AnyMolFeatures, values: pd.DataFrame\n) -> pd.Series:\n \"\"\"Converts values back from descriptor encoding.\n\n Args:\n values (pd.DataFrame): Descriptor encoded dataframe.\n\n Raises:\n ValueError: If descriptor columns not found in the dataframe.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n\n # This method is modified based on the categorical descriptor feature\n # TODO: move it to more central place\n cat_cols = [\n get_encoded_name(self.key, d) for d in transform_type.get_descriptor_names()\n ]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = pd.DataFrame(\n data=np.sqrt(\n np.sum(\n (\n values[cat_cols].to_numpy()[:, np.newaxis, :]\n - self.to_descriptor_encoding(\n transform_type=transform_type,\n values=pd.Series(self.get_allowed_categories()),\n ).to_numpy()\n )\n ** 2,\n axis=2,\n )\n ),\n columns=self.get_allowed_categories(),\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.CategoricalMolecularInput.get_bounds","title":"get_bounds(self, transform_type, values=None, reference_value=None)
","text":"Calculates the lower and upper bounds for the feature based on the given transform type and values.
Parameters:
Name Type Description Defaulttransform_type
AnyMolFeatures
The type of transformation to apply to the data.
requiredvalues
pd.Series
The actual data over which the lower and upper bounds are calculated.
None
reference_value
Optional[str]
The reference value for the transformation. Not used here. Defaults to None.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
A tuple containing the lower and upper bounds of the transformed data.
Exceptions:
Type DescriptionNotImplementedError
Raised when values
is None, as it is currently required for MolecularInput
.
bofire/data_models/features/molecular.py
def get_bounds(\n self,\n transform_type: Union[CategoricalEncodingEnum, AnyMolFeatures],\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n) -> Tuple[List[float], List[float]]:\n if isinstance(transform_type, CategoricalEncodingEnum):\n # we are just using the standard categorical transformations\n return super().get_bounds(\n transform_type=transform_type,\n values=values,\n reference_value=reference_value,\n )\n else:\n # in case that values is None, we return the optimization bounds\n # else we return the complete bounds\n data = self.to_descriptor_encoding(\n transform_type=transform_type,\n values=(\n pd.Series(self.get_allowed_categories())\n if values is None\n else pd.Series(self.categories)\n ),\n )\n lower = data.min(axis=0).values.tolist()\n upper = data.max(axis=0).values.tolist()\n return lower, upper\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.CategoricalMolecularInput.validate_smiles","title":"validate_smiles(categories)
classmethod
","text":"validates that categories are valid smiles. Note that this check can only be executed when rdkit is available.
Parameters:
Name Type Description Defaultcategories
List[str]
List of smiles
requiredExceptions:
Type DescriptionValueError
when string is not a smiles
Returns:
Type DescriptionList[str]
List of the smiles
Source code inbofire/data_models/features/molecular.py
@field_validator(\"categories\")\n@classmethod\ndef validate_smiles(cls, categories: Sequence[str]):\n \"\"\"validates that categories are valid smiles. Note that this check can only\n be executed when rdkit is available.\n\n Args:\n categories (List[str]): List of smiles\n\n Raises:\n ValueError: when string is not a smiles\n\n Returns:\n List[str]: List of the smiles\n \"\"\"\n # check on rdkit availability:\n try:\n smiles2mol(categories[0])\n except NameError:\n warnings.warn(\"rdkit not installed, categories cannot be validated.\")\n return categories\n\n for cat in categories:\n smiles2mol(cat)\n return categories\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput","title":" MolecularInput (Input)
","text":"Source code in bofire/data_models/features/molecular.py
class MolecularInput(Input):\n type: Literal[\"MolecularInput\"] = \"MolecularInput\"\n # order_id: ClassVar[int] = 6\n order_id: ClassVar[int] = 4\n\n @staticmethod\n def valid_transform_types() -> List[AnyMolFeatures]:\n return [Fingerprints, FingerprintsFragments, Fragments, MordredDescriptors] # type: ignore\n\n def validate_experimental(\n self, values: pd.Series, strict: bool = False\n ) -> pd.Series:\n values = values.map(str)\n for smi in values:\n smiles2mol(smi)\n\n return values\n\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n values = values.map(str)\n for smi in values:\n smiles2mol(smi)\n return values\n\n def is_fixed(self) -> bool:\n return False\n\n def fixed_value(self, transform_type: Optional[AnyMolFeatures] = None) -> None:\n return None\n\n def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n raise ValueError(\"Sampling not supported for `MolecularInput`\")\n\n def get_bounds(\n self,\n transform_type: AnyMolFeatures,\n values: pd.Series,\n reference_value: Optional[str] = None,\n ) -> Tuple[List[float], List[float]]:\n \"\"\"\n Calculates the lower and upper bounds for the feature based on the given transform type and values.\n\n Args:\n transform_type (AnyMolFeatures): The type of transformation to apply to the data.\n values (pd.Series): The actual data over which the lower and upper bounds are calculated.\n reference_value (Optional[str], optional): The reference value for the transformation. Not used here.\n Defaults to None.\n\n Returns:\n Tuple[List[float], List[float]]: A tuple containing the lower and upper bounds of the transformed data.\n\n Raises:\n NotImplementedError: Raised when `values` is None, as it is currently required for `MolecularInput`.\n \"\"\"\n if values is None:\n raise NotImplementedError(\n \"`values` is currently required for `MolecularInput`\"\n )\n else:\n data = self.to_descriptor_encoding(transform_type, values)\n\n lower = data.min(axis=0).values.tolist()\n upper = data.max(axis=0).values.tolist()\n\n return lower, upper\n\n def to_descriptor_encoding(\n self, transform_type: AnyMolFeatures, values: pd.Series\n ) -> pd.DataFrame:\n \"\"\"Converts values to descriptor encoding.\n\n Args:\n values (pd.Series): Values to transform.\n\n Returns:\n pd.DataFrame: Descriptor encoded dataframe.\n \"\"\"\n descriptor_values = transform_type.get_descriptor_values(values)\n\n descriptor_values.columns = [\n get_encoded_name(self.key, d) for d in transform_type.get_descriptor_names()\n ]\n descriptor_values.index = values.index\n\n return descriptor_values\n
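Example: a minimal sketch of encoding free-form molecules with a fingerprint transform (requires rdkit; the import path and default parameters of Fingerprints are assumptions):
import pandas as pd\nfrom bofire.data_models.features.molecular import MolecularInput\nfrom bofire.data_models.molfeatures.api import Fingerprints # import path assumed\n\nmol = MolecularInput(key=\"molecule\")\nsmiles = pd.Series([\"CC(=O)O\", \"c1ccccc1\"]) # acetic acid, benzene\nenc = mol.to_descriptor_encoding(Fingerprints(), smiles)\nlower, upper = mol.get_bounds(Fingerprints(), smiles) # bounds require values here\n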
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.fixed_value","title":"fixed_value(self, transform_type=None)
","text":"Method to return the fixed value in case of a fixed feature.
Returns:
Type DescriptionUnion[None, List[str], List[float]]
None in case the feature is not fixed, else the fixed value.
Source code inbofire/data_models/features/molecular.py
def fixed_value(self, transform_type: Optional[AnyMolFeatures] = None) -> None:\n return None\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.get_bounds","title":"get_bounds(self, transform_type, values, reference_value=None)
","text":"Calculates the lower and upper bounds for the feature based on the given transform type and values.
Parameters:
Name Type Description Defaulttransform_type
AnyMolFeatures
The type of transformation to apply to the data.
requiredvalues
pd.Series
The actual data over which the lower and upper bounds are calculated.
requiredreference_value
Optional[str]
The reference value for the transformation. Not used here. Defaults to None.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
A tuple containing the lower and upper bounds of the transformed data.
Exceptions:
Type DescriptionNotImplementedError
Raised when values
is None, as it is currently required for MolecularInput
.
bofire/data_models/features/molecular.py
def get_bounds(\n self,\n transform_type: AnyMolFeatures,\n values: pd.Series,\n reference_value: Optional[str] = None,\n) -> Tuple[List[float], List[float]]:\n \"\"\"\n Calculates the lower and upper bounds for the feature based on the given transform type and values.\n\n Args:\n transform_type (AnyMolFeatures): The type of transformation to apply to the data.\n values (pd.Series): The actual data over which the lower and upper bounds are calculated.\n reference_value (Optional[str], optional): The reference value for the transformation. Not used here.\n Defaults to None.\n\n Returns:\n Tuple[List[float], List[float]]: A tuple containing the lower and upper bounds of the transformed data.\n\n Raises:\n NotImplementedError: Raised when `values` is None, as it is currently required for `MolecularInput`.\n \"\"\"\n if values is None:\n raise NotImplementedError(\n \"`values` is currently required for `MolecularInput`\"\n )\n else:\n data = self.to_descriptor_encoding(transform_type, values)\n\n lower = data.min(axis=0).values.tolist()\n upper = data.max(axis=0).values.tolist()\n\n return lower, upper\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.is_fixed","title":"is_fixed(self)
","text":"Indicates if a variable is set to a fixed value.
Returns:
Type Descriptionbool
True if fixed, else False.
Source code inbofire/data_models/features/molecular.py
def is_fixed(self) -> bool:\n return False\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.sample","title":"sample(self, n, seed=None)
","text":"Sample a series of allowed values.
Parameters:
Name Type Description Defaultn
int
Number of samples
requiredReturns:
Type Descriptionpd.Series
Sampled values.
Source code inbofire/data_models/features/molecular.py
def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n raise ValueError(\"Sampling not supported for `MolecularInput`\")\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.to_descriptor_encoding","title":"to_descriptor_encoding(self, transform_type, values)
","text":"Converts values to descriptor encoding.
Parameters:
Name Type Description Defaultvalues
pd.Series
Values to transform.
requiredReturns:
Type Descriptionpd.DataFrame
Descriptor encoded dataframe.
Source code inbofire/data_models/features/molecular.py
def to_descriptor_encoding(\n self, transform_type: AnyMolFeatures, values: pd.Series\n) -> pd.DataFrame:\n \"\"\"Converts values to descriptor encoding.\n\n Args:\n values (pd.Series): Values to transform.\n\n Returns:\n pd.DataFrame: Descriptor encoded dataframe.\n \"\"\"\n descriptor_values = transform_type.get_descriptor_values(values)\n\n descriptor_values.columns = [\n get_encoded_name(self.key, d) for d in transform_type.get_descriptor_names()\n ]\n descriptor_values.index = values.index\n\n return descriptor_values\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.validate_candidental","title":"validate_candidental(self, values)
","text":"Abstract method to validate the suggested candidates
Parameters:
Name Type Description Defaultvalues
pd.Series
A series with candidates
requiredReturns:
Type Descriptionpd.Series
The passed series with candidates
Source code inbofire/data_models/features/molecular.py
def validate_candidental(self, values: pd.Series) -> pd.Series:\n values = values.map(str)\n for smi in values:\n smiles2mol(smi)\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.validate_experimental","title":"validate_experimental(self, values, strict=False)
","text":"Abstract method to validate the experimental dataFrame
Parameters:
Name Type Description Defaultvalues
pd.Series
A series with experiments
requiredstrict
bool
Boolean to distinguish whether the occurrence of fixed features in the dataset should be considered or not. Defaults to False.
False
Returns:
Type Descriptionpd.Series
The passed series with experiments
Source code inbofire/data_models/features/molecular.py
def validate_experimental(\n self, values: pd.Series, strict: bool = False\n) -> pd.Series:\n values = values.map(str)\n for smi in values:\n smiles2mol(smi)\n\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.numerical","title":"numerical
","text":""},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput","title":" NumericalInput (Input)
","text":"Abstract base class for all numerical (ordinal) input features.
Source code inbofire/data_models/features/numerical.py
class NumericalInput(Input):\n \"\"\"Abstract base class for all numerical (ordinal) input features.\"\"\"\n\n unit: Optional[str] = None\n\n @staticmethod\n def valid_transform_types() -> List:\n return []\n\n def to_unit_range(\n self, values: Union[pd.Series, np.ndarray], use_real_bounds: bool = False\n ) -> Union[pd.Series, np.ndarray]:\n \"\"\"Convert to the unit range between 0 and 1.\n\n Args:\n values (pd.Series): values to be transformed\n use_real_bounds (bool, optional): if True, use the bounds from the actual values else the bounds from the feature.\n Defaults to False.\n\n Raises:\n ValueError: if lower_bound == upper_bound, an error is raised\n\n Returns:\n pd.Series: transformed values.\n \"\"\"\n if use_real_bounds:\n lower, upper = self.get_bounds(transform_type=None, values=values)\n lower = lower[0]\n upper = upper[0]\n else:\n lower, upper = self.lower_bound, self.upper_bound # type: ignore\n if lower == upper:\n raise ValueError(\"Fixed feature cannot be transformed to unit range.\")\n valrange = upper - lower\n return (values - lower) / valrange\n\n def from_unit_range(\n self, values: Union[pd.Series, np.ndarray]\n ) -> Union[pd.Series, np.ndarray]:\n \"\"\"Convert from unit range.\n\n Args:\n values (pd.Series): values to transform from.\n\n Raises:\n ValueError: if the feature is fixed.\n\n Returns:\n pd.Series: values scaled back to the original range of the feature.\n \"\"\"\n if self.is_fixed():\n raise ValueError(\"Fixed feature cannot be transformed from unit range.\")\n valrange = self.upper_bound - self.lower_bound # type: ignore\n return (values * valrange) + self.lower_bound # type: ignore\n\n def is_fixed(self):\n \"\"\"Method to check if the feature is fixed\n\n Returns:\n Boolean: True when the feature is fixed, false otherwise.\n \"\"\"\n return self.lower_bound == self.upper_bound # type: ignore\n\n def fixed_value(\n self, transform_type: Optional[TTransform] = None\n ) -> Union[None, List[float]]:\n \"\"\"Method to get the value to which the feature is fixed\n\n Returns:\n Union[None, List[float]]: Return the feature value or None if the feature is not fixed.\n \"\"\"\n assert transform_type is None\n if self.is_fixed():\n return [self.lower_bound] # type: ignore\n else:\n return None\n\n def validate_experimental(self, values: pd.Series, strict=False) -> pd.Series:\n \"\"\"Method to validate the experimental data\n\n Args:\n values (pd.Series): A series with experiments\n strict (bool, optional): Boolean to distinguish whether the occurrence of fixed features in the dataset should be considered or not.\n Defaults to False.\n\n Raises:\n ValueError: when a value is not numerical\n ValueError: when there is no variation in a feature provided by the experimental data\n\n Returns:\n pd.Series: A series with experiments\n \"\"\"\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n values = values.astype(\"float64\")\n if strict:\n lower, upper = self.get_bounds(transform_type=None, values=values)\n if lower == upper:\n raise ValueError(\n f\"No variation present or planned for feature {self.key}. Remove it.\"\n )\n return values\n\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Validate the suggested candidates for the feature.\n\n Args:\n values (pd.Series): suggested candidates for the feature\n\n Raises:\n ValueError: Error is raised when one of the values is not numerical.\n\n Returns:\n pd.Series: the original provided candidates\n \"\"\"\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.fixed_value","title":"fixed_value(self, transform_type=None)
","text":"Method to get the value to which the feature is fixed
Returns:
Type DescriptionUnion[None, List[float]]
Return the feature value or None if the feature is not fixed.
Source code inbofire/data_models/features/numerical.py
def fixed_value(\n self, transform_type: Optional[TTransform] = None\n) -> Union[None, List[float]]:\n \"\"\"Method to get the value to which the feature is fixed\n\n Returns:\n Union[None, List[float]]: Return the feature value or None if the feature is not fixed.\n \"\"\"\n assert transform_type is None\n if self.is_fixed():\n return [self.lower_bound] # type: ignore\n else:\n return None\n
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.from_unit_range","title":"from_unit_range(self, values)
","text":"Convert from unit range.
Parameters:
Name Type Description Defaultvalues
pd.Series
values to transform from.
requiredExceptions:
Type DescriptionValueError
if the feature is fixed raise a value error.
Returns:
Type Descriptionpd.Series
the values transformed back to the original feature range.
Source code in bofire/data_models/features/numerical.py
def from_unit_range(\n self, values: Union[pd.Series, np.ndarray]\n) -> Union[pd.Series, np.ndarray]:\n \"\"\"Convert from unit range.\n\n Args:\n values (pd.Series): values to transform from.\n\n Raises:\n ValueError: if the feature is fixed raise a value error.\n\n Returns:\n pd.Series: _description_\n \"\"\"\n if self.is_fixed():\n raise ValueError(\"Fixed feature cannot be transformed from unit range.\")\n valrange = self.upper_bound - self.lower_bound # type: ignore\n return (values * valrange) + self.lower_bound # type: ignore\n
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.is_fixed","title":"is_fixed(self)
","text":"Method to check if the feature is fixed
Returns:
Type DescriptionBoolean
True when the feature is fixed, false otherwise.
Source code in bofire/data_models/features/numerical.py
def is_fixed(self):\n \"\"\"Method to check if the feature is fixed\n\n Returns:\n Boolean: True when the feature is fixed, false otherwise.\n \"\"\"\n return self.lower_bound == self.upper_bound # type: ignore\n
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.to_unit_range","title":"to_unit_range(self, values, use_real_bounds=False)
","text":"Convert to the unit range between 0 and 1.
Parameters:
Name Type Description Defaultvalues
pd.Series
values to be transformed
requireduse_real_bounds
bool
if True, use the bounds from the actual values else the bounds from the feature. Defaults to False.
False
Exceptions:
Type DescriptionValueError
If lower_bound == upper_bound, an error is raised
Returns:
Type Descriptionpd.Series
transformed values.
Source code in bofire/data_models/features/numerical.py
def to_unit_range(\n self, values: Union[pd.Series, np.ndarray], use_real_bounds: bool = False\n) -> Union[pd.Series, np.ndarray]:\n \"\"\"Convert to the unit range between 0 and 1.\n\n Args:\n values (pd.Series): values to be transformed\n use_real_bounds (bool, optional): if True, use the bounds from the actual values else the bounds from the feature.\n Defaults to False.\n\n Raises:\n ValueError: If lower_bound == upper bound an error is raised\n\n Returns:\n pd.Series: transformed values.\n \"\"\"\n if use_real_bounds:\n lower, upper = self.get_bounds(transform_type=None, values=values)\n lower = lower[0]\n upper = upper[0]\n else:\n lower, upper = self.lower_bound, self.upper_bound # type: ignore\n if lower == upper:\n raise ValueError(\"Fixed feature cannot be transformed to unit range.\")\n valrange = upper - lower\n return (values - lower) / valrange\n
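Example: a minimal round-trip sketch, assuming a ContinuousInput feature (a NumericalInput subclass) re-exported via bofire.data_models.features.api; the key and bounds are made up:
import pandas as pd
from bofire.data_models.features.api import ContinuousInput

# hypothetical feature bounded to [0, 10]
x = ContinuousInput(key='x', bounds=(0, 10))
values = pd.Series([0.0, 2.5, 10.0])
scaled = x.to_unit_range(values)      # 0.0, 0.25, 1.0
restored = x.from_unit_range(scaled)  # back to 0.0, 2.5, 10.0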
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.validate_candidental","title":"validate_candidental(self, values)
","text":"Validate the suggested candidates for the feature.
Parameters:
Name Type Description Defaultvalues
pd.Series
suggested candidates for the feature
requiredExceptions:
Type DescriptionValueError
Error is raised when one of the values is not numerical.
Returns:
Type Descriptionpd.Series
the original provided candidates
Source code in bofire/data_models/features/numerical.py
def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Validate the suggested candidates for the feature.\n\n Args:\n values (pd.Series): suggested candidates for the feature\n\n Raises:\n ValueError: Error is raised when one of the values is not numerical.\n\n Returns:\n pd.Series: the original provided candidates\n \"\"\"\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.validate_experimental","title":"validate_experimental(self, values, strict=False)
","text":"Method to validate the experimental dataFrame
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with experiments
requiredstrict
bool
Boolean to distinguish whether the occurrence of fixed features in the dataset should be considered. Defaults to False.
False
Exceptions:
Type DescriptionValueError
when a value is not numerical
ValueError
when there is no variation in a feature provided by the experimental data
Returns:
Type Descriptionpd.Series
A dataFrame with experiments
Source code in bofire/data_models/features/numerical.py
def validate_experimental(self, values: pd.Series, strict=False) -> pd.Series:\n \"\"\"Method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict (bool, optional): Boolean to distinguish if the occurence of fixed features in the dataset should be considered or not.\n Defaults to False.\n\n Raises:\n ValueError: when a value is not numerical\n ValueError: when there is no variation in a feature provided by the experimental data\n\n Returns:\n pd.Series: A dataFrame with experiments\n \"\"\"\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n values = values.astype(\"float64\")\n if strict:\n lower, upper = self.get_bounds(transform_type=None, values=values)\n if lower == upper:\n raise ValueError(\n f\"No variation present or planned for feature {self.key}. Remove it.\"\n )\n return values\n
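Example: a sketch of the validation behavior with the hypothetical feature from above; non-numeric values raise:
import pandas as pd
from bofire.data_models.features.api import ContinuousInput

x = ContinuousInput(key='x', bounds=(0, 10))
x.validate_experimental(pd.Series([1, 2.5, 7]))  # passes, values coerced to float64
try:
    x.validate_experimental(pd.Series(['a', 'b']))
except ValueError as err:
    print(err)  # not all values of input feature `x` are numerical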
"},{"location":"ref-objectives/","title":"Domain","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.categorical","title":"categorical
","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective","title":" ConstrainedCategoricalObjective (ConstrainedObjective, Objective)
","text":"Compute the categorical objective value as:
Po where P is an [n, c] matrix where each row is a probability vector\n(P[i, :].sum()=1 for all i) and o is a vector of size [c] of objective values\n
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
desirability
list
list of values of size c (c is number of categories) such that the i-th entry is in {True, False}
Source code in bofire/data_models/objectives/categorical.py
class ConstrainedCategoricalObjective(ConstrainedObjective, Objective):\n \"\"\"Compute the categorical objective value as:\n\n Po where P is an [n, c] matrix where each row is a probability vector\n (P[i, :].sum()=1 for all i) and o is a vector of size [c] of objective values\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n desirability (list): list of values of size c (c is number of categories) such that the i-th entry is in {True, False}\n \"\"\"\n\n w: TWeight = 1.0\n categories: CategoryVals\n desirability: List[bool]\n type: Literal[\"ConstrainedCategoricalObjective\"] = \"ConstrainedCategoricalObjective\"\n\n @model_validator(mode=\"after\")\n def validate_desireability(self):\n \"\"\"validates that categories have unique names\n\n Args:\n categories (List[str]): List or tuple of category names\n\n Raises:\n ValueError: when categories do not match objective categories\n\n Returns:\n Tuple[str]: Tuple of the categories\n \"\"\"\n if len(self.desirability) != len(self.categories):\n raise ValueError(\n \"number of categories differs from number of desirabilities\"\n )\n return self\n\n def to_dict(self) -> Dict:\n \"\"\"Returns the categories and corresponding objective values as dictionary\"\"\"\n return dict(zip(self.categories, self.desirability))\n\n def to_dict_label(self) -> Dict:\n \"\"\"Returns the catergories and label location of categories\"\"\"\n return {c: i for i, c in enumerate(self.categories)}\n\n def from_dict_label(self) -> Dict:\n \"\"\"Returns the label location and the categories\"\"\"\n d = self.to_dict_label()\n return dict(zip(d.values(), d.keys()))\n\n def __call__(\n self, x: Union[pd.Series, np.ndarray]\n ) -> Union[pd.Series, np.ndarray, float]:\n \"\"\"The call function returning a probabilistic reward for x.\n\n Args:\n x (np.ndarray): A matrix of x values\n\n Returns:\n np.ndarray: A reward calculated as inner product of probabilities and feasible objectives.\n \"\"\"\n return np.dot(x, np.array(self.desirability))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a probabilistic reward for x.
Parameters:
Name Type Description Defaultx
np.ndarray
A matrix of x values
requiredReturns:
Type Descriptionnp.ndarray
A reward calculated as inner product of probabilities and feasible objectives.
Source code in bofire/data_models/objectives/categorical.py
def __call__(\n self, x: Union[pd.Series, np.ndarray]\n) -> Union[pd.Series, np.ndarray, float]:\n \"\"\"The call function returning a probabilistic reward for x.\n\n Args:\n x (np.ndarray): A matrix of x values\n\n Returns:\n np.ndarray: A reward calculated as inner product of probabilities and feasible objectives.\n \"\"\"\n return np.dot(x, np.array(self.desirability))\n
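Example (a sketch; the category names, desirabilities, and probabilities are made up):
import numpy as np
from bofire.data_models.objectives.api import ConstrainedCategoricalObjective

obj = ConstrainedCategoricalObjective(
    categories=['acceptable', 'unacceptable'],
    desirability=[True, False],
)
P = np.array([[0.9, 0.1], [0.2, 0.8]])  # each row is a probability vector
print(obj(P))  # [0.9 0.2], the probability mass on the desirable category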
"},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective.from_dict_label","title":"from_dict_label(self)
","text":"Returns the label location and the categories
Source code in bofire/data_models/objectives/categorical.py
def from_dict_label(self) -> Dict:\n \"\"\"Returns the label location and the categories\"\"\"\n d = self.to_dict_label()\n return dict(zip(d.values(), d.keys()))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective.to_dict","title":"to_dict(self)
","text":"Returns the categories and corresponding objective values as dictionary
Source code in bofire/data_models/objectives/categorical.py
def to_dict(self) -> Dict:\n \"\"\"Returns the categories and corresponding objective values as dictionary\"\"\"\n return dict(zip(self.categories, self.desirability))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective.to_dict_label","title":"to_dict_label(self)
","text":"Returns the catergories and label location of categories
Source code in bofire/data_models/objectives/categorical.py
def to_dict_label(self) -> Dict:\n \"\"\"Returns the catergories and label location of categories\"\"\"\n return {c: i for i, c in enumerate(self.categories)}\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective.validate_desireability","title":"validate_desireability(self)
","text":"validates that categories have unique names
Parameters:
Name Type Description Defaultcategories
List[str]
List or tuple of category names
requiredExceptions:
Type DescriptionValueError
when the number of desirabilities differs from the number of categories
Returns:
Type DescriptionConstrainedCategoricalObjective
The validated objective instance
Source code in bofire/data_models/objectives/categorical.py
@model_validator(mode=\"after\")\ndef validate_desireability(self):\n \"\"\"validates that categories have unique names\n\n Args:\n categories (List[str]): List or tuple of category names\n\n Raises:\n ValueError: when categories do not match objective categories\n\n Returns:\n Tuple[str]: Tuple of the categories\n \"\"\"\n if len(self.desirability) != len(self.categories):\n raise ValueError(\n \"number of categories differs from number of desirabilities\"\n )\n return self\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity","title":"identity
","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.identity.IdentityObjective","title":" IdentityObjective (Objective)
","text":"An objective returning the identity as reward. The return can be scaled, when a lower and upper bound are provided.
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective
bounds
Tuple[float]
Bound for normalizing the objective between zero and one. Defaults to (0,1).
Source code in bofire/data_models/objectives/identity.py
class IdentityObjective(Objective):\n \"\"\"An objective returning the identity as reward.\n The return can be scaled, when a lower and upper bound are provided.\n\n Attributes:\n w (float): float between zero and one for weighting the objective\n bounds (Tuple[float], optional): Bound for normalizing the objective between zero and one. Defaults to (0,1).\n \"\"\"\n\n type: Literal[\"IdentityObjective\"] = \"IdentityObjective\"\n w: TWeight = 1\n bounds: Tuple[float, float] = (0, 1)\n\n @property\n def lower_bound(self) -> float:\n return self.bounds[0]\n\n @property\n def upper_bound(self) -> float:\n return self.bounds[1]\n\n @field_validator(\"bounds\")\n @classmethod\n def validate_lower_upper(cls, bounds):\n \"\"\"Validation function to ensure that lower bound is always greater the upper bound\n\n Args:\n values (Dict): The attributes of the class\n\n Raises:\n ValueError: when a lower bound higher than the upper bound is passed\n\n Returns:\n Dict: The attributes of the class\n \"\"\"\n if bounds[0] > bounds[1]:\n raise ValueError(\n f\"lower bound must be <= upper bound, got {bounds[0]} > {bounds[1]}\"\n )\n return bounds\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The identity as reward, might be normalized to the passed lower and upper bounds\n \"\"\"\n return (x - self.lower_bound) / (self.upper_bound - self.lower_bound)\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity.IdentityObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a reward for passed x values
Parameters:
Name Type Description Defaultx
np.ndarray
An array of x values
requiredReturns:
Type Descriptionnp.ndarray
The identity as reward, possibly normalized to the passed lower and upper bounds
Source code in bofire/data_models/objectives/identity.py
def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The identity as reward, might be normalized to the passed lower and upper bounds\n \"\"\"\n return (x - self.lower_bound) / (self.upper_bound - self.lower_bound)\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity.IdentityObjective.validate_lower_upper","title":"validate_lower_upper(bounds)
classmethod
","text":"Validation function to ensure that lower bound is always greater the upper bound
Parameters:
Name Type Description Defaultvalues
Dict
The attributes of the class
requiredExceptions:
Type DescriptionValueError
when a lower bound higher than the upper bound is passed
Returns:
Type DescriptionDict
The attributes of the class
Source code in bofire/data_models/objectives/identity.py
@field_validator(\"bounds\")\n@classmethod\ndef validate_lower_upper(cls, bounds):\n \"\"\"Validation function to ensure that lower bound is always greater the upper bound\n\n Args:\n values (Dict): The attributes of the class\n\n Raises:\n ValueError: when a lower bound higher than the upper bound is passed\n\n Returns:\n Dict: The attributes of the class\n \"\"\"\n if bounds[0] > bounds[1]:\n raise ValueError(\n f\"lower bound must be <= upper bound, got {bounds[0]} > {bounds[1]}\"\n )\n return bounds\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity.MaximizeObjective","title":" MaximizeObjective (IdentityObjective)
","text":"Child class from the identity function without modifications, since the parent class is already defined as maximization
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective
bounds
Tuple[float]
Bound for normalizing the objective between zero and one. Defaults to (0,1).
Source code in bofire/data_models/objectives/identity.py
class MaximizeObjective(IdentityObjective):\n \"\"\"Child class from the identity function without modifications, since the parent class is already defined as maximization\n\n Attributes:\n w (float): float between zero and one for weighting the objective\n bounds (Tuple[float], optional): Bound for normalizing the objective between zero and one. Defaults to (0,1).\n \"\"\"\n\n type: Literal[\"MaximizeObjective\"] = \"MaximizeObjective\"\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity.MinimizeObjective","title":" MinimizeObjective (IdentityObjective)
","text":"Class returning the negative identity as reward.
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective
bounds
Tuple[float]
Bound for normalizing the objective between zero and one. Defaults to (0,1).
Source code in bofire/data_models/objectives/identity.py
class MinimizeObjective(IdentityObjective):\n \"\"\"Class returning the negative identity as reward.\n\n Attributes:\n w (float): float between zero and one for weighting the objective\n bounds (Tuple[float], optional): Bound for normalizing the objective between zero and one. Defaults to (0,1).\n \"\"\"\n\n type: Literal[\"MinimizeObjective\"] = \"MinimizeObjective\"\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The negative identity as reward, might be normalized to the passed lower and upper bounds\n \"\"\"\n return -1.0 * (x - self.lower_bound) / (self.upper_bound - self.lower_bound)\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity.MinimizeObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a reward for passed x values
Parameters:
Name Type Description Defaultx
np.ndarray
An array of x values
requiredReturns:
Type Descriptionnp.ndarray
The negative identity as reward, possibly normalized to the passed lower and upper bounds
Source code in bofire/data_models/objectives/identity.py
def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The negative identity as reward, might be normalized to the passed lower and upper bounds\n \"\"\"\n return -1.0 * (x - self.lower_bound) / (self.upper_bound - self.lower_bound)\n
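Example: a sketch contrasting the two identity objectives; the bounds are illustrative:
import numpy as np
from bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective

x = np.array([0.0, 50.0, 100.0])
print(MaximizeObjective(bounds=(0, 100))(x))  # [0.  0.5 1. ]
print(MinimizeObjective(bounds=(0, 100))(x))  # [-0.  -0.5 -1. ]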
"},{"location":"ref-objectives/#bofire.data_models.objectives.objective","title":"objective
","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.objective.ConstrainedObjective","title":" ConstrainedObjective
","text":"This abstract class offers a convenience routine for transforming sigmoid based objectives to botorch output constraints.
Source code in bofire/data_models/objectives/objective.py
class ConstrainedObjective:\n \"\"\"This abstract class offers a convenience routine for transforming sigmoid based objectives to botorch output constraints.\"\"\"\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.objective.Objective","title":" Objective (BaseModel)
","text":"The base class for all objectives
Source code in bofire/data_models/objectives/objective.py
class Objective(BaseModel):\n \"\"\"The base class for all objectives\"\"\"\n\n type: str\n\n @abstractmethod\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"Abstract method to define the call function for the class Objective\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The desirability of the passed x values\n \"\"\"\n pass\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.objective.Objective.__call__","title":"__call__(self, x)
special
","text":"Abstract method to define the call function for the class Objective
Parameters:
Name Type Description Defaultx
np.ndarray
An array of x values
requiredReturns:
Type Descriptionnp.ndarray
The desirability of the passed x values
Source code in bofire/data_models/objectives/objective.py
@abstractmethod\ndef __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"Abstract method to define the call function for the class Objective\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The desirability of the passed x values\n \"\"\"\n pass\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid","title":"sigmoid
","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid.MaximizeSigmoidObjective","title":" MaximizeSigmoidObjective (SigmoidObjective)
","text":"Class for a maximizing sigmoid objective
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
steepness
float
Steepness of the sigmoid function. Has to be greater than zero.
tp
float
Turning point of the sigmoid function.
Source code in bofire/data_models/objectives/sigmoid.py
class MaximizeSigmoidObjective(SigmoidObjective):\n \"\"\"Class for a maximizing sigmoid objective\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n steepness (float): Steepness of the sigmoid function. Has to be greater than zero.\n tp (float): Turning point of the sigmoid function.\n\n \"\"\"\n\n type: Literal[\"MaximizeSigmoidObjective\"] = \"MaximizeSigmoidObjective\"\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a sigmoid shaped reward for passed x values.\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: A reward calculated with a sigmoid function. The stepness and the tipping point can be modified via passed arguments.\n \"\"\"\n return 1 / (1 + np.exp(-1 * self.steepness * (x - self.tp)))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid.MaximizeSigmoidObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a sigmoid shaped reward for passed x values.
Parameters:
Name Type Description Defaultx
np.ndarray
An array of x values
requiredReturns:
Type Descriptionnp.ndarray
A reward calculated with a sigmoid function. The steepness and the turning point can be modified via passed arguments.
Source code in bofire/data_models/objectives/sigmoid.py
def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a sigmoid shaped reward for passed x values.\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: A reward calculated with a sigmoid function. The stepness and the tipping point can be modified via passed arguments.\n \"\"\"\n return 1 / (1 + np.exp(-1 * self.steepness * (x - self.tp)))\n
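Example (a sketch; steepness and turning point are illustrative):
import numpy as np
from bofire.data_models.objectives.api import MaximizeSigmoidObjective

obj = MaximizeSigmoidObjective(steepness=2.0, tp=5.0)
# reward switches from ~0 to ~1 around the turning point tp
print(obj(np.array([0.0, 5.0, 10.0])))  # approx. [4.5e-05, 0.5, 1.0]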
"},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid.MinimizeSigmoidObjective","title":" MinimizeSigmoidObjective (SigmoidObjective)
","text":"Class for a minimizing a sigmoid objective
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
steepness
float
Steepness of the sigmoid function. Has to be greater than zero.
tp
float
Turning point of the sigmoid function.
Source code in bofire/data_models/objectives/sigmoid.py
class MinimizeSigmoidObjective(SigmoidObjective):\n \"\"\"Class for a minimizing a sigmoid objective\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n steepness (float): Steepness of the sigmoid function. Has to be greater than zero.\n tp (float): Turning point of the sigmoid function.\n \"\"\"\n\n type: Literal[\"MinimizeSigmoidObjective\"] = \"MinimizeSigmoidObjective\"\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a sigmoid shaped reward for passed x values.\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: A reward calculated with a sigmoid function. The stepness and the tipping point can be modified via passed arguments.\n \"\"\"\n return 1 - 1 / (1 + np.exp(-1 * self.steepness * (x - self.tp)))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid.MinimizeSigmoidObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a sigmoid shaped reward for passed x values.
Parameters:
Name Type Description Defaultx
np.ndarray
An array of x values
requiredReturns:
Type Descriptionnp.ndarray
A reward calculated with a sigmoid function. The steepness and the turning point can be modified via passed arguments.
Source code in bofire/data_models/objectives/sigmoid.py
def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a sigmoid shaped reward for passed x values.\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: A reward calculated with a sigmoid function. The stepness and the tipping point can be modified via passed arguments.\n \"\"\"\n return 1 - 1 / (1 + np.exp(-1 * self.steepness * (x - self.tp)))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid.SigmoidObjective","title":" SigmoidObjective (Objective, ConstrainedObjective)
","text":"Base class for all sigmoid shaped objectives
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
steepness
float
Steepness of the sigmoid function. Has to be greater than zero.
tp
float
Turning point of the sigmoid function.
Source code in bofire/data_models/objectives/sigmoid.py
class SigmoidObjective(Objective, ConstrainedObjective):\n \"\"\"Base class for all sigmoid shaped objectives\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n steepness (float): Steepness of the sigmoid function. Has to be greater than zero.\n tp (float): Turning point of the sigmoid function.\n \"\"\"\n\n steepness: TGt0\n tp: float\n w: TWeight = 1\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.target","title":"target
","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.target.CloseToTargetObjective","title":" CloseToTargetObjective (Objective)
","text":"Optimize towards a target value. It can be used as objective in multiobjective scenarios.
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
target_value
float
target value that should be reached.
exponent
float
the exponent of the expression.
Source code in bofire/data_models/objectives/target.py
class CloseToTargetObjective(Objective):\n \"\"\"Optimize towards a target value. It can be used as objective\n in multiobjective scenarios.\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n target_value (float): target value that should be reached.\n exponent (float): the exponent of the expression.\n \"\"\"\n\n type: Literal[\"CloseToTargetObjective\"] = \"CloseToTargetObjective\"\n w: TWeight = 1\n target_value: float\n exponent: float\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n return -1 * (np.abs(x - self.target_value) ** self.exponent)\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.target.TargetObjective","title":" TargetObjective (Objective, ConstrainedObjective)
","text":"Class for objectives for optimizing towards a target value
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
target_value
float
target value that should be reached.
tolerance
float
Tolerance for reaching the target. Has to be greater than zero.
steepness
float
Steepness of the sigmoid function. Has to be greater than zero.
Source code in bofire/data_models/objectives/target.py
class TargetObjective(Objective, ConstrainedObjective):\n \"\"\"Class for objectives for optimizing towards a target value\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n target_value (float): target value that should be reached.\n tolerance (float): Tolerance for reaching the target. Has to be greater than zero.\n steepness (float): Steepness of the sigmoid function. Has to be greater than zero.\n\n \"\"\"\n\n type: Literal[\"TargetObjective\"] = \"TargetObjective\"\n w: TWeight = 1\n target_value: float\n tolerance: TGe0\n steepness: TGt0\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values.\n\n Args:\n x (np.array): An array of x values\n\n Returns:\n np.array: An array of reward values calculated by the product of two sigmoidal shaped functions resulting in a maximum at the target value.\n \"\"\"\n return (\n 1\n / (\n 1\n + np.exp(\n -1 * self.steepness * (x - (self.target_value - self.tolerance))\n )\n )\n * (\n 1\n - 1\n / (\n 1.0\n + np.exp(\n -1 * self.steepness * (x - (self.target_value + self.tolerance))\n )\n )\n )\n )\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.target.TargetObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a reward for passed x values.
Parameters:
Name Type Description Defaultx
np.array
An array of x values
requiredReturns:
Type Descriptionnp.array
An array of reward values calculated by the product of two sigmoidal shaped functions resulting in a maximum at the target value.
Source code in bofire/data_models/objectives/target.py
def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values.\n\n Args:\n x (np.array): An array of x values\n\n Returns:\n np.array: An array of reward values calculated by the product of two sigmoidal shaped functions resulting in a maximum at the target value.\n \"\"\"\n return (\n 1\n / (\n 1\n + np.exp(\n -1 * self.steepness * (x - (self.target_value - self.tolerance))\n )\n )\n * (\n 1\n - 1\n / (\n 1.0\n + np.exp(\n -1 * self.steepness * (x - (self.target_value + self.tolerance))\n )\n )\n )\n )\n
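Example (a sketch; the parameters are illustrative): the reward is close to one inside [target_value - tolerance, target_value + tolerance] and drops towards zero outside:
import numpy as np
from bofire.data_models.objectives.api import TargetObjective

obj = TargetObjective(target_value=5.0, tolerance=1.0, steepness=10.0)
print(obj(np.array([2.0, 5.0, 8.0])))  # approx. [0.0, 1.0, 0.0]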
"},{"location":"ref-utils/","title":"Utils","text":""},{"location":"ref-utils/#bofire.utils.cheminformatics","title":"cheminformatics
","text":""},{"location":"ref-utils/#bofire.utils.cheminformatics.smiles2fingerprints","title":"smiles2fingerprints(smiles, bond_radius=5, n_bits=2048)
","text":"Transforms a list of smiles to an array of morgan fingerprints.
Parameters:
Name Type Description Defaultsmiles
List[str]
List of smiles
requiredbond_radius
int
Bond radius to use. Defaults to 5.
5
n_bits
int
Number of bits. Defaults to 2048.
2048
Returns:
Type Descriptionnp.ndarray
Numpy array holding the fingerprints
Source code in bofire/utils/cheminformatics.py
def smiles2fingerprints(\n smiles: List[str], bond_radius: int = 5, n_bits: int = 2048\n) -> np.ndarray:\n \"\"\"Transforms a list of smiles to an array of morgan fingerprints.\n\n Args:\n smiles (List[str]): List of smiles\n bond_radius (int, optional): Bond radius to use. Defaults to 5.\n n_bits (int, optional): Number of bits. Defaults to 2048.\n\n Returns:\n np.ndarray: Numpy array holding the fingerprints\n \"\"\"\n rdkit_mols = [smiles2mol(m) for m in smiles]\n fps = [\n AllChem.GetMorganFingerprintAsBitVect( # type: ignore\n mol, radius=bond_radius, nBits=n_bits\n )\n for mol in rdkit_mols\n ]\n\n return np.asarray(fps)\n
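Example (a sketch; requires rdkit, and the SMILES strings are arbitrary):
from bofire.utils.cheminformatics import smiles2fingerprints

fps = smiles2fingerprints(['CCO', 'c1ccccc1'], bond_radius=2, n_bits=512)
print(fps.shape)  # (2, 512), one Morgan bit vector per molecule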
"},{"location":"ref-utils/#bofire.utils.cheminformatics.smiles2fragments","title":"smiles2fragments(smiles, fragments_list=None)
","text":"Transforms smiles to an array of fragments.
Parameters:
Name Type Description Defaultsmiles
List[str]
List of smiles
requiredReturns:
Type Descriptionnp.ndarray
Array holding the fragment information.
Source code in bofire/utils/cheminformatics.py
def smiles2fragments(\n smiles: List[str], fragments_list: Optional[List[str]] = None\n) -> np.ndarray:\n \"\"\"Transforms smiles to an array of fragments.\n\n Args:\n smiles (List[str]): List of smiles\n\n Returns:\n np.ndarray: Array holding the fragment information.\n \"\"\"\n rdkit_fragment_list = [\n item for item in Descriptors.descList if item[0].startswith(\"fr_\")\n ]\n if fragments_list is None:\n fragments = {d[0]: d[1] for d in rdkit_fragment_list}\n else:\n fragments = {d[0]: d[1] for d in rdkit_fragment_list if d[0] in fragments_list}\n\n frags = np.zeros((len(smiles), len(fragments)))\n for i, smi in enumerate(smiles):\n mol = smiles2mol(smi)\n features = [fragments[d](mol) for d in fragments]\n frags[i, :] = features\n\n return frags\n
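Example (a sketch; fr_Al_OH is one of rdkit's fr_* fragment descriptors and counts aliphatic hydroxyl groups):
from bofire.utils.cheminformatics import smiles2fragments

frags = smiles2fragments(['CCO'], fragments_list=['fr_Al_OH'])
print(frags)  # [[1.]], ethanol contains one aliphatic hydroxyl group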
"},{"location":"ref-utils/#bofire.utils.cheminformatics.smiles2mol","title":"smiles2mol(smiles)
","text":"Transforms a smiles string to an rdkit mol object.
Parameters:
Name Type Description Defaultsmiles
str
Smiles string.
requiredExceptions:
Type DescriptionValueError
If string is not a valid smiles.
Returns:
Type Descriptionrdkit.Mol
rdkit.mol object
Source code in bofire/utils/cheminformatics.py
def smiles2mol(smiles: str):\n \"\"\"Transforms a smiles string to an rdkit mol object.\n\n Args:\n smiles (str): Smiles string.\n\n Raises:\n ValueError: If string is not a valid smiles.\n\n Returns:\n rdkit.Mol: rdkit.mol object\n \"\"\"\n mol = MolFromSmiles(smiles)\n if mol is None:\n raise ValueError(f\"{smiles} is not a valid smiles string.\")\n return mol\n
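Example (a sketch; requires rdkit):
from bofire.utils.cheminformatics import smiles2mol

mol = smiles2mol('CCO')  # valid SMILES, returns an rdkit Mol
try:
    smiles2mol('not a valid smiles')
except ValueError as err:
    print(err)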
"},{"location":"ref-utils/#bofire.utils.cheminformatics.smiles2mordred","title":"smiles2mordred(smiles, descriptors_list)
","text":"Transforms list of smiles to mordred moelcular descriptors.
Parameters:
Name Type Description Defaultsmiles
List[str]
List of smiles
requireddescriptors_list
List[str]
List of desired Mordred descriptors
requiredReturns:
Type Descriptionnp.ndarray
Array holding the Mordred molecular descriptors.
Source code in bofire/utils/cheminformatics.py
def smiles2mordred(smiles: List[str], descriptors_list: List[str]) -> np.ndarray:\n \"\"\"Transforms list of smiles to mordred moelcular descriptors.\n\n Args:\n smiles (List[str]): List of smiles\n descriptors_list (List[str]): List of desired mordred descriptors\n\n Returns:\n np.ndarray: Array holding the mordred moelcular descriptors.\n \"\"\"\n mols = [smiles2mol(smi) for smi in smiles]\n\n calc = Calculator(descriptors, ignore_3D=True)\n calc.descriptors = [d for d in calc.descriptors if str(d) in descriptors_list]\n\n descriptors_df = calc.pandas(mols)\n nan_list = [\n pd.to_numeric(descriptors_df[col], errors=\"coerce\").isnull().values.any()\n for col in descriptors_df.columns\n ]\n if any(nan_list):\n raise ValueError(\n f\"Found NaN values in descriptors {list(descriptors_df.columns[nan_list])}\"\n )\n\n return descriptors_df.astype(float).values\n
"},{"location":"ref-utils/#bofire.utils.doe","title":"doe
","text":""},{"location":"ref-utils/#bofire.utils.doe.ff2n","title":"ff2n(n_factors)
","text":"Computes the full factorial design for a given number of factors.
Parameters:
Name Type Description Defaultn_factors
int
The number of factors.
requiredReturns:
Type Descriptionndarray
The full factorial design.
Source code in bofire/utils/doe.py
def ff2n(n_factors: int) -> np.ndarray:\n \"\"\"Computes the full factorial design for a given number of factors.\n\n Args:\n n_factors: The number of factors.\n\n Returns:\n The full factorial design.\n \"\"\"\n return np.array(list(itertools.product([-1, 1], repeat=n_factors)))\n
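Example: the full factorial design for two factors:
from bofire.utils.doe import ff2n

print(ff2n(2))
# [[-1 -1]
#  [-1  1]
#  [ 1 -1]
#  [ 1  1]]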
"},{"location":"ref-utils/#bofire.utils.doe.fracfact","title":"fracfact(gen)
","text":"Computes the fractional factorial design for a given generator.
Parameters:
Name Type Description Defaultgen
The generator.
requiredReturns:
Type Descriptionndarray
The fractional factorial design.
Source code in bofire/utils/doe.py
def fracfact(gen) -> np.ndarray:\n \"\"\"Computes the fractional factorial design for a given generator.\n\n Args:\n gen: The generator.\n\n Returns:\n The fractional factorial design.\n \"\"\"\n gen = validate_generator(n_factors=gen.count(\" \") + 1, generator=gen)\n\n generators = [item for item in re.split(r\"\\-|\\s|\\+\", gen) if item]\n lengthes = [len(i) for i in generators]\n\n # Indices of single letters (main factors)\n idx_main = [i for i, item in enumerate(lengthes) if item == 1]\n\n # Indices of letter combinations.\n idx_combi = [i for i, item in enumerate(generators) if item != 1]\n\n # Check if there are \"-\" operators in gen\n idx_negative = [\n i for i, item in enumerate(gen.split(\" \")) if item[0] == \"-\"\n ] # remove empty strings\n\n # Fill in design with two level factorial design\n H1 = ff2n(len(idx_main))\n H = np.zeros((H1.shape[0], len(lengthes)))\n H[:, idx_main] = H1\n\n # Recognize combinations and fill in the rest of matrix H2 with the proper\n # products\n for k in idx_combi:\n # For lowercase letters\n xx = np.array([ord(c) for c in generators[k]]) - 97\n\n H[:, k] = np.prod(H1[:, xx], axis=1)\n\n # Update design if gen includes \"-\" operator\n if len(idx_negative) > 0:\n H[:, idx_negative] *= -1\n\n # Return the fractional factorial design\n return H\n
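Example: the half fraction of the 2^3 design, where the third factor c is generated as the product of a and b:
from bofire.utils.doe import fracfact

print(fracfact('a b ab'))
# [[-1. -1.  1.]
#  [-1.  1. -1.]
#  [ 1. -1. -1.]
#  [ 1.  1.  1.]]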
"},{"location":"ref-utils/#bofire.utils.doe.get_alias_structure","title":"get_alias_structure(gen, order=4)
","text":"Computes the alias structure of the design matrix. Works only for generators with positive signs.
Parameters:
Name Type Description Defaultgen
str
The generator.
requiredorder
int
The order up to which the alias structure should be calculated. Defaults to 4.
4
Returns:
Type DescriptionList[str]
The alias structure of the design matrix.
Source code in bofire/utils/doe.py
def get_alias_structure(gen: str, order: int = 4) -> List[str]:\n \"\"\"Computes the alias structure of the design matrix. Works only for generators\n with positive signs.\n\n Args:\n gen: The generator.\n order: The order up to wich the alias structure should be calculated. Defaults to 4.\n\n Returns:\n The alias structure of the design matrix.\n \"\"\"\n design = fracfact(gen)\n\n n_experiments, n_factors = design.shape\n\n all_names = string.ascii_lowercase + \"I\"\n factors = range(n_factors)\n all_combinations = itertools.chain.from_iterable(\n (\n itertools.combinations(factors, n)\n for n in range(1, min(n_factors, order) + 1)\n )\n )\n aliases = {n_experiments * \"+\": [(26,)]} # 26 is mapped to I\n\n for combination in all_combinations:\n # positive sign\n contrast = np.prod(\n design[:, combination], axis=1\n ) # this is the product of the combination\n scontrast = \"\".join(np.where(contrast == 1, \"+\", \"-\").tolist())\n aliases[scontrast] = aliases.get(scontrast, [])\n aliases[scontrast].append(combination) # type: ignore\n\n aliases_list = []\n for alias in aliases.values():\n aliases_list.append(\n sorted(alias, key=lambda a: (len(a), a))\n ) # sort by length and then by the combination\n aliases_list = sorted(\n aliases_list, key=lambda list: ([len(a) for a in list], list)\n ) # sort by the length of the alias\n\n aliases_readable = []\n\n for alias in aliases_list:\n aliases_readable.append(\n \" = \".join([\"\".join([all_names[f] for f in a]) for a in alias])\n )\n\n return aliases_readable\n
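Example: the alias structure of the 2^(3-1) design from above (expected output shown as a comment):
from bofire.utils.doe import get_alias_structure

print(get_alias_structure('a b ab'))
# ['a = bc', 'b = ac', 'c = ab', 'I = abc']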
"},{"location":"ref-utils/#bofire.utils.doe.get_confounding_matrix","title":"get_confounding_matrix(inputs, design, powers=None, interactions=None)
","text":"Analyzes the confounding of a design and returns the confounding matrix.
Only takes continuous features into account.
Parameters:
Name Type Description Defaultinputs
Inputs
Input features.
requireddesign
pd.DataFrame
Design matrix.
requiredpowers
List[int]
List of powers of the individual factors/features that should be considered. Integers have to be larger than 1. Defaults to [].
None
interactions
List[int]
List with interaction levels to be considered. Integers have to be larger than 1. Defaults to [2].
None
Returns:
Type Descriptionpd.DataFrame
The confounding matrix holding the pairwise correlations of the scaled model terms.
Source code in bofire/utils/doe.py
def get_confounding_matrix(\n inputs: Inputs,\n design: pd.DataFrame,\n powers: Optional[List[int]] = None,\n interactions: Optional[List[int]] = None,\n):\n \"\"\"Analyzes the confounding of a design and returns the confounding matrix.\n\n Only takes continuous features into account.\n\n Args:\n inputs (Inputs): Input features.\n design (pd.DataFrame): Design matrix.\n powers (List[int], optional): List of powers of the individual factors/features that should be considered.\n Integers has to be larger than 1. Defaults to [].\n interactions (List[int], optional): List with interaction levels to be considered.\n Integers has to be larger than 1. Defaults to [2].\n\n Returns:\n _type_: _description_\n \"\"\"\n from sklearn.preprocessing import MinMaxScaler\n\n if len(inputs.get(CategoricalInput)) > 0:\n warnings.warn(\"Categorical input features will be ignored.\")\n\n keys = inputs.get_keys(ContinuousInput)\n scaler = MinMaxScaler(feature_range=(-1, 1))\n scaled_design = pd.DataFrame(\n data=scaler.fit_transform(design[keys]),\n columns=keys,\n )\n\n # add powers\n if powers is not None:\n for p in powers:\n assert p > 1, \"Power has to be at least of degree two.\"\n for key in keys:\n scaled_design[f\"{key}**{p}\"] = scaled_design[key] ** p\n\n # add interactions\n if interactions is None:\n interactions = [2]\n\n for i in interactions:\n assert i > 1, \"Interaction has to be at least of degree two.\"\n assert i < len(keys) + 1, f\"Interaction has to be smaller than {len(keys)+1}.\"\n for combi in itertools.combinations(keys, i):\n scaled_design[\":\".join(combi)] = scaled_design[list(combi)].prod(axis=1)\n\n return scaled_design.corr()\n
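Example (a sketch, assuming the Inputs container from bofire.data_models.domain.api; requires scikit-learn):
import pandas as pd
from bofire.data_models.domain.api import Inputs
from bofire.data_models.features.api import ContinuousInput
from bofire.utils.doe import ff2n, get_confounding_matrix

inputs = Inputs(features=[ContinuousInput(key=k, bounds=(-1, 1)) for k in 'abc'])
design = pd.DataFrame(ff2n(3), columns=list('abc'))
# for a full factorial, main effects and two-way interactions are uncorrelated
print(get_confounding_matrix(inputs, design, interactions=[2]).round(2))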
"},{"location":"ref-utils/#bofire.utils.doe.get_generator","title":"get_generator(n_factors, n_generators)
","text":"Computes a generator for a given number of factors and generators.
Parameters:
Name Type Description Defaultn_factors
int
The number of factors.
requiredn_generators
int
The number of generators.
requiredReturns:
Type Descriptionstr
The generator.
Source code in bofire/utils/doe.py
def get_generator(n_factors: int, n_generators: int) -> str:\n \"\"\"Computes a generator for a given number of factors and generators.\n\n Args:\n n_factors: The number of factors.\n n_generators: The number of generators.\n\n Returns:\n The generator.\n \"\"\"\n if n_generators == 0:\n return \" \".join(list(string.ascii_lowercase[:n_factors]))\n n_base_factors = n_factors - n_generators\n if n_generators == 1:\n if n_base_factors == 1:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n return \" \".join(\n list(string.ascii_lowercase[:n_base_factors])\n + [string.ascii_lowercase[:n_base_factors]]\n )\n n_base_factors = n_factors - n_generators\n if n_base_factors - 1 < 2:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n generators = [\n \"\".join(i)\n for i in (\n itertools.combinations(\n string.ascii_lowercase[:n_base_factors], n_base_factors - 1\n )\n )\n ]\n if len(generators) > n_generators:\n generators = generators[:n_generators]\n elif (n_generators - len(generators) == 1) and (n_base_factors > 1):\n generators += [string.ascii_lowercase[:n_base_factors]]\n elif n_generators - len(generators) >= 1:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n return \" \".join(list(string.ascii_lowercase[:n_base_factors]) + generators)\n
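Example:
from bofire.utils.doe import get_generator

print(get_generator(n_factors=3, n_generators=0))  # 'a b c'
print(get_generator(n_factors=4, n_generators=1))  # 'a b c abc'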
"},{"location":"ref-utils/#bofire.utils.doe.validate_generator","title":"validate_generator(n_factors, generator)
","text":"Validates the generator and thows an error if it is not valid.
Source code in bofire/utils/doe.py
def validate_generator(n_factors: int, generator: str) -> str:\n \"\"\"Validates the generator and thows an error if it is not valid.\"\"\"\n\n if len(generator.split(\" \")) != n_factors:\n raise ValueError(\"Generator does not match the number of factors.\")\n # clean it and transform it into a list\n generators = [item for item in re.split(r\"\\-|\\s|\\+\", generator) if item]\n lengthes = [len(i) for i in generators]\n\n # Indices of single letters (main factors)\n idx_main = [i for i, item in enumerate(lengthes) if item == 1]\n\n if len(idx_main) == 0:\n raise ValueError(\"At least one unconfounded main factor is needed.\")\n\n # Check that single letters (main factors) are unique\n if len(idx_main) != len({generators[i] for i in idx_main}):\n raise ValueError(\"Main factors are confounded with each other.\")\n\n # Check that single letters (main factors) follow the alphabet\n if (\n \"\".join(sorted([generators[i] for i in idx_main]))\n != string.ascii_lowercase[: len(idx_main)]\n ):\n raise ValueError(\n f'Use the letters `{\" \".join(string.ascii_lowercase[: len(idx_main)])}` for the main factors.'\n )\n\n # Indices of letter combinations.\n idx_combi = [i for i, item in enumerate(generators) if item != 1]\n\n # check that main factors come before combinations\n if min(idx_combi) > max(idx_main):\n raise ValueError(\"Main factors have to come before combinations.\")\n\n # Check that letter combinations are unique\n if len(idx_combi) != len({generators[i] for i in idx_combi}):\n raise ValueError(\"Generators are not unique.\")\n\n # Check that only letters are used in the combinations that are also single letters (main factors)\n if not all(\n set(item).issubset({generators[i] for i in idx_main})\n for item in [generators[i] for i in idx_combi]\n ):\n raise ValueError(\"Generators are not valid.\")\n\n return generator\n
"},{"location":"ref-utils/#bofire.utils.multiobjective","title":"multiobjective
","text":""},{"location":"ref-utils/#bofire.utils.multiobjective.get_ref_point_mask","title":"get_ref_point_mask(domain, output_feature_keys=None)
","text":"Method to get a mask for the reference points taking into account if we want to maximize or minimize an objective. In case it is maximize the value in the mask is 1, in case we want to minimize it is -1.
Parameters:
Name Type Description Defaultdomain
Domain
Domain for which the mask should be generated.
requiredoutput_feature_keys
Optional[list]
Name of output feature keys that should be considered in the mask. Defaults to None.
None
Returns:
Type Descriptionnp.ndarray
Mask with entry 1 for every objective to be maximized and -1 for every objective to be minimized or optimized towards a target.
Source code in bofire/utils/multiobjective.py
def get_ref_point_mask(\n domain: Domain, output_feature_keys: Optional[list] = None\n) -> np.ndarray:\n \"\"\"Method to get a mask for the reference points taking into account if we\n want to maximize or minimize an objective. In case it is maximize the value\n in the mask is 1, in case we want to minimize it is -1.\n\n Args:\n domain (Domain): Domain for which the mask should be generated.\n output_feature_keys (Optional[list], optional): Name of output feature keys\n that should be considered in the mask. Defaults to None.\n\n Returns:\n np.ndarray: _description_\n \"\"\"\n if output_feature_keys is None:\n output_feature_keys = domain.outputs.get_keys_by_objective(\n includes=[MaximizeObjective, MinimizeObjective, CloseToTargetObjective]\n )\n if len(output_feature_keys) < 2:\n raise ValueError(\"At least two output features have to be provided.\")\n mask = []\n for key in output_feature_keys:\n feat = domain.outputs.get_by_key(key)\n if isinstance(feat.objective, MaximizeObjective): # type: ignore\n mask.append(1.0)\n elif isinstance(feat.objective, MinimizeObjective): # type: ignore\n mask.append(-1.0)\n elif isinstance(feat.objective, CloseToTargetObjective): # type: ignore\n mask.append(-1.0)\n else:\n raise ValueError(\n \"Only `MaximizeObjective` and `MinimizeObjective` supported\"\n )\n return np.array(mask)\n
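Example (a sketch, assuming Domain.from_lists and the standard feature and objective data models):
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective
from bofire.utils.multiobjective import get_ref_point_mask

domain = Domain.from_lists(
    inputs=[ContinuousInput(key='x', bounds=(0, 1))],
    outputs=[
        ContinuousOutput(key='y1', objective=MaximizeObjective(w=1.0)),
        ContinuousOutput(key='y2', objective=MinimizeObjective(w=1.0)),
    ],
)
print(get_ref_point_mask(domain))  # [ 1. -1.]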
"},{"location":"ref-utils/#bofire.utils.naming_conventions","title":"naming_conventions
","text":""},{"location":"ref-utils/#bofire.utils.naming_conventions.get_column_names","title":"get_column_names(outputs)
","text":"Specifies column names for given Outputs type.
Parameters:
Name Type Description Defaultoutputs
Outputs
The Outputs object containing the individual outputs.
requiredReturns:
Type DescriptionTuple[List[str], List[str]]
A tuple containing the prediction column names and the standard deviation column names
Source code in bofire/utils/naming_conventions.py
def get_column_names(outputs: Outputs) -> Tuple[List[str], List[str]]:\n \"\"\"\n Specifies column names for given Outputs type.\n\n Args:\n outputs (Outputs): The Outputs object containing the individual outputs.\n\n Returns:\n Tuple[List[str], List[str]]: A tuple containing the prediction column names and the standard deviation column names\n \"\"\"\n pred_cols, sd_cols = [], []\n for featkey in outputs.get_keys(CategoricalOutput): # type: ignore\n pred_cols = pred_cols + [\n f\"{featkey}_{cat}_prob\"\n for cat in outputs.get_by_key(featkey).categories # type: ignore\n ]\n sd_cols = sd_cols + [\n f\"{featkey}_{cat}_sd\"\n for cat in outputs.get_by_key(featkey).categories # type: ignore\n ]\n for featkey in outputs.get_keys(ContinuousOutput): # type: ignore\n pred_cols = pred_cols + [f\"{featkey}_pred\"]\n sd_cols = sd_cols + [f\"{featkey}_sd\"]\n\n return pred_cols, sd_cols\n
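Example (a sketch with a single continuous output):
from bofire.data_models.domain.api import Outputs
from bofire.data_models.features.api import ContinuousOutput
from bofire.data_models.objectives.api import MaximizeObjective
from bofire.utils.naming_conventions import get_column_names

outputs = Outputs(features=[ContinuousOutput(key='yield', objective=MaximizeObjective(w=1.0))])
print(get_column_names(outputs))  # (['yield_pred'], ['yield_sd'])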
"},{"location":"ref-utils/#bofire.utils.naming_conventions.postprocess_categorical_predictions","title":"postprocess_categorical_predictions(predictions, outputs)
","text":"Postprocess categorical predictions by finding the maximum probability location
Parameters:
Name Type Description Defaultpredictions
pd.DataFrame
The dataframe containing the predictions.
requiredoutputs
Outputs
The Outputs object containing the individual outputs.
requiredReturns:
Type Descriptionpredictions (pd.DataFrame)
The (potentially modified) original dataframe with categorical predictions added
Source code in bofire/utils/naming_conventions.py
def postprocess_categorical_predictions(predictions: pd.DataFrame, outputs: Outputs) -> pd.DataFrame: # type: ignore\n \"\"\"\n Postprocess categorical predictions by finding the maximum probability location\n\n Args:\n predictions (pd.DataFrame): The dataframe containing the predictions.\n outputs (Outputs): The Outputs object containing the individual outputs.\n\n Returns:\n predictions (pd.DataFrame): The (potentially modified) original dataframe with categorical predictions added\n \"\"\"\n for feat in outputs.get():\n if isinstance(feat, CategoricalOutput): # type: ignore\n predictions.insert(\n loc=0,\n column=f\"{feat.key}_pred\",\n value=predictions.filter(regex=f\"{feat.key}(.*)_prob\")\n .idxmax(1)\n .str.replace(f\"{feat.key}_\", \"\")\n .str.replace(\"_prob\", \"\")\n .values,\n )\n predictions.insert(\n loc=1,\n column=f\"{feat.key}_sd\",\n value=0.0,\n )\n return predictions\n
"},{"location":"ref-utils/#bofire.utils.reduce","title":"reduce
","text":""},{"location":"ref-utils/#bofire.utils.reduce.AffineTransform","title":" AffineTransform
","text":"Class to switch back and forth from the reduced to the original domain.
Source code in bofire/utils/reduce.py
class AffineTransform:\n \"\"\"Class to switch back and forth from the reduced to the original domain.\"\"\"\n\n def __init__(self, equalities: List[Tuple[str, List[str], List[float]]]):\n \"\"\"Initializes a `AffineTransformation` object.\n\n Args:\n equalities (List[Tuple[str,List[str],List[float]]]): List of equalities. Every equality\n is defined as a tuple, in which the first entry is the key of the reduced feature, the second\n one is a list of feature keys that can be used to compute the feature and the third list of floats\n are the corresponding coefficients.\n \"\"\"\n self.equalities = equalities\n\n def augment_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Restore the eliminated features in a dataframe\n\n Args:\n data (pd.DataFrame): Dataframe that should be restored.\n\n Returns:\n pd.DataFrame: Restored dataframe\n \"\"\"\n if len(self.equalities) == 0:\n return data\n data = data.copy()\n for name_lhs, names_rhs, coeffs in self.equalities:\n data[name_lhs] = coeffs[-1]\n for i, name in enumerate(names_rhs):\n data[name_lhs] += coeffs[i] * data[name]\n return data\n\n def drop_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Drop eliminated features from a dataframe.\n\n Args:\n data (pd.DataFrame): Dataframe with features to be dropped.\n\n Returns:\n pd.DataFrame: Reduced dataframe.\n \"\"\"\n if len(self.equalities) == 0:\n return data\n drop = []\n for name_lhs, _, _ in self.equalities:\n if name_lhs in data.columns:\n drop.append(name_lhs)\n return data.drop(columns=drop)\n
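Example: a round-trip sketch with a hypothetical equality x3 = 1 - x1 - x2, as it arises from a mixture constraint; note that the last coefficient is the constant offset:
import pandas as pd
from bofire.utils.reduce import AffineTransform

trafo = AffineTransform([('x3', ['x1', 'x2'], [-1.0, -1.0, 1.0])])
reduced = pd.DataFrame({'x1': [0.2, 0.5], 'x2': [0.3, 0.1]})
full = trafo.augment_data(reduced)  # adds x3 = 1 - x1 - x2, i.e. 0.5 and 0.4
back = trafo.drop_data(full)        # removes x3 again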
"},{"location":"ref-utils/#bofire.utils.reduce.AffineTransform.__init__","title":"__init__(self, equalities)
special
","text":"Initializes a AffineTransformation
object.
Parameters:
Name Type Description Defaultequalities
List[Tuple[str,List[str],List[float]]]
List of equalities. Every equality is defined as a tuple, in which the first entry is the key of the eliminated feature, the second is a list of feature keys that can be used to compute it, and the third is a list of floats with the corresponding coefficients, the last of which is the constant offset.
required Source code in bofire/utils/reduce.py
def __init__(self, equalities: List[Tuple[str, List[str], List[float]]]):\n \"\"\"Initializes a `AffineTransformation` object.\n\n Args:\n equalities (List[Tuple[str,List[str],List[float]]]): List of equalities. Every equality\n is defined as a tuple, in which the first entry is the key of the reduced feature, the second\n one is a list of feature keys that can be used to compute the feature and the third list of floats\n are the corresponding coefficients.\n \"\"\"\n self.equalities = equalities\n
"},{"location":"ref-utils/#bofire.utils.reduce.AffineTransform.augment_data","title":"augment_data(self, data)
","text":"Restore the eliminated features in a dataframe
Parameters:
Name Type Description Defaultdata
pd.DataFrame
Dataframe that should be restored.
requiredReturns:
Type Descriptionpd.DataFrame
Restored dataframe
Source code in bofire/utils/reduce.py
def augment_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Restore the eliminated features in a dataframe\n\n Args:\n data (pd.DataFrame): Dataframe that should be restored.\n\n Returns:\n pd.DataFrame: Restored dataframe\n \"\"\"\n if len(self.equalities) == 0:\n return data\n data = data.copy()\n for name_lhs, names_rhs, coeffs in self.equalities:\n data[name_lhs] = coeffs[-1]\n for i, name in enumerate(names_rhs):\n data[name_lhs] += coeffs[i] * data[name]\n return data\n
"},{"location":"ref-utils/#bofire.utils.reduce.AffineTransform.drop_data","title":"drop_data(self, data)
","text":"Drop eliminated features from a dataframe.
Parameters:
Name Type Description Defaultdata
pd.DataFrame
Dataframe with features to be dropped.
requiredReturns:
Type Descriptionpd.DataFrame
Reduced dataframe.
Source code in bofire/utils/reduce.py
def drop_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Drop eliminated features from a dataframe.\n\n Args:\n data (pd.DataFrame): Dataframe with features to be dropped.\n\n Returns:\n pd.DataFrame: Reduced dataframe.\n \"\"\"\n if len(self.equalities) == 0:\n return data\n drop = []\n for name_lhs, _, _ in self.equalities:\n if name_lhs in data.columns:\n drop.append(name_lhs)\n return data.drop(columns=drop)\n
"},{"location":"ref-utils/#bofire.utils.reduce.adjust_boundary","title":"adjust_boundary(feature, coef, rhs)
","text":"Adjusts the boundaries of a feature.
Parameters:
Name Type Description Defaultfeature
ContinuousInput
Feature to be adjusted.
requiredcoef
float
Coefficient.
requiredrhs
float
Right-hand-side of the constraint.
required Source code in bofire/utils/reduce.py
def adjust_boundary(feature: ContinuousInput, coef: float, rhs: float):\n \"\"\"Adjusts the boundaries of a feature.\n\n Args:\n feature (ContinuousInput): Feature to be adjusted.\n coef (float): Coefficient.\n rhs (float): Right-hand-side of the constraint.\n \"\"\"\n boundary = rhs / coef\n if coef > 0:\n if boundary > feature.lower_bound:\n feature.bounds = (boundary, feature.upper_bound)\n else:\n if boundary < feature.upper_bound:\n feature.bounds = (feature.lower_bound, boundary)\n
"},{"location":"ref-utils/#bofire.utils.reduce.check_domain_for_reduction","title":"check_domain_for_reduction(domain)
","text":"Check if the reduction can be applied or if a trivial case is present.
Parameters:
Name Type Description Defaultdomain
Domain
Domain to be checked.
requiredReturns:
Type Descriptionbool
True if reducible, else False.
Source code in bofire/utils/reduce.py
def check_domain_for_reduction(domain: Domain) -> bool:\n \"\"\"Check if the reduction can be applied or if a trivial case is present.\n\n Args:\n domain (Domain): Domain to be checked.\n Returns:\n bool: True if reducable, else False.\n \"\"\"\n # are there any constraints?\n if len(domain.constraints) == 0:\n return False\n\n # are there any linear equality constraints?\n linear_equalities = domain.constraints.get(LinearEqualityConstraint)\n if len(linear_equalities) == 0:\n return False\n\n # are there no NChooseKConstraint constraints?\n if len(domain.constraints.get([NChooseKConstraint])) > 0:\n return False\n\n # are there continuous inputs\n continuous_inputs = domain.inputs.get(ContinuousInput)\n if len(continuous_inputs) == 0:\n return False\n\n # check that equality constraints only contain continuous inputs\n for c in linear_equalities:\n assert isinstance(c, LinearConstraint)\n for feat in c.features:\n if feat not in domain.inputs.get_keys(ContinuousInput):\n return False\n return True\n
"},{"location":"ref-utils/#bofire.utils.reduce.check_existence_of_solution","title":"check_existence_of_solution(A_aug)
","text":"Given an augmented coefficient matrix this function determines the existence (and uniqueness) of solution using the rank theorem.
Source code inbofire/utils/reduce.py
def check_existence_of_solution(A_aug):\n \"\"\"Given an augmented coefficient matrix this function determines the existence (and uniqueness) of solution using the rank theorem.\"\"\"\n A = A_aug[:, :-1]\n b = A_aug[:, -1]\n len_inputs = np.shape(A)[1]\n\n # catch special cases\n rk_A_aug = np.linalg.matrix_rank(A_aug)\n rk_A = np.linalg.matrix_rank(A)\n\n if rk_A == rk_A_aug:\n if rk_A < len_inputs:\n return # all good\n else:\n x = np.linalg.solve(A, b)\n raise Exception(\n f\"There is a unique solution x for the linear equality constraints: x={x}\"\n )\n elif rk_A < rk_A_aug:\n raise Exception(\n \"There is no solution fulfilling the linear equality constraints.\"\n )\n
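A small numeric sketch of the rank argument (arrays are illustrative; the last column of A_aug is the right-hand side):
import numpy as np
from bofire.utils.reduce import check_existence_of_solution

# consistent and underdetermined (rank(A) == rank(A_aug) < number of inputs): returns silently
check_existence_of_solution(np.array([[1.0, 1.0, 1.0]]))

# inconsistent (rank(A) < rank(A_aug)): x1 + x2 = 1 and x1 + x2 = 2 would raise an Exception
# check_existence_of_solution(np.array([[1.0, 1.0, 1.0], [1.0, 1.0, 2.0]]))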
"},{"location":"ref-utils/#bofire.utils.reduce.reduce_domain","title":"reduce_domain(domain)
","text":"Reduce a domain with linear equality constraints to a subdomain where linear equality constraints are eliminated.
Parameters:
Name Type Description Defaultdomain
Domain
Domain to be reduced.
requiredReturns:
Type DescriptionTuple[Domain, AffineTransform]
reduced domain and the corresponding transformation to switch between the reduced and original domain.
Source code inbofire/utils/reduce.py
def reduce_domain(domain: Domain) -> Tuple[Domain, AffineTransform]:\n \"\"\"Reduce a domain with linear equality constraints to a subdomain where linear equality constraints are eliminated.\n\n Args:\n domain (Domain): Domain to be reduced.\n\n Returns:\n Tuple[Domain, AffineTransform]: reduced domain and the according transformation to switch between the\n reduced and orginal domain.\n \"\"\"\n # check if the domain can be reduced\n if not check_domain_for_reduction(domain):\n return domain, AffineTransform([])\n\n # find linear equality constraints\n linear_equalities = domain.constraints.get(LinearEqualityConstraint)\n other_constraints = domain.constraints.get(\n Constraint, excludes=[LinearEqualityConstraint]\n )\n\n # only consider continuous inputs\n continuous_inputs = [\n cast(ContinuousInput, f) for f in domain.inputs.get(ContinuousInput)\n ]\n other_inputs = domain.inputs.get(Input, excludes=[ContinuousInput])\n\n # assemble Matrix A from equality constraints\n N = len(linear_equalities)\n M = len(continuous_inputs) + 1\n names = np.concatenate(([feat.key for feat in continuous_inputs], [\"rhs\"]))\n\n A_aug = pd.DataFrame(data=np.zeros(shape=(N, M)), columns=names)\n\n for i in range(len(linear_equalities)):\n c = linear_equalities[i]\n assert isinstance(c, LinearEqualityConstraint)\n A_aug.loc[i, c.features] = c.coefficients # type: ignore\n A_aug.loc[i, \"rhs\"] = c.rhs\n A_aug = A_aug.values\n\n # catch special cases\n check_existence_of_solution(A_aug)\n\n # bring A_aug to reduced row-echelon form\n A_aug_rref, pivots = rref(A_aug)\n pivots = np.array(pivots)\n A_aug_rref = np.array(A_aug_rref).astype(np.float64)\n\n # formulate box bounds as linear inequality constraints in matrix form\n B = np.zeros(shape=(2 * (M - 1), M))\n B[: M - 1, : M - 1] = np.eye(M - 1)\n B[M - 1 :, : M - 1] = -np.eye(M - 1)\n\n B[: M - 1, -1] = np.array([feat.upper_bound for feat in continuous_inputs])\n B[M - 1 :, -1] = -1.0 * np.array([feat.lower_bound for feat in continuous_inputs])\n\n # eliminate columns with pivot element\n for i in range(len(pivots)):\n p = pivots[i]\n B[p, :] -= A_aug_rref[i, :]\n B[p + M - 1, :] += A_aug_rref[i, :]\n\n # build up reduced domain\n _domain = Domain.model_construct(\n # _fields_set = {\"inputs\", \"outputs\", \"constraints\"}\n inputs=deepcopy(other_inputs),\n outputs=deepcopy(domain.outputs),\n constraints=deepcopy(other_constraints),\n )\n new_inputs = [\n deepcopy(feat) for i, feat in enumerate(continuous_inputs) if i not in pivots\n ]\n all_inputs = _domain.inputs + new_inputs\n assert isinstance(all_inputs, Inputs)\n _domain.inputs.features = all_inputs.features\n\n constraints: List[AnyConstraint] = []\n for i in pivots:\n # reduce equation system of upper bounds\n ind = np.where(B[i, :-1] != 0)[0]\n if len(ind) > 0 and B[i, -1] < np.inf:\n if len(list(names[ind])) > 1:\n c = LinearInequalityConstraint.from_greater_equal(\n features=list(names[ind]),\n coefficients=(-1.0 * B[i, ind]).tolist(),\n rhs=B[i, -1] * -1.0,\n )\n constraints.append(c)\n else:\n key = names[ind][0]\n feat = cast(ContinuousInput, _domain.inputs.get_by_key(key))\n adjust_boundary(feat, (-1.0 * B[i, ind])[0], B[i, -1] * -1.0)\n else:\n if B[i, -1] < -1e-16:\n raise Exception(\"There is no solution that fulfills the constraints.\")\n\n # reduce equation system of lower bounds\n ind = np.where(B[i + M - 1, :-1] != 0)[0]\n if len(ind) > 0 and B[i + M - 1, -1] < np.inf:\n if len(list(names[ind])) > 1:\n c = LinearInequalityConstraint.from_greater_equal(\n features=list(names[ind]),\n 
coefficients=(-1.0 * B[i + M - 1, ind]).tolist(),\n rhs=B[i + M - 1, -1] * -1.0,\n )\n constraints.append(c)\n else:\n key = names[ind][0]\n feat = cast(ContinuousInput, _domain.inputs.get_by_key(key))\n adjust_boundary(\n feat,\n (-1.0 * B[i + M - 1, ind])[0],\n B[i + M - 1, -1] * -1.0,\n )\n else:\n if B[i + M - 1, -1] < -1e-16:\n raise Exception(\"There is no solution that fulfills the constraints.\")\n\n if len(constraints) > 0:\n _domain.constraints.constraints = _domain.constraints.constraints + constraints # type: ignore\n\n # assemble equalities\n _equalities = []\n for i in range(len(pivots)):\n name_lhs = names[pivots[i]]\n names_rhs = []\n coeffs = []\n\n for j in range(len(names) - 1):\n if A_aug_rref[i, j] != 0 and j != pivots[i]:\n coeffs.append(-A_aug_rref[i, j])\n names_rhs.append(names[j])\n\n coeffs.append(A_aug_rref[i, -1])\n\n _equalities.append((name_lhs, names_rhs, coeffs))\n\n trafo = AffineTransform(_equalities)\n # remove remaining dependencies of eliminated inputs from the problem\n _domain = remove_eliminated_inputs(_domain, trafo)\n return _domain, trafo\n
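A minimal end-to-end sketch for a mixture-type domain (feature names are illustrative): the equality x1 + x2 + x3 = 1 lets reduce_domain eliminate one continuous input, and the returned AffineTransform switches between the two representations.
from bofire.data_models.constraints.api import LinearEqualityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.reduce import reduce_domain

domain = Domain(
    inputs=[ContinuousInput(key=k, bounds=(0.0, 1.0)) for k in ["x1", "x2", "x3"]],
    outputs=[ContinuousOutput(key="y")],
    constraints=[
        LinearEqualityConstraint(
            features=["x1", "x2", "x3"], coefficients=[1.0, 1.0, 1.0], rhs=1.0
        )
    ],
)
reduced_domain, trafo = reduce_domain(domain)
print(reduced_domain.inputs.get_keys())  # one continuous input has been eliminated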
"},{"location":"ref-utils/#bofire.utils.reduce.remove_eliminated_inputs","title":"remove_eliminated_inputs(domain, transform)
","text":"Eliminates remaining occurences of eliminated inputs in linear constraints.
Parameters:
Name Type Description Defaultdomain
Domain
Domain in which the linear constraints should be purged.
requiredtransform
AffineTransform
Affine transformation object that defines the obsolete features.
requiredExceptions:
Type DescriptionValueError
If a feature occurs in a constraint other than a linear one.
Returns:
Type DescriptionDomain
Purged domain.
Source code inbofire/utils/reduce.py
def remove_eliminated_inputs(domain: Domain, transform: AffineTransform) -> Domain:\n \"\"\"Eliminates remaining occurences of eliminated inputs in linear constraints.\n\n Args:\n domain (Domain): Domain in which the linear constraints should be purged.\n transform (AffineTransform): Affine transformation object that defines the obsolete features.\n\n Raises:\n ValueError: If feature occurs in a constraint different from a linear one.\n\n Returns:\n Domain: Purged domain.\n \"\"\"\n inputs_names = domain.inputs.get_keys()\n M = len(inputs_names)\n\n # write the equalities for the backtransformation into one matrix\n inputs_dict = {inputs_names[i]: i for i in range(M)}\n\n # build up dict from domain.equalities e.g. {\"xi1\": [coeff(xj1), ..., coeff(xjn)], ... \"xik\":...}\n coeffs_dict = {}\n for e in transform.equalities:\n coeffs = np.zeros(M + 1)\n for j, name in enumerate(e[1]):\n coeffs[inputs_dict[name]] = e[2][j]\n coeffs[-1] = e[2][-1]\n coeffs_dict[e[0]] = coeffs\n\n constraints = []\n for c in domain.constraints.get():\n # Nonlinear constraints not supported\n if not isinstance(c, LinearConstraint):\n raise ValueError(\n \"Elimination of variables is only supported for LinearEquality and LinearInequality constraints.\"\n )\n\n # no changes, if the constraint does not contain eliminated inputs\n elif all(name in inputs_names for name in c.features):\n constraints.append(c)\n\n # remove inputs from the constraint that were eliminated from the inputs before\n else:\n totally_removed = False\n _features = np.array(inputs_names)\n _rhs = c.rhs\n\n # create new lhs and rhs from the old one and knowledge from problem._equalities\n _coefficients = np.zeros(M)\n for j, name in enumerate(c.features):\n if name in inputs_names:\n _coefficients[inputs_dict[name]] += c.coefficients[j]\n else:\n _coefficients += c.coefficients[j] * coeffs_dict[name][:-1]\n _rhs -= c.coefficients[j] * coeffs_dict[name][-1]\n\n _features = _features[np.abs(_coefficients) > 1e-16]\n _coefficients = _coefficients[np.abs(_coefficients) > 1e-16]\n _c = None\n if isinstance(c, LinearEqualityConstraint):\n if len(_features) > 1:\n _c = LinearEqualityConstraint(\n features=_features.tolist(),\n coefficients=_coefficients.tolist(),\n rhs=_rhs,\n )\n elif len(_features) == 0:\n totally_removed = True\n else:\n feat: ContinuousInput = ContinuousInput(\n **domain.inputs.get_by_key(_features[0]).model_dump()\n )\n feat.bounds = (_coefficients[0], _coefficients[0])\n totally_removed = True\n else:\n if len(_features) > 1:\n _c = LinearInequalityConstraint(\n features=_features.tolist(),\n coefficients=_coefficients.tolist(),\n rhs=_rhs,\n )\n elif len(_features) == 0:\n totally_removed = True\n else:\n feat = cast(ContinuousInput, domain.inputs.get_by_key(_features[0]))\n adjust_boundary(feat, _coefficients[0], _rhs)\n totally_removed = True\n\n # check if constraint is always fulfilled/not fulfilled\n if not totally_removed:\n assert _c is not None\n if len(_c.features) == 0 and _c.rhs >= 0:\n pass\n elif len(_c.features) == 0 and _c.rhs < 0:\n raise Exception(\"Linear constraints cannot be fulfilled.\")\n elif np.isinf(_c.rhs):\n pass\n else:\n constraints.append(_c)\n domain.constraints = Constraints(constraints=constraints)\n return domain\n
"},{"location":"ref-utils/#bofire.utils.reduce.rref","title":"rref(A, tol=1e-08)
","text":"Computes the reduced row echelon form of a Matrix
Parameters:
Name Type Description DefaultA
ndarray
2d array representing a matrix.
requiredtol
float
tolerance for rounding to 0. Defaults to 1e-8.
1e-08
Returns:
Type DescriptionTuple[numpy.ndarray, List[int]]
(A_rref, pivots), where A_rref is the reduced row echelon form of A and pivots is a list containing the pivot columns of A_rref.
Source code inbofire/utils/reduce.py
def rref(A: np.ndarray, tol: float = 1e-8) -> Tuple[np.ndarray, List[int]]:\n \"\"\"Computes the reduced row echelon form of a Matrix\n\n Args:\n A (ndarray): 2d array representing a matrix.\n tol (float, optional): tolerance for rounding to 0. Defaults to 1e-8.\n\n Returns:\n (A_rref, pivots), where A_rref is the reduced row echelon form of A and pivots\n is a numpy array containing the pivot columns of A_rref\n \"\"\"\n A = np.array(A, dtype=np.float64)\n n, m = np.shape(A)\n\n col = 0\n row = 0\n pivots = []\n\n for col in range(m):\n # does a pivot element exist?\n if all(np.abs(A[row:, col]) < tol):\n pass\n # if yes: start elimination\n else:\n pivots.append(col)\n max_row = np.argmax(np.abs(A[row:, col])) + row\n # switch to most stable row\n A[[row, max_row], :] = A[[max_row, row], :] # type: ignore\n # normalize row\n A[row, :] /= A[row, col]\n # eliminate other elements from column\n for r in range(n):\n if r != row:\n A[r, :] -= A[r, col] / A[row, col] * A[row, :]\n row += 1\n\n prec = int(-np.log10(tol))\n return np.round(A, prec), pivots\n
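A quick sanity check on a 2x3 matrix (illustrative):
import numpy as np
from bofire.utils.reduce import rref

A_rref, pivots = rref(np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))
print(A_rref)  # [[ 1.  0. -1.]  [ 0.  1.  2.]]
print(pivots)  # [0, 1]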
"},{"location":"ref-utils/#bofire.utils.subdomain","title":"subdomain
","text":""},{"location":"ref-utils/#bofire.utils.subdomain.get_subdomain","title":"get_subdomain(domain, feature_keys)
","text":"removes all features not defined as argument creating a subdomain of the provided domain
Parameters:
Name Type Description Defaultdomain
Domain
the original domain from which a subdomain should be created
requiredfeature_keys
List
List of features that shall be included in the subdomain
requiredExceptions:
Type DescriptionAssert
when fewer than 2 features are provided in total
ValueError
when a provided feature key is not present in the provided domain
Assert
when no output feature is provided
Assert
when no input feature is provided
ValueError
description
Returns:
Type DescriptionDomain
A new domain containing only parts of the original domain
Source code inbofire/utils/subdomain.py
def get_subdomain(\n domain: Domain,\n feature_keys: List,\n) -> Domain:\n \"\"\"removes all features not defined as argument creating a subdomain of the provided domain\n\n Args:\n domain (Domain): the original domain wherefrom a subdomain should be created\n feature_keys (List): List of features that shall be included in the subdomain\n\n Raises:\n Assert: when in total less than 2 features are provided\n ValueError: when a provided feature key is not present in the provided domain\n Assert: when no output feature is provided\n Assert: when no input feature is provided\n ValueError: _description_\n\n Returns:\n Domain: A new domain containing only parts of the original domain\n \"\"\"\n assert len(feature_keys) >= 2, \"At least two features have to be provided.\"\n outputs = []\n inputs = []\n for key in feature_keys:\n try:\n feat = (domain.inputs + domain.outputs).get_by_key(key)\n except KeyError:\n raise ValueError(f\"Feature {key} not present in domain.\")\n if isinstance(feat, Input):\n inputs.append(feat)\n else:\n outputs.append(feat)\n assert len(outputs) > 0, \"At least one output feature has to be provided.\"\n assert len(inputs) > 0, \"At least one input feature has to be provided.\"\n inputs = Inputs(features=inputs)\n outputs = Outputs(features=outputs)\n # loop over constraints and make sure that all features used in constraints are in the input_feature_keys\n for c in domain.constraints:\n for key in c.features: # type: ignore\n if key not in inputs.get_keys():\n raise ValueError(\n f\"Removed input feature {key} is used in a constraint.\"\n )\n subdomain = deepcopy(domain)\n subdomain.inputs = inputs\n subdomain.outputs = outputs\n return subdomain\n
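A minimal usage sketch (feature names are illustrative); note that at least one input key and one output key must be kept:
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.subdomain import get_subdomain

domain = Domain(
    inputs=[ContinuousInput(key=k, bounds=(0.0, 1.0)) for k in ["x1", "x2", "x3"]],
    outputs=[ContinuousOutput(key="y")],
)
subdomain = get_subdomain(domain, feature_keys=["x1", "x2", "y"])
print(subdomain.inputs.get_keys())  # ["x1", "x2"]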
"},{"location":"ref-utils/#bofire.utils.torch_tools","title":"torch_tools
","text":""},{"location":"ref-utils/#bofire.utils.torch_tools.constrained_objective2botorch","title":"constrained_objective2botorch(idx, objective, eps=1e-08)
","text":"Create a callable that can be used by botorch.utils.objective.apply_constraints
to setup ouput constrained optimizations.
Parameters:
Name Type Description Defaultidx
int
Index of the constraint objective in the list of outputs.
requiredobjective
ConstrainedObjective
The objective that should be transformed.
requiredReturns:
Type DescriptionTuple[List[Callable[[Tensor], Tensor]], List[float], int]
List of callables that can be used by botorch for setting up the constrained objective, list of the corresponding botorch eta values, final index used by the method (to track for categorical variables)
Source code inbofire/utils/torch_tools.py
def constrained_objective2botorch(\n idx: int, objective: ConstrainedObjective, eps: float = 1e-8\n) -> Tuple[List[Callable[[Tensor], Tensor]], List[float], int]:\n \"\"\"Create a callable that can be used by `botorch.utils.objective.apply_constraints`\n to setup ouput constrained optimizations.\n\n Args:\n idx (int): Index of the constraint objective in the list of outputs.\n objective (BotorchConstrainedObjective): The objective that should be transformed.\n\n Returns:\n Tuple[List[Callable[[Tensor], Tensor]], List[float], int]: List of callables that can be used by botorch for setting up the constrained objective,\n list of the corresponding botorch eta values, final index used by the method (to track for categorical variables)\n \"\"\"\n assert isinstance(\n objective, ConstrainedObjective\n ), \"Objective is not a `ConstrainedObjective`.\"\n if isinstance(objective, MaximizeSigmoidObjective):\n return (\n [lambda Z: (Z[..., idx] - objective.tp) * -1.0],\n [1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, MinimizeSigmoidObjective):\n return (\n [lambda Z: (Z[..., idx] - objective.tp)],\n [1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, TargetObjective):\n return (\n [\n lambda Z: (Z[..., idx] - (objective.target_value - objective.tolerance))\n * -1.0,\n lambda Z: (\n Z[..., idx] - (objective.target_value + objective.tolerance)\n ),\n ],\n [1.0 / objective.steepness, 1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, ConstrainedCategoricalObjective):\n # The output of a categorical objective has final dim `c` where `c` is number of classes\n # Pass in the expected acceptance probability and perform an inverse sigmoid to atain the original probabilities\n return (\n [\n lambda Z: torch.log(\n 1\n / torch.clamp(\n (\n Z[..., idx : idx + len(objective.desirability)]\n * torch.tensor(objective.desirability).to(**tkwargs)\n ).sum(-1),\n min=eps,\n max=1 - eps,\n )\n - 1,\n )\n ],\n [1.0],\n idx + len(objective.desirability),\n )\n else:\n raise ValueError(f\"Objective {objective.__class__.__name__} not known.\")\n
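A minimal sketch (objective parameters are illustrative and the objective import path is an assumption): for a MaximizeSigmoidObjective the returned callable is negative wherever the output exceeds the turning point tp, which botorch treats as satisfied.
import torch
from bofire.data_models.objectives.api import MaximizeSigmoidObjective
from bofire.utils.torch_tools import constrained_objective2botorch

objective = MaximizeSigmoidObjective(w=1.0, tp=0.5, steepness=10.0)
callables, etas, next_idx = constrained_objective2botorch(idx=0, objective=objective)

Z = torch.tensor([[0.3], [0.7]])
print(callables[0](Z))  # tensor([ 0.2000, -0.2000]); values <= 0 count as satisfied
print(etas, next_idx)   # [0.1] 1 (eta is 1 / steepness)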
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_initial_conditions_generator","title":"get_initial_conditions_generator(strategy, transform_specs, ask_options=None, sequential=True)
","text":"Takes a strategy object and returns a callable which uses this strategy to return a generator callable which can be used in botorchs
gen_batch_initial_conditions` to generate samples.
Parameters:
Name Type Description Defaultstrategy
Strategy
Strategy that should be used to generate samples.
requiredtransform_specs
Dict
Dictionary indicating how the samples should be transformed.
requiredask_options
Dict
Dictionary of keyword arguments that are passed to the ask method of the strategy. Defaults to {}.
None
sequential
bool
If True, samples for every q-batch are generated independently from each other. If False, the n x q samples are generated at once.
True
Returns:
Type DescriptionCallable[[int, int, int], Tensor]
Callable that can be passed to batch_initial_conditions.
Source code inbofire/utils/torch_tools.py
def get_initial_conditions_generator(\n strategy: Strategy,\n transform_specs: Dict,\n ask_options: Optional[Dict] = None,\n sequential: bool = True,\n) -> Callable[[int, int, int], Tensor]:\n \"\"\"Takes a strategy object and returns a callable which uses this\n strategy to return a generator callable which can be used in botorch`s\n `gen_batch_initial_conditions` to generate samples.\n\n Args:\n strategy (Strategy): Strategy that should be used to generate samples.\n transform_specs (Dict): Dictionary indicating how the samples should be\n transformed.\n ask_options (Dict, optional): Dictionary of keyword arguments that are\n passed to the `ask` method of the strategy. Defaults to {}.\n sequential (bool, optional): If True, samples for every q-batch are\n generate indepenent from each other. If False, the `n x q` samples\n are generated at once.\n\n Returns:\n Callable[[int, int, int], Tensor]: Callable that can be passed to\n `batch_initial_conditions`.\n \"\"\"\n if ask_options is None:\n ask_options = {}\n\n def generator(n: int, q: int, seed: int) -> Tensor:\n if sequential:\n initial_conditions = []\n for _ in range(n):\n candidates = strategy.ask(q, **ask_options)\n # transform it\n transformed_candidates = strategy.domain.inputs.transform(\n candidates, transform_specs\n )\n # transform to tensor\n initial_conditions.append(\n torch.from_numpy(transformed_candidates.values).to(**tkwargs)\n )\n return torch.stack(initial_conditions, dim=0)\n else:\n candidates = strategy.ask(n * q, **ask_options)\n # transform it\n transformed_candidates = strategy.domain.inputs.transform(\n candidates, transform_specs\n )\n return (\n torch.from_numpy(transformed_candidates.values)\n .to(**tkwargs)\n .reshape(n, q, transformed_candidates.shape[1])\n )\n\n return generator\n
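A minimal sketch using a RandomStrategy on an all-continuous domain, following the data-model/mapping pattern described elsewhere in these docs; passing an empty transform_specs dict for purely continuous inputs is an assumption here:
import bofire.strategies.api as strategies
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.data_models.strategies.api import RandomStrategy
from bofire.utils.torch_tools import get_initial_conditions_generator

domain = Domain(
    inputs=[ContinuousInput(key=k, bounds=(0.0, 1.0)) for k in ["x1", "x2"]],
    outputs=[ContinuousOutput(key="y")],
)
strategy = strategies.map(RandomStrategy(domain=domain))
generator = get_initial_conditions_generator(strategy=strategy, transform_specs={})
print(generator(4, 2, 42).shape)  # torch.Size([4, 2, 2]): n=4 q-batches of size q=2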
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_interpoint_constraints","title":"get_interpoint_constraints(domain, n_candidates)
","text":"Converts interpoint equality constraints to linear equality constraints, that can be processed by botorch. For more information, see the docstring of optimize_acqf
in botorch (https://github.com/pytorch/botorch/blob/main/botorch/optim/optimize.py).
Parameters:
Name Type Description Defaultdomain
Domain
Optimization problem definition.
requiredn_candidates
int
Number of candidates that should be requested.
requiredReturns:
Type DescriptionList[Tuple[Tensor, Tensor, float]]
List of tuples, each tuple consists of a tensor with the feature indices, coefficients and a float for the rhs.
Source code inbofire/utils/torch_tools.py
def get_interpoint_constraints(\n domain: Domain, n_candidates: int\n) -> List[Tuple[Tensor, Tensor, float]]:\n \"\"\"Converts interpoint equality constraints to linear equality constraints,\n that can be processed by botorch. For more information, see the docstring\n of `optimize_acqf` in botorch\n (https://github.com/pytorch/botorch/blob/main/botorch/optim/optimize.py).\n\n Args:\n domain (Domain): Optimization problem definition.\n n_candidates (int): Number of candidates that should be requested.\n\n Returns:\n List[Tuple[Tensor, Tensor, float]]: List of tuples, each tuple consists\n of a tensor with the feature indices, coefficients and a float for the rhs.\n \"\"\"\n constraints = []\n for constraint in domain.constraints.get(InterpointEqualityConstraint):\n assert isinstance(constraint, InterpointEqualityConstraint)\n coefficients = torch.tensor([1.0, -1.0]).to(**tkwargs)\n feat_idx = domain.inputs.get_keys(Input).index(constraint.feature)\n feat = domain.inputs.get_by_key(constraint.feature)\n assert isinstance(feat, ContinuousInput)\n if feat.is_fixed():\n continue\n multiplicity = constraint.multiplicity or n_candidates\n for i in range(math.ceil(n_candidates / multiplicity)):\n all_indices = torch.arange(\n i * multiplicity, min((i + 1) * multiplicity, n_candidates)\n )\n for k in range(len(all_indices) - 1):\n indices = torch.tensor(\n [[all_indices[0], feat_idx], [all_indices[k + 1], feat_idx]],\n dtype=torch.int64,\n )\n constraints.append((indices, coefficients, 0.0))\n return constraints\n
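A minimal sketch (the domain is illustrative): with multiplicity=3 and n_candidates=6 there are two batches, each contributing two index tuples.
from bofire.data_models.constraints.api import InterpointEqualityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.torch_tools import get_interpoint_constraints

domain = Domain(
    inputs=[ContinuousInput(key=k, bounds=(0.0, 1.0)) for k in ["x1", "x2"]],
    outputs=[ContinuousOutput(key="y")],
    constraints=[InterpointEqualityConstraint(feature="x1", multiplicity=3)],
)
constraints = get_interpoint_constraints(domain, n_candidates=6)
print(len(constraints))  # 4; each tuple encodes x1[first in batch] - x1[k] = 0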
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_linear_constraints","title":"get_linear_constraints(domain, constraint, unit_scaled=False)
","text":"Converts linear constraints to the form required by BoTorch.
Parameters:
Name Type Description Defaultdomain
Domain
Optimization problem definition.
requiredconstraint
Union[Type[bofire.data_models.constraints.linear.LinearEqualityConstraint], Type[bofire.data_models.constraints.linear.LinearInequalityConstraint]]
Type of constraint that should be converted.
requiredunit_scaled
bool
If True, transforms constraints by assuming that the bound for the continuous features are [0,1]. Defaults to False.
False
Returns:
Type DescriptionList[Tuple[Tensor, Tensor, float]]
List of tuples, each tuple consists of a tensor with the feature indices, coefficients and a float for the rhs.
Source code inbofire/utils/torch_tools.py
def get_linear_constraints(\n domain: Domain,\n constraint: Union[Type[LinearEqualityConstraint], Type[LinearInequalityConstraint]],\n unit_scaled: bool = False,\n) -> List[Tuple[Tensor, Tensor, float]]:\n \"\"\"Converts linear constraints to the form required by BoTorch.\n\n Args:\n domain: Optimization problem definition.\n constraint: Type of constraint that should be converted.\n unit_scaled: If True, transforms constraints by assuming that the bound for the continuous features are [0,1]. Defaults to False.\n\n Returns:\n List[Tuple[Tensor, Tensor, float]]: List of tuples, each tuple consists of a tensor with the feature indices, coefficients and a float for the rhs.\n \"\"\"\n constraints = []\n for c in domain.constraints.get(constraint):\n indices = []\n coefficients = []\n lower = []\n upper = []\n rhs = 0.0\n for i, featkey in enumerate(c.features): # type: ignore\n idx = domain.inputs.get_keys(Input).index(featkey)\n feat = domain.inputs.get_by_key(featkey)\n if feat.is_fixed(): # type: ignore\n rhs -= feat.fixed_value()[0] * c.coefficients[i] # type: ignore\n else:\n lower.append(feat.lower_bound) # type: ignore\n upper.append(feat.upper_bound) # type: ignore\n indices.append(idx)\n coefficients.append(\n c.coefficients[i] # type: ignore\n ) # if unit_scaled == False else c_scaled.coefficients[i])\n if unit_scaled:\n lower = np.array(lower)\n upper = np.array(upper)\n s = upper - lower\n scaled_coefficients = s * np.array(coefficients)\n constraints.append(\n (\n torch.tensor(indices),\n -torch.tensor(scaled_coefficients).to(**tkwargs),\n -(rhs + c.rhs - np.sum(np.array(coefficients) * lower)), # type: ignore\n )\n )\n else:\n constraints.append(\n (\n torch.tensor(indices),\n -torch.tensor(coefficients).to(**tkwargs),\n -(rhs + c.rhs), # type: ignore\n )\n )\n return constraints\n
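A minimal sketch (the mixture domain is illustrative): as the source shows, the returned coefficients and right-hand side are negated relative to the BoFire constraint definition.
from bofire.data_models.constraints.api import LinearEqualityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.torch_tools import get_linear_constraints

domain = Domain(
    inputs=[ContinuousInput(key=k, bounds=(0.0, 1.0)) for k in ["x1", "x2", "x3"]],
    outputs=[ContinuousOutput(key="y")],
    constraints=[
        LinearEqualityConstraint(
            features=["x1", "x2", "x3"], coefficients=[1.0, 1.0, 1.0], rhs=1.0
        )
    ],
)
indices, coefficients, rhs = get_linear_constraints(domain, LinearEqualityConstraint)[0]
print(indices, coefficients, rhs)  # tensor([0, 1, 2]) tensor([-1., -1., -1.]) -1.0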
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_multiobjective_objective","title":"get_multiobjective_objective(outputs)
","text":"Returns
Parameters:
Name Type Description Defaultoutputs
Outputs
Output features whose objectives should be translated into the multi-objective callable.
requiredReturns:
Type DescriptionCallable[[Tensor], Tensor]
Callable that maps a tensor of posterior samples to a tensor of stacked objective values.
Source code inbofire/utils/torch_tools.py
def get_multiobjective_objective(\n outputs: Outputs,\n) -> Callable[[Tensor, Optional[Tensor]], Tensor]:\n \"\"\"Returns\n\n Args:\n outputs (Outputs): _description_\n\n Returns:\n Callable[[Tensor], Tensor]: _description_\n \"\"\"\n callables = [\n get_objective_callable(idx=i, objective=feat.objective) # type: ignore\n for i, feat in enumerate(outputs.get())\n if feat.objective is not None # type: ignore\n and isinstance(\n feat.objective, # type: ignore\n (MaximizeObjective, MinimizeObjective, CloseToTargetObjective),\n )\n ]\n\n def objective(samples: Tensor, X: Optional[Tensor] = None) -> Tensor:\n return torch.stack([c(samples, None) for c in callables], dim=-1)\n\n return objective\n
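A minimal sketch (outputs are illustrative; the objective import path is an assumption): the returned callable stacks one value per maximize/minimize/close-to-target objective along the last dimension.
import torch
from bofire.data_models.domain.api import Outputs
from bofire.data_models.features.api import ContinuousOutput
from bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective
from bofire.utils.torch_tools import get_multiobjective_objective

outputs = Outputs(
    features=[
        ContinuousOutput(key="y1", objective=MaximizeObjective(w=1.0)),
        ContinuousOutput(key="y2", objective=MinimizeObjective(w=1.0)),
    ]
)
objective = get_multiobjective_objective(outputs)
samples = torch.rand(16, 2)      # posterior samples, one column per output
print(objective(samples).shape)  # torch.Size([16, 2])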
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_nchoosek_constraints","title":"get_nchoosek_constraints(domain)
","text":"Transforms NChooseK constraints into a list of non-linear inequality constraint callables that can be parsed by pydantic. For this purpose the NChooseK constraint is continuously relaxed by countig the number of zeros in a candidate by a sum of narrow gaussians centered at zero.
Parameters:
Name Type Description Defaultdomain
Domain
Optimization problem definition.
requiredReturns:
Type DescriptionList[Callable[[Tensor], float]]
List of callables that can be used as nonlinear inequality constraints in botorch.
Source code inbofire/utils/torch_tools.py
def get_nchoosek_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"Transforms NChooseK constraints into a list of non-linear inequality constraint callables\n that can be parsed by pydantic. For this purpose the NChooseK constraint is continuously\n relaxed by countig the number of zeros in a candidate by a sum of narrow gaussians centered\n at zero.\n\n Args:\n domain (Domain): Optimization problem definition.\n\n Returns:\n List[Callable[[Tensor], float]]: List of callables that can be used\n as nonlinear equality constraints in botorch.\n \"\"\"\n\n def narrow_gaussian(x, ell=1e-3):\n return torch.exp(-0.5 * (x / ell) ** 2)\n\n def max_constraint(indices: Tensor, num_features: int, max_count: int):\n return lambda x: narrow_gaussian(x=x[..., indices]).sum(dim=-1) - (\n num_features - max_count\n )\n\n def min_constraint(indices: Tensor, num_features: int, min_count: int):\n return lambda x: -narrow_gaussian(x=x[..., indices]).sum(dim=-1) + (\n num_features - min_count\n )\n\n constraints = []\n # ignore none also valid for the start\n for c in domain.constraints.get(NChooseKConstraint):\n assert isinstance(c, NChooseKConstraint)\n indices = torch.tensor(\n [domain.inputs.get_keys(ContinuousInput).index(key) for key in c.features],\n dtype=torch.int64,\n )\n if c.max_count != len(c.features):\n constraints.append(\n max_constraint(\n indices=indices, num_features=len(c.features), max_count=c.max_count\n )\n )\n if c.min_count > 0:\n constraints.append(\n min_constraint(\n indices=indices, num_features=len(c.features), min_count=c.min_count\n )\n )\n return constraints\n
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_nonlinear_constraints","title":"get_nonlinear_constraints(domain)
","text":"Returns a list of callable functions that represent the nonlinear constraints for the given domain that can be processed by botorch.
Parameters:
Name Type Description Defaultdomain
Domain
The domain for which to generate the nonlinear constraints.
requiredReturns:
Type DescriptionList[Callable[[Tensor], float]]
A list of callable functions that take a tensor as input and return a float value representing the constraint evaluation.
Source code inbofire/utils/torch_tools.py
def get_nonlinear_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"\n Returns a list of callable functions that represent the nonlinear constraints\n for the given domain that can be processed by botorch.\n\n Parameters:\n domain (Domain): The domain for which to generate the nonlinear constraints.\n\n Returns:\n List[Callable[[Tensor], float]]: A list of callable functions that take a tensor\n as input and return a float value representing the constraint evaluation.\n \"\"\"\n return get_nchoosek_constraints(domain) + get_product_constraints(domain)\n
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_output_constraints","title":"get_output_constraints(outputs)
","text":"Method to translate output constraint objectives into a list of callables and list of etas for use in botorch.
Parameters:
Name Type Description Defaultoutputs
Outputs
Output feature object that should be processed.
requiredReturns:
Type DescriptionTuple[List[Callable[[Tensor], Tensor]], List[float]]
List of constraint callables, list of associated etas.
Source code inbofire/utils/torch_tools.py
def get_output_constraints(\n outputs: Outputs,\n) -> Tuple[List[Callable[[Tensor], Tensor]], List[float]]:\n \"\"\"Method to translate output constraint objectives into a list of\n callables and list of etas for use in botorch.\n\n Args:\n outputs (Outputs): Output feature object that should\n be processed.\n\n Returns:\n Tuple[List[Callable[[Tensor], Tensor]], List[float]]: List of constraint callables,\n list of associated etas.\n \"\"\"\n constraints = []\n etas = []\n idx = 0\n for feat in outputs.get():\n if isinstance(feat.objective, ConstrainedObjective): # type: ignore\n iconstraints, ietas, idx = constrained_objective2botorch(\n idx,\n objective=feat.objective, # type: ignore\n )\n constraints += iconstraints\n etas += ietas\n else:\n idx += 1\n return constraints, etas\n
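A minimal sketch (outputs are illustrative; the objective import path is an assumption): only the output carrying a ConstrainedObjective contributes a callable, and its eta is 1 / steepness.
from bofire.data_models.domain.api import Outputs
from bofire.data_models.features.api import ContinuousOutput
from bofire.data_models.objectives.api import MaximizeObjective, MinimizeSigmoidObjective
from bofire.utils.torch_tools import get_output_constraints

outputs = Outputs(
    features=[
        ContinuousOutput(key="y1", objective=MaximizeObjective(w=1.0)),
        ContinuousOutput(key="y2", objective=MinimizeSigmoidObjective(w=1.0, tp=0.5, steepness=10.0)),
    ]
)
constraints, etas = get_output_constraints(outputs)
print(len(constraints), etas)  # 1 [0.1]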
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_product_constraints","title":"get_product_constraints(domain)
","text":"Returns a list of nonlinear constraint functions that can be processed by botorch based on the given domain.
Parameters:
Name Type Description Defaultdomain
Domain
The domain object containing the constraints.
requiredReturns:
Type DescriptionList[Callable[[Tensor], float]]
A list of product constraint functions.
Source code inbofire/utils/torch_tools.py
def get_product_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"\n Returns a list of nonlinear constraint functions that can be processed by botorch\n based on the given domain.\n\n Args:\n domain (Domain): The domain object containing the constraints.\n\n Returns:\n List[Callable[[Tensor], float]]: A list of product constraint functions.\n\n \"\"\"\n\n def product_constraint(indices: Tensor, exponents: Tensor, rhs: float, sign: int):\n return lambda x: -1.0 * sign * (x[..., indices] ** exponents).prod(dim=-1) + rhs\n\n constraints = []\n for c in domain.constraints.get(ProductInequalityConstraint):\n assert isinstance(c, ProductInequalityConstraint)\n indices = torch.tensor(\n [domain.inputs.get_keys(ContinuousInput).index(key) for key in c.features],\n dtype=torch.int64,\n )\n constraints.append(\n product_constraint(indices, torch.tensor(c.exponents), c.rhs, c.sign)\n )\n return constraints\n
"},{"location":"userguide_surrogates/","title":"Surrogate models","text":"In Bayesian Optimization, information from previous experiments is taken into account to generate proposals for future experiments. This information is leveraged by creating a surrogate model for the black-box function that is to be optimized based on the available data. Naturally, experimental candidates for which the surrogate model makes a promising prediction (e.g., high predicted values of a quantity we want to maximize) should be chosen over ones for which this is not the case. However, since the available data might cover only a small part of the input space, the model is likely to only be able to make very uncertain predictions far away from the data. Therefore, the surrogate model should be able to express the degree to which the predictions are uncertain so that we can use this information - combining the prediction and the associated uncertainty - to select the settings for the next experimental iteration.
The acquisition function is the object that turns the predicted distribution (you can think of this as the prediction and the prediction uncertainty) into a single quantity representing how promising a candidate experimental point seems. This function determines if one rather wants to focus on exploitation, i.e., quickly approaching a close local optimum of the black-box function, or on exploration, i.e., exploring different regions of the input space first.
Therefore, three criteria typically determine whether a candidate is selected as an experimental proposal: the value predicted by the surrogate model, the uncertainty of that prediction, and the acquisition function.
"},{"location":"userguide_surrogates/#surrogate-model-options","title":"Surrogate model options","text":"BoFire offers the following classes of surrogate models.
Surrogate Optimization of When to use Type SingleTaskGPSurrogate a single objective with real-valued inputs Limited data and black-box function is smooth Gaussian process RandomForestSurrogate a single objective Rich data; black-box function does not have to be smooth sklearn random forest implementation MLP a single objective with real-valued inputs Rich data and black-box function is smooth Multi-layer perceptron MixedSingleTaskGPSurrogate a single objective with categorical and real-valued inputs Limited data and black-box function is smooth Gaussian process XGBoostSurrogate a single objective Rich data; black-box function does not have to be smooth xgboost implementation of gradient boosting trees TanimotoGP a single objective At least one input feature is a molecule represented as a fingerprint Gaussian process on a molecule space for which Tanimoto similarity determines the similarity between points. All of these are single-objective surrogate models. For optimization of multiple objectives at the same time, a suitable Strategy has to be chosen. Then for each objective a different surrogate model can be specified. By default the SingleTaskGPSurrogate is used.
Example:
surrogate_data_0 = SingleTaskGPSurrogate(\n inputs=domain.inputs,\n outputs=Outputs(features=[domain.outputs[0]]),\n)\nsurrogate_data_1 = XGBoostSurrogate(\n inputs=domain.inputs,\n outputs=Outputs(features=[domain.outputs[1]]),\n)\nqparego_data_model = QparegoStrategy(\n domain=domain,\n surrogate_specs=BotorchSurrogates(\n surrogates=[surrogate_data_0, surrogate_data_1]\n ),\n)\n
Note:
BoFire also offers the option to customize surrogate models. In particular, it is possible to customize the SingleTaskGPSurrogate in the following ways.
"},{"location":"userguide_surrogates/#kernel-customization","title":"Kernel customization","text":"Specify the Kernel:
Kernel Description Translation invariant Input variable type RBFKernel Based on Gaussian distribution Yes Continuous MaternKernel Based on Gamma function; allows setting a smoothness parameter Yes Continuous PolynomialKernel Based on dot-product of two vectors of input points No Continuous LinearKernel Equal to dot-product of two vectors of input points No Continuous TanimotoKernel Measures similarities between binary vectors using Tanimoto similarity Not applicable MolecularInput HammingDistanceKernel Similarity is defined by the Hamming distance which considers the number of equal entries between two vectors (e.g., in one-hot encoding) Not applicable Categorical
Note: - SingleTaskGPSurrogate with PolynomialKernel is equivalent to PolynomialSurrogate. - SingleTaskGPSurrogate with LinearKernel is equivalent to LinearSurrogate. - SingleTaskGPSurrogate with TanimotoKernel is equivalent to TanimotoGP. - One can combine two Kernels by using AdditiveKernel or MultiplicativeKernel.
Example:
surrogate_data_0 = SingleTaskGPSurrogate(\n inputs=domain.inputs,\n outputs=Outputs(features=[domain.outputs[0]]),\n kernel=PolynomialKernel(power=2)\n)\n
"},{"location":"userguide_surrogates/#noise-model-customization","title":"Noise model customization","text":"For experimental data subject to noise, one can specify the distribution of this noise. The options are:
Noise Model When to use NormalPrior Noise is Gaussian GammaPrior Noise has a Gamma distributionExample:
surrogate_data_0 = SingleTaskGPSurrogate(\n inputs=domain.inputs,\n outputs=Outputs(features=[domain.outputs[0]]),\n kernel=PolynomialKernel(power=2),\n noise_prior=NormalPrior(loc=0, scale=1)\n)\n
"}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":""},{"location":"#introduction","title":"Introduction","text":"BoFire is a framework to define and solve black-box optimization problems. These problems can arise in a number of closely related fields including experimental design, multi-objective optimization and active learning.
BoFire problem specifications are json serializable for use in RESTful APIs and are to a large extent agnostic to the specific methods and frameworks in which the problems are solved.
You can find code examples in the Getting Started section of this document, as well as fully worked-out examples of code usage in the /tutorials section of this repository!
"},{"location":"#experimental-design","title":"Experimental design","text":"In the context of experimental design BoFire allows to define a design space
\[ \mathbb{X} = x_1 \otimes x_2 \otimes \ldots \otimes x_D \]where the design parameters may take values depending on their type and domain, e.g.
and a set of equations defines additional experimental constraints, e.g.
In the context of multi-objective optimization, BoFire allows you to define a vector-valued optimization problem
\\[ \\min_{x \\in \\mathbb{X}} s(y(x)) \\]where
Since the objectives are in general conflicting, there is no point \\(x\\) that simultaneously optimizes all objectives. Instead the goal is to find the Pareto front of all optimal compromises.
A decision maker can then explore these compromises to get a deep understanding of the problem and make the best informed decision.
"},{"location":"#bayesian-optimization","title":"Bayesian optimization","text":"In the context of Bayesian optimization we want to simultaneously learn the unknown function \\(y(x)\\) (exploration), while focusing the experimental effort on promising regions (exploitation). This is done by using the experimental data to fit a probabilistic model \\(p(y|x, {data})\\) that estimates the distribution of possible outcomes for \\(y\\). An acquisition function \\(a\\) then formulates the desired trade-off between exploration and exploitation
\\[ \\min_{x \\in \\mathbb{X}} a(s(p_y(x))) \\]and the minimizer \\(x_\\mathrm{opt}\\) of this acquisition function determines the next experiment \\(y(x)\\) to run.
When there are multiple competing objectives, the task is again to find a suitable approximation of the Pareto front.
"},{"location":"#design-of-experiments","title":"Design of Experiments","text":"BoFire can be used to generate optimal experimental designs with respect to various optimality criteria like D-optimality, A-optimality or uniform space filling.
For this, the user specifies a design space and a model formula, then chooses an optimality criterion and the desired number of experiments in the design. The resulting optimization problem is then solved by IPOPT.
The doe subpackage also supports a wide range of constraints on the design space including linear and nonlinear equalities and inequalities as well as (limited) use of NChooseK constraints. The user can provide fixed experiments that will be treated as part of the design but remain fixed during the optimization process. While some of the optimization algorithms support non-continuous design variables, the doe subpackage only supports those that are continuous.
By default IPOPT uses the freely available linear solver MUMPS. For large models choosing a different linear solver (e.g. ma57 from Coin-HSL) can vastly reduce optimization time. A free academic license for Coin-HSL can be obtained here. Instructions on how to install additional linear solvers for IPOPT are given in the IPOPT documentation. For choosing a specific (HSL) linear solver in BoFire you can just pass the name of the solver to find_local_max_ipopt() with the linear_solver option together with the library's name in the option hsllib, e.g.
find_local_max_ipopt(domain, \"fully-quadratic\", ipopt_options={\"linear_solver\":\"ma57\", \"hsllib\":\"libcoinhsl.so\"})\n
"},{"location":"basic_examples/","title":"Basic Examples for the DoE Subpackage","text":"In\u00a0[11]: Copied! import numpy as np\nimport matplotlib.pyplot as plt\nfrom matplotlib.ticker import FormatStrFormatter\n\nfrom bofire.data_models.constraints.api import (\n NonlinearEqualityConstraint,\n NonlinearInequalityConstraint,\n LinearEqualityConstraint,\n LinearInequalityConstraint,\n InterpointEqualityConstraint,\n)\nfrom bofire.data_models.domain.api import Domain\nfrom bofire.data_models.features.api import ContinuousInput, ContinuousOutput\nfrom bofire.strategies.doe.design import find_local_max_ipopt\nimport numpy as np import matplotlib.pyplot as plt from matplotlib.ticker import FormatStrFormatter from bofire.data_models.constraints.api import ( NonlinearEqualityConstraint, NonlinearInequalityConstraint, LinearEqualityConstraint, LinearInequalityConstraint, InterpointEqualityConstraint, ) from bofire.data_models.domain.api import Domain from bofire.data_models.features.api import ContinuousInput, ContinuousOutput from bofire.strategies.doe.design import find_local_max_ipopt In\u00a0[12]: Copied!
domain = Domain(\n inputs = [\n ContinuousInput(key=\"x1\", bounds = (0,1)),\n ContinuousInput(key=\"x2\", bounds = (0.1, 1)),\n ContinuousInput(key=\"x3\", bounds = (0, 0.6))\n ],\n outputs = [ContinuousOutput(key=\"y\")],\n constraints = [\n LinearEqualityConstraint(features=[\"x1\",\"x2\",\"x3\"], coefficients=[1,1,1], rhs=1),\n LinearInequalityConstraint(features=[\"x1\",\"x2\"], coefficients=[5,4], rhs=3.9),\n LinearInequalityConstraint(features=[\"x1\",\"x2\"], coefficients=[-20,5], rhs=-3)\n ]\n)\n\nd_optimal_design = find_local_max_ipopt(domain, \"linear\", n_experiments=12, ipopt_options={\"disp\":0}).to_numpy().T\ndomain = Domain( inputs = [ ContinuousInput(key=\"x1\", bounds = (0,1)), ContinuousInput(key=\"x2\", bounds = (0.1, 1)), ContinuousInput(key=\"x3\", bounds = (0, 0.6)) ], outputs = [ContinuousOutput(key=\"y\")], constraints = [ LinearEqualityConstraint(features=[\"x1\",\"x2\",\"x3\"], coefficients=[1,1,1], rhs=1), LinearInequalityConstraint(features=[\"x1\",\"x2\"], coefficients=[5,4], rhs=3.9), LinearInequalityConstraint(features=[\"x1\",\"x2\"], coefficients=[-20,5], rhs=-3) ] ) d_optimal_design = find_local_max_ipopt(domain, \"linear\", n_experiments=12, ipopt_options={\"disp\":0}).to_numpy().T In\u00a0[13]: Copied!
fig = plt.figure(figsize=((10,10)))\nax = fig.add_subplot(111, projection='3d')\nax.view_init(45, 45)\nax.set_title(\"Linear model\")\nax.set_xlabel(\"$x_1$\")\nax.set_ylabel(\"$x_2$\")\nax.set_zlabel(\"$x_3$\")\nplt.rcParams[\"figure.figsize\"] = (10,8)\n\n#plot feasible polytope\nax.plot(\n xs=[7/10, 3/10, 1/5, 3/10, 7/10],\n ys=[1/10, 3/5, 1/5, 1/10, 1/10],\n zs=[1/5, 1/10, 3/5, 3/5, 1/5],\n linewidth=2\n)\n\n#plot D-optimal solutions\nax.scatter(\n xs=d_optimal_design[0],\n ys=d_optimal_design[1],\n zs=d_optimal_design[2],\n marker=\"o\",\n s=40,\n color=\"orange\",\n label=\"optimal_design solution, 12 points\"\n)\n\nplt.legend()\nfig = plt.figure(figsize=((10,10))) ax = fig.add_subplot(111, projection='3d') ax.view_init(45, 45) ax.set_title(\"Linear model\") ax.set_xlabel(\"$x_1$\") ax.set_ylabel(\"$x_2$\") ax.set_zlabel(\"$x_3$\") plt.rcParams[\"figure.figsize\"] = (10,8) #plot feasible polytope ax.plot( xs=[7/10, 3/10, 1/5, 3/10, 7/10], ys=[1/10, 3/5, 1/5, 1/10, 1/10], zs=[1/5, 1/10, 3/5, 3/5, 1/5], linewidth=2 ) #plot D-optimal solutions ax.scatter( xs=d_optimal_design[0], ys=d_optimal_design[1], zs=d_optimal_design[2], marker=\"o\", s=40, color=\"orange\", label=\"optimal_design solution, 12 points\" ) plt.legend() Out[13]:
<matplotlib.legend.Legend at 0x2920b6bd0>In\u00a0[14]: Copied!
d_optimal_design = find_local_max_ipopt(domain, \"x1 + x2 + x3 + {x1**2} + {x2**2} + {x3**2} + {x1**3} + {x2**3} + {x3**3} + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3\", n_experiments=12).to_numpy().T\n\nd_opt = np.array([\n [0.7, 0.3, 0.2, 0.3, 0.5902, 0.4098, 0.2702, 0.2279, 0.4118, 0.5738, 0.4211, 0.3360],\n [0.1, 0.6, 0.2, 0.1, 0.2373, 0.4628, 0.4808, 0.3117, 0.1, 0.1, 0.2911, 0.2264],\n [0.2, 0.1, 0.6, 0.6, 0.1725, 0.1274, 0.249, 0.4604, 0.4882, 0.3262, 0.2878, 0.4376],\n]) # values taken from paper\n\n\nfig = plt.figure(figsize=((10,10)))\nax = fig.add_subplot(111, projection='3d')\nax.set_title(\"cubic model\")\nax.view_init(45, 45)\nax.set_xlabel(\"$x_1$\")\nax.set_ylabel(\"$x_2$\")\nax.set_zlabel(\"$x_3$\")\nplt.rcParams[\"figure.figsize\"] = (10,8)\n\n#plot feasible polytope\nax.plot(\n xs=[7/10, 3/10, 1/5, 3/10, 7/10],\n ys=[1/10, 3/5, 1/5, 1/10, 1/10],\n zs=[1/5, 1/10, 3/5, 3/5, 1/5],\n linewidth=2\n)\n\n#plot D-optimal solution\nax.scatter(\n xs=d_opt[0],\n ys=d_opt[1],\n zs=d_opt[2],\n marker=\"o\",\n s=40,\n color=\"darkgreen\",\n label=\"D-optimal design, 12 points\"\n)\n\nax.scatter(\n xs=d_optimal_design[0],\n ys=d_optimal_design[1],\n zs=d_optimal_design[2],\n marker=\"o\",\n s=40,\n color=\"orange\",\n label=\"optimal_design solution, 12 points\"\n)\n\nplt.legend()\nd_optimal_design = find_local_max_ipopt(domain, \"x1 + x2 + x3 + {x1**2} + {x2**2} + {x3**2} + {x1**3} + {x2**3} + {x3**3} + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3\", n_experiments=12).to_numpy().T d_opt = np.array([ [0.7, 0.3, 0.2, 0.3, 0.5902, 0.4098, 0.2702, 0.2279, 0.4118, 0.5738, 0.4211, 0.3360], [0.1, 0.6, 0.2, 0.1, 0.2373, 0.4628, 0.4808, 0.3117, 0.1, 0.1, 0.2911, 0.2264], [0.2, 0.1, 0.6, 0.6, 0.1725, 0.1274, 0.249, 0.4604, 0.4882, 0.3262, 0.2878, 0.4376], ]) # values taken from paper fig = plt.figure(figsize=((10,10))) ax = fig.add_subplot(111, projection='3d') ax.set_title(\"cubic model\") ax.view_init(45, 45) ax.set_xlabel(\"$x_1$\") ax.set_ylabel(\"$x_2$\") ax.set_zlabel(\"$x_3$\") plt.rcParams[\"figure.figsize\"] = (10,8) #plot feasible polytope ax.plot( xs=[7/10, 3/10, 1/5, 3/10, 7/10], ys=[1/10, 3/5, 1/5, 1/10, 1/10], zs=[1/5, 1/10, 3/5, 3/5, 1/5], linewidth=2 ) #plot D-optimal solution ax.scatter( xs=d_opt[0], ys=d_opt[1], zs=d_opt[2], marker=\"o\", s=40, color=\"darkgreen\", label=\"D-optimal design, 12 points\" ) ax.scatter( xs=d_optimal_design[0], ys=d_optimal_design[1], zs=d_optimal_design[2], marker=\"o\", s=40, color=\"orange\", label=\"optimal_design solution, 12 points\" ) plt.legend()
/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:668: UserWarning: The minimum number of experiments is 17, but the current setting is n_experiments=12.\n warnings.warn(\nOut[14]:
<matplotlib.legend.Legend at 0x29200b010>In\u00a0[15]: Copied!
def plot_results_3d(result, surface_func):\n u, v = np.mgrid[0 : 2 * np.pi : 100j, 0 : np.pi : 80j]\n X = np.cos(u) * np.sin(v)\n Y = np.sin(u) * np.sin(v)\n Z = surface_func(X, Y)\n\n fig = plt.figure(figsize=(8, 8))\n ax = fig.add_subplot(111, projection=\"3d\")\n ax.plot_surface(X, Y, Z, alpha=0.3)\n ax.scatter(\n xs=result[\"x1\"],\n ys=result[\"x2\"],\n zs=result[\"x3\"],\n marker=\"o\",\n s=40,\n color=\"red\",\n )\n ax.set(xlabel=\"x1\", ylabel=\"x2\", zlabel=\"x3\")\n ax.xaxis.set_major_formatter(FormatStrFormatter('%.2f'))\n ax.yaxis.set_major_formatter(FormatStrFormatter('%.2f'))\ndef plot_results_3d(result, surface_func): u, v = np.mgrid[0 : 2 * np.pi : 100j, 0 : np.pi : 80j] X = np.cos(u) * np.sin(v) Y = np.sin(u) * np.sin(v) Z = surface_func(X, Y) fig = plt.figure(figsize=(8, 8)) ax = fig.add_subplot(111, projection=\"3d\") ax.plot_surface(X, Y, Z, alpha=0.3) ax.scatter( xs=result[\"x1\"], ys=result[\"x2\"], zs=result[\"x3\"], marker=\"o\", s=40, color=\"red\", ) ax.set(xlabel=\"x1\", ylabel=\"x2\", zlabel=\"x3\") ax.xaxis.set_major_formatter(FormatStrFormatter('%.2f')) ax.yaxis.set_major_formatter(FormatStrFormatter('%.2f')) In\u00a0[16]: Copied!
domain = Domain(\n inputs=[\n ContinuousInput(key=\"x1\", bounds = (-1,1)),\n ContinuousInput(key=\"x2\", bounds = (-1,1)),\n ContinuousInput(key=\"x3\", bounds = (0,1))],\n outputs=[ContinuousOutput(key=\"y\")],\n constraints=[NonlinearInequalityConstraint(expression=\"(x1**2 + x2**2)**0.5 - x3\", features=[\"x1\",\"x2\",\"x3\"])],\n)\n\nresult = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100, \"disp\":0})\nresult.round(3)\nplot_results_3d(result, surface_func=lambda x1, x2: np.sqrt(x1**2 + x2**2))\ndomain = Domain( inputs=[ ContinuousInput(key=\"x1\", bounds = (-1,1)), ContinuousInput(key=\"x2\", bounds = (-1,1)), ContinuousInput(key=\"x3\", bounds = (0,1))], outputs=[ContinuousOutput(key=\"y\")], constraints=[NonlinearInequalityConstraint(expression=\"(x1**2 + x2**2)**0.5 - x3\", features=[\"x1\",\"x2\",\"x3\"])], ) result = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100, \"disp\":0}) result.round(3) plot_results_3d(result, surface_func=lambda x1, x2: np.sqrt(x1**2 + x2**2))
/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:408: UserWarning: Nonlinear constraints were detected. Not all features and checks are supported for this type of constraints. Using them can lead to unexpected behaviour. Please make sure to provide jacobians for nonlinear constraints.\n warnings.warn(\n/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:440: UserWarning: Sampling failed. Falling back to uniform sampling on input domain. Providing a custom sampling strategy compatible with the problem can possibly improve performance.\n warnings.warn(\n
And the same for a design space limited by an elliptical cone $x_1^2 + x_2^2 - x_3 \\leq 0$.
In\u00a0[17]: Copied!domain = Domain(\n inputs=[\n ContinuousInput(key=\"x1\", bounds = (-1,1)),\n ContinuousInput(key=\"x2\", bounds = (-1,1)),\n ContinuousInput(key=\"x3\", bounds = (0,1))],\n outputs=[ContinuousOutput(key=\"y\")],\n constraints=[NonlinearInequalityConstraint(expression=\"x1**2 + x2**2 - x3\", features=[\"x1\",\"x2\",\"x3\"])],\n)\n\nresult = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100})\nresult.round(3)\nplot_results_3d(result, surface_func=lambda x1, x2: x1**2 + x2**2)\ndomain = Domain( inputs=[ ContinuousInput(key=\"x1\", bounds = (-1,1)), ContinuousInput(key=\"x2\", bounds = (-1,1)), ContinuousInput(key=\"x3\", bounds = (0,1))], outputs=[ContinuousOutput(key=\"y\")], constraints=[NonlinearInequalityConstraint(expression=\"x1**2 + x2**2 - x3\", features=[\"x1\",\"x2\",\"x3\"])], ) result = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100}) result.round(3) plot_results_3d(result, surface_func=lambda x1, x2: x1**2 + x2**2)
/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:408: UserWarning: Nonlinear constraints were detected. Not all features and checks are supported for this type of constraints. Using them can lead to unexpected behaviour. Please make sure to provide jacobians for nonlinear constraints.\n warnings.warn(\n/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:440: UserWarning: Sampling failed. Falling back to uniform sampling on input domain. Providing a custom sampling strategy compatible with the problem can possibly improve performance.\n warnings.warn(\nIn\u00a0[18]: Copied!
domain = Domain(\n inputs=[\n ContinuousInput(key=\"x1\", bounds = (-1,1)),\n ContinuousInput(key=\"x2\", bounds = (-1,1)),\n ContinuousInput(key=\"x3\", bounds = (0,1))],\n outputs=[ContinuousOutput(key=\"y\")],\n constraints=[NonlinearEqualityConstraint(expression=\"(x1**2 + x2**2)**0.5 - x3\", features=[\"x1\",\"x2\",\"x3\"])],\n)\n\nresult = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100})\nresult.round(3)\nplot_results_3d(result, surface_func=lambda x1, x2: np.sqrt(x1**2 + x2**2))\ndomain = Domain( inputs=[ ContinuousInput(key=\"x1\", bounds = (-1,1)), ContinuousInput(key=\"x2\", bounds = (-1,1)), ContinuousInput(key=\"x3\", bounds = (0,1))], outputs=[ContinuousOutput(key=\"y\")], constraints=[NonlinearEqualityConstraint(expression=\"(x1**2 + x2**2)**0.5 - x3\", features=[\"x1\",\"x2\",\"x3\"])], ) result = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100}) result.round(3) plot_results_3d(result, surface_func=lambda x1, x2: np.sqrt(x1**2 + x2**2))
/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:408: UserWarning: Nonlinear constraints were detected. Not all features and checks are supported for this type of constraints. Using them can lead to unexpected behaviour. Please make sure to provide jacobians for nonlinear constraints.\n warnings.warn(\n/Users/aaron/Documents/bofire/bofire/strategies/doe/design.py:440: UserWarning: Sampling failed. Falling back to uniform sampling on input domain. Providing a custom sampling strategy compatible with the problem can possibly improve performance.\n warnings.warn(\nIn\u00a0[19]: Copied!
domain = Domain(\n inputs=[\n ContinuousInput(key=\"x1\", bounds = (0,1)),\n ContinuousInput(key=\"x2\", bounds = (0,1)),\n ContinuousInput(key=\"x3\", bounds = (0,1))],\n outputs=[ContinuousOutput(key=\"y\")],\n constraints=[InterpointEqualityConstraint(feature=\"x1\", multiplicity=3)]\n)\n\nresult = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100}, n_experiments=12)\nresult.round(3)\ndomain = Domain( inputs=[ ContinuousInput(key=\"x1\", bounds = (0,1)), ContinuousInput(key=\"x2\", bounds = (0,1)), ContinuousInput(key=\"x3\", bounds = (0,1))], outputs=[ContinuousOutput(key=\"y\")], constraints=[InterpointEqualityConstraint(feature=\"x1\", multiplicity=3)] ) result = find_local_max_ipopt(domain, \"linear\", ipopt_options={\"maxiter\": 100}, n_experiments=12) result.round(3) Out[19]: x1 x2 x3 exp0 1.0 1.0 1.0 exp1 1.0 1.0 1.0 exp2 1.0 -0.0 -0.0 exp3 -0.0 1.0 1.0 exp4 -0.0 -0.0 -0.0 exp5 -0.0 -0.0 -0.0 exp6 -0.0 1.0 -0.0 exp7 -0.0 -0.0 1.0 exp8 -0.0 -0.0 1.0 exp9 -0.0 -0.0 1.0 exp10 -0.0 1.0 -0.0"},{"location":"basic_examples/#basic-examples-for-the-doe-subpackage","title":"Basic Examples for the DoE Subpackage\u00b6","text":"
The following example has been taken from the paper \"The construction of D- and I-optimal designs for mixture experiments with linear constraints on the components\" by R. Coetzer and L. M. Haines.
"},{"location":"basic_examples/#linear-model","title":"linear model\u00b6","text":""},{"location":"basic_examples/#cubic-model","title":"cubic model\u00b6","text":""},{"location":"basic_examples/#nonlinear-constraints","title":"Nonlinear Constraints\u00b6","text":"IPOPT also supports nonlinear constraints. This notebook shows examples of design optimizations with nonlinear constraints.
"},{"location":"basic_examples/#example-1-design-inside-a-cone-nonlinear-inequality","title":"Example 1: Design inside a cone / nonlinear inequality\u00b6","text":"In the following example we have three design variables. We impose the constraint of all experiments to be contained in the interior of a cone, which corresponds the nonlinear inequality constraint $\\sqrt{x_1^2 + x_2^2} - x_3 \\leq 0$. The optimization is done for a linear model and places the points on the surface of the cone so as to maximize the between them
"},{"location":"basic_examples/#example-2-design-on-the-surface-of-a-cone-nonlinear-equality","title":"Example 2: Design on the surface of a cone / nonlinear equality\u00b6","text":"We can also limit the design space to the surface of a cone, defined by the equality constraint $\\sqrt{x_1^2 + x_2^2} - x_3 = 0$
Note that due to missing sampling methods in opti, the initial points provided to IPOPT don't satisfy the constraints.
"},{"location":"basic_examples/#example-3-batch-constraints","title":"Example 3: Batch constraints\u00b6","text":"Batch constraints can be used to create designs where each set of multiplicity
subsequent experiments has the same value for a certain feature. In the following example we fix the value of the decision variable x1
inside each batch of size 3.
Data models in BoFire hold static data of an optimization problem. These are input and output features as well as constraints making up the domain. They further include possible optimization objectives, acquisition functions, and kernels.
All data models in bofire.data_models
are specified as pydantic models and inherit from bofire.data_models.base.BaseModel
. These data models can be (de)serialized via .dict()
and .json()
(provided by pydantic). A json schema of each data model can be obtained using .schema()
.
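For instance, a quick sketch (pydantic v1 API, consistent with the parse_obj_as usage below):

```python
from bofire.data_models.features.api import ContinuousInput

# .schema() is provided by pydantic and can be called on the class
json_schema = ContinuousInput.schema()
print(sorted(json_schema["properties"]))
```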
For surrogates and strategies, all functional parts are located in bofire.surrogates
and bofire.strategies
. These functionalities include the ask
and tell
as well as fit
and predict
methods. All class attributes (used by these methods) are also removed from the data models. Each functional entity is initialized using the corresponding data model. As an example, consider the following data model of a RandomStrategy
:
import bofire.data_models.domain.api as dm_domain\nimport bofire.data_models.features.api as dm_features\nimport bofire.data_models.strategies.api as dm_strategies\n\nin1 = dm_features.ContinuousInput(key=\"in1\", bounds=(0.0,1.0))\nin2 = dm_features.ContinuousInput(key=\"in2\", bounds=(0.0,2.0))\nin3 = dm_features.ContinuousInput(key=\"in3\", bounds=(0.0,3.0))\n\nout1 = dm_features.ContinuousOutput(key=\"out1\")\n\ninputs = dm_domain.Inputs(features=[in1, in2, in3])\noutputs = dm_domain.Outputs(features=[out1])\nconstraints = dm_domain.Constraints()\n\ndomain = dm_domain.Domain(\n inputs=inputs,\n outputs=outputs,\n constraints=constraints,\n)\n\ndata_model = dm_strategies.RandomStrategy(domain=domain)\n
Such a data model can be (de)serialized as follows:
import json\nfrom pydantic import parse_obj_as\nfrom bofire.data_models.strategies.api import AnyStrategy\n\nserialized = data_model.json()\ndata = json.loads(serialized)\n# alternative: data = data_model.dict()\ndata_model_ = parse_obj_as(AnyStrategy, data)\nassert data_model_ == data_model\n
Using this data model of a strategy, we can create an instance of a (functional) strategy:
import bofire.strategies.api as strategies\nstrategy = strategies.RandomStrategy(data_model=data_model)\n
As each strategy data model should be mapped to a specific (functional) strategy, we provide such a mapping:
strategy = strategies.map(data_model)\n
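Continuing from the snippet above, the mapped strategy can directly propose candidates:

```python
# ask for two random candidate experiments from the domain
candidates = strategy.ask(candidate_count=2)
print(candidates)  # a pandas DataFrame with one row per proposed experiment
```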
"},{"location":"design_with_explicit_formula/","title":"Design with explicit Formula","text":"In\u00a0[1]: Copied! from bofire.data_models.api import Domain, Inputs\nfrom bofire.data_models.features.api import ContinuousInput\nfrom bofire.strategies.doe.design import find_local_max_ipopt\nfrom formulaic import Formula\nfrom sklearn.preprocessing import MinMaxScaler\nimport itertools\nimport pandas as pd\nfrom bofire.utils.doe import get_confounding_matrix\nfrom bofire.data_models.api import Domain, Inputs from bofire.data_models.features.api import ContinuousInput from bofire.strategies.doe.design import find_local_max_ipopt from formulaic import Formula from sklearn.preprocessing import MinMaxScaler import itertools import pandas as pd from bofire.utils.doe import get_confounding_matrix
In\u00a0[2]: Copied!
input_features=Inputs(\n features=[\n ContinuousInput(key=\"a\", bounds = (0,5)),\n ContinuousInput(key=\"b\", bounds= (40, 800)),\n ContinuousInput(key=\"c\", bounds= (80,180)),\n ContinuousInput(key=\"d\", bounds = (200,800)),\n ] \n )\ndomain = Domain(inputs=input_features)\ninput_features=Inputs( features=[ ContinuousInput(key=\"a\", bounds = (0,5)), ContinuousInput(key=\"b\", bounds= (40, 800)), ContinuousInput(key=\"c\", bounds= (80,180)), ContinuousInput(key=\"d\", bounds = (200,800)), ] ) domain = Domain(inputs=input_features) In\u00a0[3]: Copied!
model_type = Formula(\"a + {a**2} + b + c + d + a:b + a:c + a:d + b:c + b:d + c:d\")\nmodel_type\nmodel_type = Formula(\"a + {a**2} + b + c + d + a:b + a:c + a:d + b:c + b:d + c:d\") model_type Out[3]:
1 + a + a**2 + b + c + d + a:b + a:c + a:d + b:c + b:d + c:dIn\u00a0[4]: Copied!
design = find_local_max_ipopt(domain=domain, model_type=model_type, n_experiments=17)\ndesign\ndesign = find_local_max_ipopt(domain=domain, model_type=model_type, n_experiments=17) design
\n******************************************************************************\nThis program contains Ipopt, a library for large-scale nonlinear optimization.\n Ipopt is released as open source code under the Eclipse Public License (EPL).\n For more information visit https://github.com/coin-or/Ipopt\n******************************************************************************\n\nOut[4]: a b c d exp0 5.000000e+00 40.000000 180.000002 199.999998 exp1 2.500000e+00 800.000008 79.999999 800.000008 exp2 -9.972222e-09 800.000008 180.000002 199.999998 exp3 5.000000e+00 800.000008 180.000002 800.000008 exp4 -9.975610e-09 40.000000 180.000002 199.999998 exp5 -9.975610e-09 800.000008 180.000002 800.000008 exp6 2.500000e+00 800.000008 180.000002 199.999998 exp7 5.000000e+00 40.000000 79.999999 800.000008 exp8 5.000000e+00 800.000008 79.999999 199.999998 exp9 -9.750000e-09 40.000000 79.999999 199.999998 exp10 -9.975610e-09 800.000008 79.999999 199.999998 exp11 -9.975610e-09 40.000000 79.999999 800.000008 exp12 5.000000e+00 800.000008 79.999999 800.000008 exp13 2.500000e+00 40.000000 180.000002 800.000008 exp14 5.000000e+00 40.000000 79.999999 199.999998 exp15 -9.972222e-09 800.000008 79.999999 800.000008 exp16 5.000000e+00 800.000008 180.000002 199.999998 In\u00a0[6]: Copied!
import matplotlib\nimport seaborn as sns\nimport matplotlib.pyplot as plt\n\nmatplotlib.rcParams[\"figure.dpi\"] = 120\n\nm = get_confounding_matrix(domain.inputs, design=design, interactions=[2,3], powers=[2])\n\nsns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\")\nplt.show()\nimport matplotlib import seaborn as sns import matplotlib.pyplot as plt matplotlib.rcParams[\"figure.dpi\"] = 120 m = get_confounding_matrix(domain.inputs, design=design, interactions=[2,3], powers=[2]) sns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\") plt.show() In\u00a0[\u00a0]: Copied!
\n"},{"location":"design_with_explicit_formula/#design-with-explicit-formula","title":"Design with explicit Formula\u00b6","text":"
This tutorial notebook shows how to set up a D-optimal design with BoFire while providing an explicit formula and not just one of the four available keywords linear
, linear-and-interaction
, linear-and-quadratic
, fully-quadratic
.
Make sure that cyipopt
is installed. The recommended way is to install it via conda: conda install -c conda-forge cyipopt
.
This is a collection of code examples to allow for an easy exploration of the functionalities that BoFire offers.
"},{"location":"examples/#doe","title":"DoE","text":"import matplotlib.pyplot as plt\nimport pandas as pd\nimport seaborn as sns\n\nimport bofire.strategies.api as strategies\nfrom bofire.data_models.domain.api import Domain\nfrom bofire.data_models.features.api import ContinuousInput\nfrom bofire.data_models.strategies.api import FractionalFactorialStrategy\nfrom bofire.utils.doe import get_confounding_matrix, get_generator, get_alias_structure\n\n\ndef plot_design(design: pd.DataFrame):\n # we do a plot with three subplots in one row in which the three degrees of freedom (temperature, time and ph) are plotted\n _, axs = plt.subplots(1, 3, figsize=(15, 5))\n axs[0].scatter(design['temperature'], design['time'])\n axs[0].set_xlabel('Temperature')\n axs[0].set_ylabel('Time')\n axs[1].scatter(design['temperature'], design['ph'])\n axs[1].set_xlabel('Temperature')\n axs[1].set_ylabel('pH')\n axs[2].scatter(design['time'], design['ph'])\n axs[2].set_xlabel('Time')\n axs[2].set_ylabel('pH')\n plt.show()\nimport matplotlib.pyplot as plt import pandas as pd import seaborn as sns import bofire.strategies.api as strategies from bofire.data_models.domain.api import Domain from bofire.data_models.features.api import ContinuousInput from bofire.data_models.strategies.api import FractionalFactorialStrategy from bofire.utils.doe import get_confounding_matrix, get_generator, get_alias_structure def plot_design(design: pd.DataFrame): # we do a plot with three subplots in one row in which the three degrees of freedom (temperature, time and ph) are plotted _, axs = plt.subplots(1, 3, figsize=(15, 5)) axs[0].scatter(design['temperature'], design['time']) axs[0].set_xlabel('Temperature') axs[0].set_ylabel('Time') axs[1].scatter(design['temperature'], design['ph']) axs[1].set_xlabel('Temperature') axs[1].set_ylabel('pH') axs[2].scatter(design['time'], design['ph']) axs[2].set_xlabel('Time') axs[2].set_ylabel('pH') plt.show() In\u00a0[2]: Copied!
domain = Domain(\n inputs=[\n ContinuousInput(key=\"temperature\", bounds=(20,80)),\n ContinuousInput(key=\"time\", bounds=(60,120)),\n ContinuousInput(key=\"ph\", bounds=(7,13)),\n ],\n)\ndomain = Domain( inputs=[ ContinuousInput(key=\"temperature\", bounds=(20,80)), ContinuousInput(key=\"time\", bounds=(60,120)), ContinuousInput(key=\"ph\", bounds=(7,13)), ], ) In\u00a0[3]: Copied!
strategy_data = FractionalFactorialStrategy(\n domain=domain,\n n_center=1, # number of center points\n n_repetitions=1, # number of repetitions, we do only one round here\n)\nstrategy = strategies.map(strategy_data)\ndesign = strategy.ask()\ndisplay(design)\n\nplot_design(design=design)\nstrategy_data = FractionalFactorialStrategy( domain=domain, n_center=1, # number of center points n_repetitions=1, # number of repetitions, we do only one round here ) strategy = strategies.map(strategy_data) design = strategy.ask() display(design) plot_design(design=design) ph temperature time 0 7.0 20.0 60.0 1 7.0 20.0 120.0 2 7.0 80.0 60.0 3 7.0 80.0 120.0 4 13.0 20.0 60.0 5 13.0 20.0 120.0 6 13.0 80.0 60.0 7 13.0 80.0 120.0 8 10.0 50.0 90.0
The confounding structure is shown below; as expected for a full factorial design, no confounding is present.
In\u00a0[4]: Copied!m = get_confounding_matrix(domain.inputs, design=design, interactions=[2])\n\nsns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\")\nplt.show()\nm = get_confounding_matrix(domain.inputs, design=design, interactions=[2]) sns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\") plt.show()
Here a fractional factorial design of the form $2^{3-1}$ is set up by specifying the number of generators (here 1). In comparison to the full factorial design with 9 candidates, it features only 5 experiments.
In\u00a0[5]: Copied!strategy_data = FractionalFactorialStrategy(\n domain=domain,\n n_center=1, # number of center points\n n_repetitions=1, # number of repetitions, we do only one round here\n n_generators=1, # number of generators, ie number of reducing factors\n)\nstrategy = strategies.map(strategy_data)\ndesign = strategy.ask()\ndisplay(design)\nstrategy_data = FractionalFactorialStrategy( domain=domain, n_center=1, # number of center points n_repetitions=1, # number of repetitions, we do only one round here n_generators=1, # number of generators, ie number of reducing factors ) strategy = strategies.map(strategy_data) design = strategy.ask() display(design) ph temperature time 0 7.0 20.0 120.0 1 7.0 80.0 60.0 2 13.0 20.0 60.0 3 13.0 80.0 120.0 4 10.0 50.0 90.0
The generator string is automatically generated by making use of the method get_generator
and specifying the total number of factors (here 3) and the number of generators (here 1).
get_generator(n_factors=3, n_generators=1)\nget_generator(n_factors=3, n_generators=1) Out[7]:
'a b ab'
As expected for a type III design, the main effects are confounded with the two-factor interactions:
In\u00a0[8]: Copied!m = get_confounding_matrix(domain.inputs, design=design, interactions=[2])\n\nsns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\")\nplt.show()\nm = get_confounding_matrix(domain.inputs, design=design, interactions=[2]) sns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\") plt.show()
This can also be expressed by the so-called alias structure, which can be calculated as follows:
In\u00a0[12]: Copied!get_alias_structure(\"a b ab\")\nget_alias_structure(\"a b ab\") Out[12]:
['a = bc', 'b = ac', 'c = ab', 'I = abc']
Here again a fractional factorial design of the form $2^{3-1}$ is set up by providing the complete generator string of the form a b -ab
explicitly to the strategy.
strategy_data = FractionalFactorialStrategy(\n domain=domain,\n n_center=1, # number of center points\n n_repetitions=1, # number of repetitions, we do only one round here\n generator = \"a b -ab\" # the exact generator\n)\nstrategy = strategies.map(strategy_data)\ndesign = strategy.ask()\ndisplay(design)\nstrategy_data = FractionalFactorialStrategy( domain=domain, n_center=1, # number of center points n_repetitions=1, # number of repetitions, we do only one round here generator = \"a b -ab\" # the exact generator ) strategy = strategies.map(strategy_data) design = strategy.ask() display(design) ph temperature time 0 7.0 20.0 60.0 1 7.0 80.0 120.0 2 13.0 20.0 120.0 3 13.0 80.0 60.0 4 10.0 50.0 90.0
The last two designs differ only in the last feature time
, since the generator strings are different: in the first design time=ph x temperature holds, whereas in the second time=-ph x temperature holds
, which is also reflected in the confounding structure.
m = get_confounding_matrix(domain.inputs, design=design, interactions=[2])\n\nsns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\")\nplt.show()\nm = get_confounding_matrix(domain.inputs, design=design, interactions=[2]) sns.heatmap(m, annot=True, annot_kws={\"fontsize\":7},fmt=\"2.1f\") plt.show()"},{"location":"fractional_factorial/#full-and-fractional-factorial-designs","title":"Full and Fractional Factorial Designs\u00b6","text":"
BoFire can be used to set up full (two-level) and fractional factorial designs (https://en.wikipedia.org/wiki/Fractional_factorial_design). This tutorial notebook shows how.
"},{"location":"fractional_factorial/#imports-and-helper-functions","title":"Imports and helper functions\u00b6","text":""},{"location":"fractional_factorial/#setup-the-problem-domain","title":"Setup the problem domain\u00b6","text":"The designs are generated for a simple three dimensional problem comprised of three continuous factors/features.
"},{"location":"fractional_factorial/#setup-a-full-factorial-design","title":"Setup a full factorial design\u00b6","text":"Here we setup a full two-level factorial design including a center point and plot it.
"},{"location":"fractional_factorial/#setup-a-fractional-factorial-design","title":"Setup a fractional factorial design\u00b6","text":""},{"location":"getting_started/","title":"Getting started","text":"In\u00a0[1]: Copied!from bofire.data_models.features.api import ContinuousInput, DiscreteInput, CategoricalInput, CategoricalDescriptorInput\n\nx1 = ContinuousInput(key=\"x1\", bounds=(0,1))\nx2 = ContinuousInput(key=\"x2\", bounds=(0,1))\nx3 = ContinuousInput(key=\"x3\", bounds=(0,1))\nx4 = DiscreteInput(key=\"x4\", values=[1, 2, 5, 7.5])\nx5 = CategoricalInput(key=\"x5\", categories=[\"A\", \"B\", \"C\"], allowed=[True,True,False])\nx6 = CategoricalDescriptorInput(key=\"x6\", categories=[\"c1\", \"c2\", \"c3\"], descriptors=[\"d1\", \"d2\"], values = [[1,2],[2,5],[1,7]])\nfrom bofire.data_models.features.api import ContinuousInput, DiscreteInput, CategoricalInput, CategoricalDescriptorInput x1 = ContinuousInput(key=\"x1\", bounds=(0,1)) x2 = ContinuousInput(key=\"x2\", bounds=(0,1)) x3 = ContinuousInput(key=\"x3\", bounds=(0,1)) x4 = DiscreteInput(key=\"x4\", values=[1, 2, 5, 7.5]) x5 = CategoricalInput(key=\"x5\", categories=[\"A\", \"B\", \"C\"], allowed=[True,True,False]) x6 = CategoricalDescriptorInput(key=\"x6\", categories=[\"c1\", \"c2\", \"c3\"], descriptors=[\"d1\", \"d2\"], values = [[1,2],[2,5],[1,7]])
As output features, currently only continuous output features are supported. Each output feature should have an objective, which can be a minimize or maximize objective. Furthermore, we can define weights between 0 and 1 in case the objectives should not be weighted equally.
In\u00a0[2]: Copied!from bofire.data_models.features.api import ContinuousOutput\nfrom bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective\n\nobjective1 = MaximizeObjective(\n w=1.0, \n bounds= [0.0,1.0],\n)\ny1 = ContinuousOutput(key=\"y1\", objective=objective1)\n\nobjective2 = MinimizeObjective(\n w=1.0\n)\ny2 = ContinuousOutput(key=\"y2\", objective=objective2)\nfrom bofire.data_models.features.api import ContinuousOutput from bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective objective1 = MaximizeObjective( w=1.0, bounds= [0.0,1.0], ) y1 = ContinuousOutput(key=\"y1\", objective=objective1) objective2 = MinimizeObjective( w=1.0 ) y2 = ContinuousOutput(key=\"y2\", objective=objective2)
In- and output features are collected in respective feature lists.
In\u00a0[3]: Copied!from bofire.data_models.domain.api import Inputs, Outputs\n\ninput_features = Inputs(features = [x1, x2, x3, x4, x5, x6])\noutput_features = Outputs(features=[y1, y2])\nfrom bofire.data_models.domain.api import Inputs, Outputs input_features = Inputs(features = [x1, x2, x3, x4, x5, x6]) output_features = Outputs(features=[y1, y2])
A summary of the features can be obtained by the method get_reps_df
:
input_features.get_reps_df()\ninput_features.get_reps_df() Out[23]: Type Description x1 ContinuousInput [0.0,1.0] x2 ContinuousInput [0.0,1.0] x3 ContinuousInput [0.0,1.0] x4 DiscreteInput type='DiscreteInput' key='x4' unit=None values... x6 CategoricalDescriptorInput 3 categories x5 CategoricalInput 3 categories In\u00a0[24]: Copied!
output_features.get_reps_df()\noutput_features.get_reps_df() Out[24]: Type Description y1 ContinuousOutput ContinuousOutputFeature y2 ContinuousOutput ContinuousOutputFeature y3 ContinuousOutput ContinuousOutputFeature
Individual features can be retrieved by name.
In\u00a0[4]: Copied!x5 = input_features.get_by_key('x5')\nx5\nx5 = input_features.get_by_key('x5') x5 Out[4]:
CategoricalInput(type='CategoricalInput', key='x5', categories=['A', 'B', 'C'], allowed=[True, True, False])
This is also possible with a list of feature names.
In\u00a0[5]: Copied!input_features.get_by_keys(['x5', 'x2'])\ninput_features.get_by_keys(['x5', 'x2']) Out[5]:
Inputs(type='Inputs', features=[ContinuousInput(type='ContinuousInput', key='x2', unit=None, bounds=(0.0, 1.0), local_relative_bounds=None, stepsize=None), CategoricalInput(type='CategoricalInput', key='x5', categories=['A', 'B', 'C'], allowed=[True, True, False])])
Features of a specific type can be returned by the get
method; by default, it returns all features that are an instance of the provided class.
input_features.get(CategoricalInput)\ninput_features.get(CategoricalInput) Out[6]:
Inputs(type='Inputs', features=[CategoricalDescriptorInput(type='CategoricalDescriptorInput', key='x6', categories=['c1', 'c2', 'c3'], allowed=[True, True, True], descriptors=['d1', 'd2'], values=[[1.0, 2.0], [2.0, 5.0], [1.0, 7.0]]), CategoricalInput(type='CategoricalInput', key='x5', categories=['A', 'B', 'C'], allowed=[True, True, False])])
By using the exact
argument, one can force it to only return features of the exact same class.
input_features.get(CategoricalInput, exact=True)\ninput_features.get(CategoricalInput, exact=True) Out[7]:
Inputs(type='Inputs', features=[CategoricalInput(type='CategoricalInput', key='x5', categories=['A', 'B', 'C'], allowed=[True, True, False])])
The get_keys
method follows the same logic as the get
method but returns just the keys of the features instead of the features themselves.
input_features.get_keys(CategoricalInput)\ninput_features.get_keys(CategoricalInput) Out[8]:
['x6', 'x5']
The input feature container further provides methods to return a feature container holding only the fixed or only the free features.
In\u00a0[9]: Copied!free_inputs = input_features.get_free()\nfixed_inputs = input_features.get_fixed()\nfree_inputs = input_features.get_free() fixed_inputs = input_features.get_fixed()
One can uniformly sample from individual input features.
In\u00a0[10]: Copied!x5.sample(2)\nx5.sample(2) Out[10]:
0 B\n1 A\nName: x5, dtype: object
Sampling directly from input feature containers is also possible; uniform, Sobol, and LHS sampling are supported. By default, uniform sampling is used.
In\u00a0[11]: Copied!from bofire.data_models.enum import SamplingMethodEnum\n\nX = input_features.sample(n=10, method=SamplingMethodEnum.LHS)\n\nX\nfrom bofire.data_models.enum import SamplingMethodEnum X = input_features.sample(n=10, method=SamplingMethodEnum.LHS) X Out[11]: x1 x2 x3 x4 x6 x5 0 0.423139 0.305001 0.881045 2.0 c3 A 1 0.873972 0.525925 0.674935 7.5 c3 A 2 0.782031 0.867259 0.442600 2.0 c1 B 3 0.691130 0.403864 0.348524 7.5 c3 B 4 0.051185 0.733657 0.144178 1.0 c2 A 5 0.939134 0.199665 0.226415 1.0 c1 A 6 0.323216 0.912386 0.066617 1.0 c1 B 7 0.280553 0.208415 0.544485 7.5 c3 A 8 0.163496 0.022924 0.707360 5.0 c2 B 9 0.554554 0.673069 0.938194 5.0 c1 B In\u00a0[12]: Copied!
from bofire.data_models.constraints.api import LinearEqualityConstraint, LinearInequalityConstraint\n\n# A mixture: x1 + x2 + x3 = 1\nconstr1 = LinearEqualityConstraint(features=[\"x1\", \"x2\", \"x3\"], coefficients=[1,1,1], rhs=1)\n\n# x1 + 2 * x3 < 0.8\nconstr2 = LinearInequalityConstraint(features=[\"x1\", \"x3\"], coefficients=[1, 2], rhs=0.8)\nfrom bofire.data_models.constraints.api import LinearEqualityConstraint, LinearInequalityConstraint # A mixture: x1 + x2 + x3 = 1 constr1 = LinearEqualityConstraint(features=[\"x1\", \"x2\", \"x3\"], coefficients=[1,1,1], rhs=1) # x1 + 2 * x3 < 0.8 constr2 = LinearInequalityConstraint(features=[\"x1\", \"x3\"], coefficients=[1, 2], rhs=0.8)
Linear constraints can only operate on ContinuousInput
features.
NonlinearEqualityConstraint
and NonlinearInequalityConstraint
take any expression that can be evaluated by pandas.eval, including mathematical functions such as sin
, exp
, log10
or exponentiation. So far, they cannot be used in any optimizations.
from bofire.data_models.constraints.api import NonlinearEqualityConstraint, NonlinearInequalityConstraint\n\n# The unit circle: x1**2 + x2**2 = 1\nconst3 = NonlinearEqualityConstraint(expression=\"x1**2 + x2**2 - 1\")\nconst3\nfrom bofire.data_models.constraints.api import NonlinearEqualityConstraint, NonlinearInequalityConstraint # The unit circle: x1**2 + x2**2 = 1 const3 = NonlinearEqualityConstraint(expression=\"x1**2 + x2**2 - 1\") const3 Out[13]:
NonlinearEqualityConstraint(type='NonlinearEqualityConstraint', expression='x1**2 + x2**2 - 1', features=None, jacobian_expression=None)In\u00a0[14]: Copied!
from bofire.data_models.constraints.api import NChooseKConstraint\n\n# Only 2 or 3 out of 3 parameters can be greater than zero\nconstr5 = NChooseKConstraint(features=[\"x1\", \"x2\", \"x3\"], min_count=2, max_count=3, none_also_valid=True)\nconstr5\nfrom bofire.data_models.constraints.api import NChooseKConstraint # Only 2 or 3 out of 3 parameters can be greater than zero constr5 = NChooseKConstraint(features=[\"x1\", \"x2\", \"x3\"], min_count=2, max_count=3, none_also_valid=True) constr5 Out[14]:
NChooseKConstraint(type='NChooseKConstraint', features=['x1', 'x2', 'x3'], min_count=2, max_count=3, none_also_valid=True)
Note that we have to set a boolean indicating whether none is also a valid selection, e.g. if we want to have 2, 3, or none of the ingredients in our recipe.
Similar to the features, constraints can be grouped in a container which acts as the union of the constraints.
In\u00a0[15]: Copied!from bofire.data_models.domain.api import Constraints\n\n\nconstraints = Constraints(constraints=[constr1, constr2])\nfrom bofire.data_models.domain.api import Constraints constraints = Constraints(constraints=[constr1, constr2])
A summary of the constraints can be obtained by the method get_reps_df
:
constraints.get_reps_df()\nconstraints.get_reps_df() Out[22]: Type Description 0 LinearEqualityConstraint type='LinearEqualityConstraint' features=['x1'... 1 LinearInequalityConstraint type='LinearInequalityConstraint' features=['x...
We can check whether a point satisfies individual constraints or the list of constraints.
In\u00a0[16]: Copied!constr2.is_fulfilled(X).values\nconstr2.is_fulfilled(X).values Out[16]:
array([False, False, False, False, True, False, True, False, False,\n False])
Output constraints can be set up via sigmoid-shaped objectives passed as arguments to the respective features, which can then also be plotted.
In\u00a0[17]: Copied!from bofire.data_models.objectives.api import MinimizeSigmoidObjective\nfrom bofire.plot.api import plot_objective_plotly\n\noutput_constraint = MinimizeSigmoidObjective(\n w=1.0, \n steepness=10,\n tp=0.5\n)\ny3= ContinuousOutput(key=\"y3\", objective=output_constraint)\n\noutput_features = Outputs(features=[y1, y2, y3])\n\nfig = plot_objective_plotly(feature=y3, lower=0, upper=1)\n\nfig.show()\nfrom bofire.data_models.objectives.api import MinimizeSigmoidObjective from bofire.plot.api import plot_objective_plotly output_constraint = MinimizeSigmoidObjective( w=1.0, steepness=10, tp=0.5 ) y3= ContinuousOutput(key=\"y3\", objective=output_constraint) output_features = Outputs(features=[y1, y2, y3]) fig = plot_objective_plotly(feature=y3, lower=0, upper=1) fig.show()
In\u00a0[18]: Copied!
from bofire.data_models.domain.api import Domain\n\ndomain = Domain(\n inputs=input_features, \n outputs=output_features, \n constraints=constraints\n )\nfrom bofire.data_models.domain.api import Domain domain = Domain( inputs=input_features, outputs=output_features, constraints=constraints )
In addition, the domain can also be instantiated directly from lists.
In\u00a0[19]: Copied!domain_single_objective = Domain.from_lists(\n inputs=[x1, x2, x3, x4, x5, x6], \n outputs=[y1], \n constraints=[]\n )\ndomain_single_objective = Domain.from_lists( inputs=[x1, x2, x3, x4, x5, x6], outputs=[y1], constraints=[] ) In\u00a0[22]: Copied!
from bofire.data_models.strategies.api import RandomStrategy\n\nimport bofire.strategies.api as strategies\n\nstrategy_data_model = RandomStrategy(domain=domain)\n\nrandom_strategy = strategies.map(strategy_data_model)\nrandom_candidates = random_strategy.ask(2)\n\nrandom_candidates\nfrom bofire.data_models.strategies.api import RandomStrategy import bofire.strategies.api as strategies strategy_data_model = RandomStrategy(domain=domain) random_strategy = strategies.map(strategy_data_model) random_candidates = random_strategy.ask(2) random_candidates Out[22]: x1 x2 x3 x4 x6 x5 0 0.516301 0.358447 0.125253 7.5 c3 A 1 0.246566 0.636906 0.116528 2.0 c1 B In\u00a0[2]: Copied!
from bofire.benchmarks.single import Himmelblau\n\nbenchmark = Himmelblau()\n\n(benchmark.domain.inputs + benchmark.domain.outputs).get_reps_df()\nfrom bofire.benchmarks.single import Himmelblau benchmark = Himmelblau() (benchmark.domain.inputs + benchmark.domain.outputs).get_reps_df() Out[2]: Type Description x_1 ContinuousInput [-6.0,6.0] x_2 ContinuousInput [-6.0,6.0] y ContinuousOutput ContinuousOutputFeature
Generating some initial data works as follows:
In\u00a0[24]: Copied!samples = benchmark.domain.inputs.sample(10)\n\nexperiments = benchmark.f(samples, return_complete=True)\n\nexperiments\nsamples = benchmark.domain.inputs.sample(10) experiments = benchmark.f(samples, return_complete=True) experiments Out[24]: x_1 x_2 y valid_y 0 -5.207328 3.267036 378.064959 1 1 -3.542455 5.285482 349.256442 1 2 -5.155535 5.077326 612.311571 1 3 -5.316850 3.642571 438.194554 1 4 -3.701859 -5.987050 642.945914 1 5 -1.165247 -0.212096 163.045785 1 6 3.267629 2.292458 6.199849 1 7 -0.915547 1.141966 125.068321 1 8 -2.672275 -1.027612 98.118896 1 9 5.363115 -4.279275 459.876833 1
Let's set up the SOBO strategy and ask for a candidate.
In\u00a0[25]: Copied!from bofire.data_models.strategies.api import SoboStrategy\nfrom bofire.data_models.acquisition_functions.api import qNEI\n\nsobo_strategy_data_model = SoboStrategy(domain=benchmark.domain, acquisition_function=qNEI())\n\nsobo_strategy = strategies.map(sobo_strategy_data_model)\n\nsobo_strategy.tell(experiments=experiments)\n\nsobo_strategy.ask(candidate_count=1)\nfrom bofire.data_models.strategies.api import SoboStrategy from bofire.data_models.acquisition_functions.api import qNEI sobo_strategy_data_model = SoboStrategy(domain=benchmark.domain, acquisition_function=qNEI()) sobo_strategy = strategies.map(sobo_strategy_data_model) sobo_strategy.tell(experiments=experiments) sobo_strategy.ask(candidate_count=1) Out[25]: x_1 x_2 y_pred y_sd y_des 0 2.185807 5.14596 48.612437 208.728779 -48.612437 In\u00a0[26]: Copied!
from bofire.strategies.doe.design import find_local_max_ipopt\nimport numpy as np\n\ndomain = Domain(\n inputs=[x1,x2,x3],\n outputs=[y1],\n constraints=[constr1]\n )\n\nres = find_local_max_ipopt(domain, \"fully-quadratic\")\nnp.round(res,3)\nfrom bofire.strategies.doe.design import find_local_max_ipopt import numpy as np domain = Domain( inputs=[x1,x2,x3], outputs=[y1], constraints=[constr1] ) res = find_local_max_ipopt(domain, \"fully-quadratic\") np.round(res,3)
\n******************************************************************************\nThis program contains Ipopt, a library for large-scale nonlinear optimization.\n Ipopt is released as open source code under the Eclipse Public License (EPL).\n For more information visit https://github.com/coin-or/Ipopt\n******************************************************************************\n\nOut[26]: x1 x2 x3 exp0 0.5 0.5 -0.0 exp1 -0.0 1.0 -0.0 exp2 -0.0 0.5 0.5 exp3 -0.0 0.5 0.5 exp4 0.5 -0.0 0.5 exp5 0.5 0.5 -0.0 exp6 -0.0 1.0 -0.0 exp7 1.0 -0.0 -0.0 exp8 -0.0 -0.0 1.0 exp9 -0.0 -0.0 1.0 exp10 0.5 -0.0 0.5 exp11 0.5 -0.0 0.5 exp12 0.5 0.5 -0.0
The resulting design looks like this:
In\u00a0[27]: Copied!import matplotlib.pyplot as plt\n\nfig = plt.figure(figsize=((10,10)))\nax = fig.add_subplot(111, projection='3d')\nax.view_init(45, 45)\nax.set_title(\"fully-quadratic model\")\nax.set_xlabel(\"$x_1$\")\nax.set_ylabel(\"$x_2$\")\nax.set_zlabel(\"$x_3$\")\nplt.rcParams[\"figure.figsize\"] = (10,8)\n\n#plot feasible polytope\nax.plot(\n xs=[1,0,0,1],\n ys=[0,1,0,0],\n zs=[0,0,1,0],\n linewidth=2\n)\n\n#plot D-optimal solutions\nax.scatter(xs=res[\"x1\"], ys=res[\"x2\"], zs=res[\"x3\"], marker=\"o\", s=40, color=\"orange\")\nimport matplotlib.pyplot as plt fig = plt.figure(figsize=((10,10))) ax = fig.add_subplot(111, projection='3d') ax.view_init(45, 45) ax.set_title(\"fully-quadratic model\") ax.set_xlabel(\"$x_1$\") ax.set_ylabel(\"$x_2$\") ax.set_zlabel(\"$x_3$\") plt.rcParams[\"figure.figsize\"] = (10,8) #plot feasible polytope ax.plot( xs=[1,0,0,1], ys=[0,1,0,0], zs=[0,0,1,0], linewidth=2 ) #plot D-optimal solutions ax.scatter(xs=res[\"x1\"], ys=res[\"x2\"], zs=res[\"x3\"], marker=\"o\", s=40, color=\"orange\") Out[27]:
<mpl_toolkits.mplot3d.art3d.Path3DCollection at 0x17d571330>In\u00a0[\u00a0]: Copied!
\n"},{"location":"getting_started/#getting-started","title":"Getting started\u00b6","text":"
In the following we show how to set up optimization problems in BoFire and how to use strategies to solve them.
"},{"location":"getting_started/#setting-up-the-optimization-problem","title":"Setting up the optimization problem\u00b6","text":"In BoFire, an optimization problem is defined by defining a domain containing input and output features as well as constraints (optional).
"},{"location":"getting_started/#features","title":"Features\u00b6","text":"Input features can be continuous, discrete, categorical, or categorical with descriptors:
"},{"location":"getting_started/#constraints","title":"Constraints\u00b6","text":"The search space can be further defined by constraints on the input features. BoFire supports linear equality and inequality constraints, as well as non-linear equality and inequality constraints.
"},{"location":"getting_started/#linear-constraints","title":"Linear constraints\u00b6","text":"LinearEqualityConstraint
and LinearInequalityConstraint
are expressions of the form $\\sum_i a_i x_i = b$ or $\\leq b$ for equality and inequality constraints respectively. They take a list of names of the input features they are operating on, a list of left-hand-side coefficients $a_i$ and a right-hand-side constant $b$.
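As a concrete sketch of that signature (hypothetical feature keys x1 and x2; BoFire encodes inequalities in the $\leq$ form, as in the examples above):

```python
from bofire.data_models.constraints.api import (
    LinearEqualityConstraint,
    LinearInequalityConstraint,
)

# 1*x1 + 1*x2 = 1, i.e. a_i = (1, 1) and b = 1
mixture = LinearEqualityConstraint(
    features=["x1", "x2"], coefficients=[1, 1], rhs=1
)

# 2*x1 + 4*x2 <= 0.8, i.e. a_i = (2, 4) and b = 0.8
upper_limit = LinearInequalityConstraint(
    features=["x1", "x2"], coefficients=[2, 4], rhs=0.8
)
```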
Use NChooseKConstraint
to express that only $k$ out of the $n$ parameters may take positive values. Think of a mixture, where we have a long list of possible ingredients but want to limit the number of ingredients in any given recipe.
The domain then holds all information about an optimization problem and can be understood as a search space definition.
"},{"location":"getting_started/#optimization","title":"Optimization\u00b6","text":"To solve the optimization problem, we further need a solving strategy. BoFire supports strategies without a prediction model such as a random strategy and predictive strategies which are based on a prediction model.
All strategies contain an ask
method returning a defined number of candidate experiments.
Since a predictive strategy includes a prediction model, we need to generate some historical data, which we can afterwards pass as training data to the strategy via the tell method.
For didactic purposes, we simply pick one of our benchmark problems here.
"},{"location":"getting_started/#design-of-experiments","title":"Design of Experiments\u00b6","text":"As a simple example for the DoE functionalities we consider the task of finding a D-optimal design for a fully-quadratic model with three design variables with bounds (0,1) and a mixture constraint.
We define the design space including the constraint as a domain. Then we pass it to the optimization routine and specify the model. If the user does not indicate a number of experiments, it is chosen automatically based on the number of model terms.
"},{"location":"install/","title":"Installation","text":"In BoFire we have several optional depencies.
"},{"location":"install/#domain-and-optimization-algorithms","title":"Domain and Optimization Algorithms","text":"To install BoFire with optimization tools you can use
pip install bofire[optimization]\n
This will also install BoTorch, which depends on PyTorch."},{"location":"install/#design-of-experiments","title":"Design of Experiments","text":"BoFire has functionality to create D-optimal experimental designs via the doe
module. This module depends on Cyipopt. A convenient way to install Cyipopt and its dependencies is via
conda install -c conda-forge cyipopt\n
Otherwise, you have to install Cyipopt manually."},{"location":"install/#just-domain","title":"Just Domain","text":"If you just want a data structure that represents the domain of an optimization problem, you can
pip install bofire\n
"},{"location":"install/#cheminformatics","title":"Cheminformatics","text":"Some features related to molecules and their representation depend on Rdkit.
pip install bofire[optimization,cheminfo]\n
"},{"location":"install/#development-installation","title":"Development Installation","text":"If you want to contribute to BoFire, you might want to install in editable mode including the test dependencies. After cloning the repository via
git clone https://github.com/experimental-design/bofire.git\n
and cd bofire
, you can proceed with pip install -e .[optimization,cheminfo,docs,tests]\n
"},{"location":"nchoosek_constraint/","title":"Nchoosek constraint","text":"In\u00a0[10]: Copied! from bofire.strategies.doe.design import find_local_max_ipopt\nfrom bofire.data_models.domain.api import Domain\nfrom bofire.data_models.constraints.api import NChooseKConstraint, LinearEqualityConstraint, LinearInequalityConstraint\nfrom bofire.data_models.features.api import ContinuousInput, ContinuousOutput\nimport numpy as np\n\ndomain = Domain(\n inputs = [ContinuousInput(key=f\"x{i+1}\", bounds=(0,1)) for i in range(8)],\n outputs = [ContinuousOutput(key=\"y\")],\n constraints = [\n LinearEqualityConstraint(features=[f\"x{i+1}\" for i in range(8)], coefficients=[1,1,1,1,1,1,1,1], rhs=1),\n NChooseKConstraint(features=[\"x1\",\"x2\",\"x3\"], min_count=0, max_count=1, none_also_valid=True),\n LinearInequalityConstraint(features=[\"x1\",\"x2\",\"x3\"], coefficients=[1,1,1], rhs=0.7),\n LinearInequalityConstraint(features=[\"x7\",\"x8\"], coefficients=[-1,-1], rhs=-0.1),\n LinearInequalityConstraint(features=[\"x7\",\"x8\"], coefficients=[1,1], rhs=0.9),\n ]\n)\n\nres = find_local_max_ipopt(\n domain=domain,\n model_type=\"fully-quadratic\",\n ipopt_options={\"maxiter\":500},\n)\nnp.round(res,3)\nfrom bofire.strategies.doe.design import find_local_max_ipopt from bofire.data_models.domain.api import Domain from bofire.data_models.constraints.api import NChooseKConstraint, LinearEqualityConstraint, LinearInequalityConstraint from bofire.data_models.features.api import ContinuousInput, ContinuousOutput import numpy as np domain = Domain( inputs = [ContinuousInput(key=f\"x{i+1}\", bounds=(0,1)) for i in range(8)], outputs = [ContinuousOutput(key=\"y\")], constraints = [ LinearEqualityConstraint(features=[f\"x{i+1}\" for i in range(8)], coefficients=[1,1,1,1,1,1,1,1], rhs=1), NChooseKConstraint(features=[\"x1\",\"x2\",\"x3\"], min_count=0, max_count=1, none_also_valid=True), LinearInequalityConstraint(features=[\"x1\",\"x2\",\"x3\"], coefficients=[1,1,1], rhs=0.7), LinearInequalityConstraint(features=[\"x7\",\"x8\"], coefficients=[-1,-1], rhs=-0.1), LinearInequalityConstraint(features=[\"x7\",\"x8\"], coefficients=[1,1], rhs=0.9), ] ) res = find_local_max_ipopt( domain=domain, model_type=\"fully-quadratic\", ipopt_options={\"maxiter\":500}, ) np.round(res,3) Out[10]: x1 x2 x3 x4 x5 x6 x7 x8 exp0 0.000 0.00 0.463 -0.000 0.437 -0.000 0.100 -0.000 exp1 0.449 0.00 0.000 -0.000 -0.000 -0.000 -0.000 0.551 exp2 0.000 0.50 0.000 -0.000 -0.000 -0.000 -0.000 0.500 exp3 0.000 0.00 0.700 0.200 -0.000 -0.000 -0.000 0.100 exp4 0.394 0.00 0.000 -0.000 0.506 -0.000 0.100 -0.000 exp5 0.000 0.45 0.000 -0.000 -0.000 0.450 0.029 0.071 exp6 0.000 0.00 0.700 -0.000 -0.000 -0.000 0.300 -0.000 exp7 0.700 0.00 0.000 -0.000 0.200 -0.000 -0.000 0.100 exp8 0.000 -0.00 0.000 -0.000 0.448 -0.000 0.552 -0.000 exp9 0.000 0.00 -0.000 -0.000 0.498 -0.000 -0.000 0.502 exp10 -0.000 0.00 0.000 -0.000 -0.000 0.900 0.100 -0.000 exp11 0.000 -0.00 0.000 -0.000 0.900 -0.000 0.100 -0.000 exp12 0.000 0.00 0.371 -0.000 -0.000 0.529 -0.000 0.100 exp13 0.700 0.00 0.000 -0.000 -0.000 0.200 0.100 -0.000 exp14 0.000 -0.00 0.000 0.100 -0.000 -0.000 0.900 -0.000 exp15 0.000 0.00 0.100 -0.000 -0.000 -0.000 0.443 0.457 exp16 -0.000 0.00 0.000 -0.000 0.450 0.450 0.043 0.057 exp17 0.000 0.70 0.000 -0.000 -0.000 -0.000 0.300 -0.000 exp18 0.000 0.00 -0.000 -0.000 -0.000 0.445 0.555 -0.000 exp19 -0.000 0.00 0.000 0.539 -0.000 -0.000 0.461 -0.000 exp20 0.000 0.35 0.000 -0.000 -0.000 -0.000 0.650 -0.000 exp21 0.000 0.00 0.404 -0.000 -0.000 0.496 0.100 
-0.000 exp22 0.491 0.00 0.000 -0.000 -0.000 -0.000 0.509 -0.000 exp23 0.000 0.35 0.000 -0.000 -0.000 -0.000 0.650 -0.000 exp24 0.000 0.00 0.446 -0.000 -0.000 -0.000 -0.000 0.554 exp25 0.384 0.00 0.000 -0.000 -0.000 0.516 -0.000 0.100 exp26 0.000 0.45 0.000 0.450 -0.000 -0.000 0.028 0.072 exp27 0.000 0.00 0.440 -0.000 0.460 -0.000 -0.000 0.100 exp28 0.393 0.00 0.000 0.507 -0.000 -0.000 0.100 -0.000 exp29 0.000 -0.00 0.000 0.450 0.450 -0.000 0.049 0.051 exp30 0.000 0.00 0.700 -0.000 -0.000 0.200 -0.000 0.100 exp31 0.100 0.00 0.000 -0.000 -0.000 -0.000 0.454 0.446 exp32 0.000 -0.00 0.000 -0.000 0.448 -0.000 0.552 -0.000 exp33 0.000 0.00 0.374 -0.000 -0.000 -0.000 0.626 -0.000 exp34 0.388 0.00 0.000 -0.000 -0.000 0.512 0.100 -0.000 exp35 0.000 -0.00 0.000 0.455 -0.000 0.445 0.100 -0.000 exp36 0.000 0.00 0.394 0.506 -0.000 -0.000 0.100 -0.000 exp37 -0.000 0.00 0.000 0.448 -0.000 -0.000 -0.000 0.552 exp38 0.000 0.45 0.000 -0.000 0.450 -0.000 0.023 0.077 exp39 0.000 0.00 -0.000 0.539 -0.000 -0.000 0.461 -0.000 exp40 -0.000 0.00 0.000 -0.000 -0.000 0.445 0.555 -0.000 exp41 0.000 -0.00 0.000 -0.000 -0.000 0.541 -0.000 0.459 exp42 0.000 0.00 -0.000 0.442 -0.000 0.458 -0.000 0.100 exp43 0.700 0.00 0.000 0.200 -0.000 -0.000 -0.000 0.100 exp44 0.000 -0.00 0.000 -0.000 -0.000 0.100 -0.000 0.900 exp45 0.000 0.00 -0.000 0.448 -0.000 -0.000 -0.000 0.552 exp46 -0.000 0.00 0.000 0.900 -0.000 -0.000 -0.000 0.100 exp47 0.000 -0.00 0.000 -0.000 0.498 -0.000 -0.000 0.502 In\u00a0[\u00a0]: Copied!
\n"},{"location":"nchoosek_constraint/#design-with-nchoosek-constraint","title":"Design with NChooseK constraint\u00b6","text":"
The doe subpackage also supports problems with NChooseK constraints. Since IPOPT has problems finding feasible solutions using the gradient of the NChooseK constraint violation, a closely related (but stricter) constraint that suffices to fulfill the NChooseK constraint is imposed onto the problem: for each experiment $j$, $N-K$ decision variables $x_{i_1,j},...,x_{i_{N-K},j}$ from the NChooseK constraint's features attribute are picked and forced to be zero. This is done by setting the upper and lower bounds of the picked variables to 0 in the corresponding experiments. This causes IPOPT to treat them as \"fixed variables\" (i.e. it will not optimize for them) that always stick to the only feasible value (which is 0 here). However, this constraint is stricter than the original NChooseK constraint. In combination with other constraints on the same decision variables, this can result in a situation where the constraints cannot be fulfilled even though the original constraints would allow for a solution. For example, consider a problem with four decision variables $x_1, x_2, x_3, x_4$ and an NChooseK constraint on these four variables that restricts the number of nonzero variables to two. Additionally, we have a linear constraint $$ x_3 + x_4 \geq 0.1. $$ We can easily find points that fulfill both constraints (e.g. $(0,0,0,0.1)$). Now consider the stricter constraint from above: eventually it will happen that $x_3$ and $x_4$ are chosen to be zero for one experiment. For this experiment it is impossible to fulfill the linear constraint $x_3 + x_4 \geq 0.1$, since $x_3 = x_4 = 0$.
Therefore one has to be very careful when imposing linear constraints upon decision variables that already show up in an NChooseK constraint.
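A minimal sketch of the combination just described (hypothetical keys x1 to x4; it only illustrates the potential conflict, not a recommended setup):

```python
from bofire.data_models.constraints.api import (
    LinearInequalityConstraint,
    NChooseKConstraint,
)

# at most two of x1..x4 may be nonzero
nchoosek = NChooseKConstraint(
    features=["x1", "x2", "x3", "x4"],
    min_count=0,
    max_count=2,
    none_also_valid=True,
)

# x3 + x4 >= 0.1, written as -x3 - x4 <= -0.1 in BoFire's convention
lower_limit = LinearInequalityConstraint(
    features=["x3", "x4"], coefficients=[-1, -1], rhs=-0.1
)

# (0, 0, 0, 0.1) fulfills both constraints, but once the stricter relaxation
# fixes x3 = x4 = 0 for an experiment, the linear constraint becomes infeasible there.
```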
For practical reasons, it is necessary that two NChooseK constraints of the same problem do not share any variables.
Above you can find an example of a problem with NChooseK constraints and additional linear constraints imposed on the same variables.
"},{"location":"optimality_criteria/","title":"Optimality criteria","text":"In\u00a0[1]: Copied!import numpy as np\nimport matplotlib.pyplot as plt\n\nfrom bofire.data_models.constraints.api import (\n NonlinearEqualityConstraint,\n NonlinearInequalityConstraint,\n LinearEqualityConstraint,\n LinearInequalityConstraint,\n)\nfrom bofire.data_models.domain.api import Domain\nfrom bofire.data_models.features.api import ContinuousInput, ContinuousOutput\nfrom bofire.strategies.doe.design import find_local_max_ipopt\nfrom bofire.strategies.enum import OptimalityCriterionEnum\nimport numpy as np import matplotlib.pyplot as plt from bofire.data_models.constraints.api import ( NonlinearEqualityConstraint, NonlinearInequalityConstraint, LinearEqualityConstraint, LinearInequalityConstraint, ) from bofire.data_models.domain.api import Domain from bofire.data_models.features.api import ContinuousInput, ContinuousOutput from bofire.strategies.doe.design import find_local_max_ipopt from bofire.strategies.enum import OptimalityCriterionEnum
In\u00a0[2]: Copied!
# Optimal designs for a quadratic model on the unit square\ndomain = Domain(\n inputs=[ContinuousInput(key=f\"x{i+1}\", bounds=(0, 1)) for i in range(2)],\n outputs=[ContinuousOutput(key=\"y\")],\n)\nmodel_type = \"fully-quadratic\"\nn_experiments = 13\n\ndesigns = {}\nfor obj in OptimalityCriterionEnum:\n designs[obj.value] = find_local_max_ipopt(\n domain,\n model_type=model_type,\n n_experiments=n_experiments,\n objective=obj,\n ipopt_options={\"maxiter\": 300},\n ).to_numpy()\n\nfig = plt.figure(figsize=((8, 8)))\nax = fig.add_subplot(111)\nax.set_title(\"Designs with different optimality criteria\")\nax.set_xlabel(\"$x_1$\")\nax.set_ylabel(\"$x_2$\")\nfor obj, X in designs.items():\n ax.scatter(X[:, 0], X[:, 1], s=40, label=obj)\nax.grid(alpha=0.3)\nax.legend();\n# Optimal designs for a quadratic model on the unit square domain = Domain( inputs=[ContinuousInput(key=f\"x{i+1}\", bounds=(0, 1)) for i in range(2)], outputs=[ContinuousOutput(key=\"y\")], ) model_type = \"fully-quadratic\" n_experiments = 13 designs = {} for obj in OptimalityCriterionEnum: designs[obj.value] = find_local_max_ipopt( domain, model_type=model_type, n_experiments=n_experiments, objective=obj, ipopt_options={\"maxiter\": 300}, ).to_numpy() fig = plt.figure(figsize=((8, 8))) ax = fig.add_subplot(111) ax.set_title(\"Designs with different optimality criteria\") ax.set_xlabel(\"$x_1$\") ax.set_ylabel(\"$x_2$\") for obj, X in designs.items(): ax.scatter(X[:, 0], X[:, 1], s=40, label=obj) ax.grid(alpha=0.3) ax.legend();
\n******************************************************************************\nThis program contains Ipopt, a library for large-scale nonlinear optimization.\n Ipopt is released as open source code under the Eclipse Public License (EPL).\n For more information visit https://github.com/coin-or/Ipopt\n******************************************************************************\n\nIn\u00a0[3]: Copied!
# Space filling design on the unit 2-simplex\ndomain = Domain(\n inputs=[ContinuousInput(key=f\"x{i+1}\", bounds=(0, 1)) for i in range(3)],\n outputs=[ContinuousOutput(key=\"y\")],\n constraints=[\n LinearEqualityConstraint(\n features=[\"x1\", \"x2\", \"x3\"], coefficients=[1, 1, 1], rhs=1\n )\n ],\n)\n\nX = find_local_max_ipopt(\n domain,\n n_experiments=40,\n model_type=\"linear\", # the model type does not matter for space filling designs\n objective=OptimalityCriterionEnum.SPACE_FILLING,\n ipopt_options={\"maxiter\": 500},\n).to_numpy()\n\n\nfig = plt.figure(figsize=((10, 8)))\nax = fig.add_subplot(111, projection=\"3d\")\nax.view_init(45, 20)\nax.set_title(\"Space filling design\")\nax.set_xlabel(\"$x_1$\")\nax.set_ylabel(\"$x_2$\")\nax.set_zlabel(\"$x_3$\")\n\n# plot feasible polytope\nax.plot(xs=[0, 0, 1, 0], ys=[0, 1, 0, 0], zs=[1, 0, 0, 1], linewidth=2)\n\n# plot design points\nax.scatter(xs=X[:, 0], ys=X[:, 1], zs=X[:, 2], s=40)\n# Space filling design on the unit 2-simplex domain = Domain( inputs=[ContinuousInput(key=f\"x{i+1}\", bounds=(0, 1)) for i in range(3)], outputs=[ContinuousOutput(key=\"y\")], constraints=[ LinearEqualityConstraint( features=[\"x1\", \"x2\", \"x3\"], coefficients=[1, 1, 1], rhs=1 ) ], ) X = find_local_max_ipopt( domain, n_experiments=40, model_type=\"linear\", # the model type does not matter for space filling designs objective=OptimalityCriterionEnum.SPACE_FILLING, ipopt_options={\"maxiter\": 500}, ).to_numpy() fig = plt.figure(figsize=((10, 8))) ax = fig.add_subplot(111, projection=\"3d\") ax.view_init(45, 20) ax.set_title(\"Space filling design\") ax.set_xlabel(\"$x_1$\") ax.set_ylabel(\"$x_2$\") ax.set_zlabel(\"$x_3$\") # plot feasible polytope ax.plot(xs=[0, 0, 1, 0], ys=[0, 1, 0, 0], zs=[1, 0, 0, 1], linewidth=2) # plot design points ax.scatter(xs=X[:, 0], ys=X[:, 1], zs=X[:, 2], s=40) Out[3]:
<mpl_toolkits.mplot3d.art3d.Path3DCollection at 0x2ac85e170>In\u00a0[\u00a0]: Copied!
\n"},{"location":"optimality_criteria/#designs-for-different-optimality-criteria","title":"Designs for different optimality criteria\u00b6","text":""},{"location":"optimality_criteria/#space-filling-design","title":"Space filling design\u00b6","text":""},{"location":"ref-constraints/","title":"Domain","text":""},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints","title":"
Constraints (BaseModel, Generic)
","text":"Source code in bofire/data_models/domain/constraints.py
class Constraints(BaseModel, Generic[C]):\n type: Literal[\"Constraints\"] = \"Constraints\"\n constraints: Sequence[C] = Field(default_factory=lambda: [])\n\n def __iter__(self) -> Iterator[C]:\n return iter(self.constraints)\n\n def __len__(self):\n return len(self.constraints)\n\n def __getitem__(self, i) -> C:\n return self.constraints[i]\n\n def __add__(\n self, other: Union[Sequence[CIncludes], \"Constraints[CIncludes]\"]\n ) -> \"Constraints[Union[C, CIncludes]]\":\n if isinstance(other, collections.abc.Sequence):\n other_constraints = other\n else:\n other_constraints = other.constraints\n constraints = list(chain(self.constraints, other_constraints))\n return Constraints(constraints=constraints)\n\n def __call__(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Numerically evaluate all constraints\n\n Args:\n experiments (pd.DataFrame): data to evaluate the constraint on\n\n Returns:\n pd.DataFrame: Constraint evaluation for each of the constraints\n \"\"\"\n return pd.concat([c(experiments) for c in self.constraints], axis=1)\n\n def jacobian(self, experiments: pd.DataFrame) -> list:\n \"\"\"Numerically evaluate the jacobians of all constraints\n\n Args:\n experiments (pd.DataFrame): data to evaluate the constraint jacobians on\n\n Returns:\n list: A list containing the jacobians as pd.DataFrames\n \"\"\"\n return [c.jacobian(experiments) for c in self.constraints]\n\n def is_fulfilled(self, experiments: pd.DataFrame, tol: float = 1e-6) -> pd.Series:\n \"\"\"Check if all constraints are fulfilled on all rows of the provided dataframe\n\n Args:\n experiments (pd.DataFrame): Dataframe with data, the constraint validity should be tested on\n tol (float, optional): tolerance parameter. A constraint is considered as not fulfilled if\n the violation is larger than tol. Defaults to 0.\n\n Returns:\n Boolean: True if all constraints are fulfilled for all rows, false if not\n \"\"\"\n if len(self.constraints) == 0:\n return pd.Series([True] * len(experiments), index=experiments.index)\n return (\n pd.concat(\n [c.is_fulfilled(experiments, tol) for c in self.constraints], axis=1\n )\n .fillna(True)\n .all(axis=1)\n )\n\n def get(\n self,\n includes: Union[Type[CIncludes], Sequence[Type[CIncludes]]] = Constraint,\n excludes: Optional[Union[Type[CExcludes], List[Type[CExcludes]]]] = None,\n exact: bool = False,\n ) -> \"Constraints[CIncludes]\":\n \"\"\"Get constraints of the domain\n\n Args:\n includes: Constraint class or list of specific constraint classes to be returned. Defaults to Constraint.\n excludes: Constraint class or list of specific constraint classes to be excluded from the return. Defaults to None.\n exact: Boolean to distinguish if only the exact class listed in includes and no subclasses inherenting from this class shall be returned. 
Defaults to False.\n\n Returns:\n Constraints: constraints in the domain fitting to the passed requirements.\n \"\"\"\n return Constraints(\n constraints=filter_by_class(\n self.constraints,\n includes=includes,\n excludes=excludes,\n exact=exact,\n )\n )\n\n def get_reps_df(self):\n \"\"\"Provides a tabular overwiev of all constraints within the domain\n\n Returns:\n pd.DataFrame: DataFrame listing all constraints of the domain with a description\n \"\"\"\n df = pd.DataFrame(\n index=range(len(self.constraints)),\n columns=[\"Type\", \"Description\"],\n data={\n \"Type\": [feat.__class__.__name__ for feat in self.get(Constraint)],\n \"Description\": [\n constraint.__str__() for constraint in self.get(Constraint)\n ],\n },\n )\n return df\n
"},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints.__call__","title":"__call__(self, experiments)
special
","text":"Numerically evaluate all constraints
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
data to evaluate the constraint on
requiredReturns:
Type Descriptionpd.DataFrame
Constraint evaluation for each of the constraints
Source code inbofire/data_models/domain/constraints.py
def __call__(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Numerically evaluate all constraints\n\n Args:\n experiments (pd.DataFrame): data to evaluate the constraint on\n\n Returns:\n pd.DataFrame: Constraint evaluation for each of the constraints\n \"\"\"\n return pd.concat([c(experiments) for c in self.constraints], axis=1)\n
"},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints.get","title":"get(self, includes=<class 'bofire.data_models.constraints.constraint.Constraint'>, excludes=None, exact=False)
","text":"Get constraints of the domain
Parameters:
Name Type Description Defaultincludes
Union[Type[~CIncludes], Sequence[Type[~CIncludes]]]
Constraint class or list of specific constraint classes to be returned. Defaults to Constraint.
<class 'bofire.data_models.constraints.constraint.Constraint'>
excludes
Union[Type[~CExcludes], List[Type[~CExcludes]]]
Constraint class or list of specific constraint classes to be excluded from the return. Defaults to None.
None
exact
bool
Boolean to distinguish if only the exact class listed in includes and no subclasses inheriting from this class shall be returned. Defaults to False.
False
Returns:
Type DescriptionConstraints
constraints in the domain fitting to the passed requirements.
Source code inbofire/data_models/domain/constraints.py
def get(\n self,\n includes: Union[Type[CIncludes], Sequence[Type[CIncludes]]] = Constraint,\n excludes: Optional[Union[Type[CExcludes], List[Type[CExcludes]]]] = None,\n exact: bool = False,\n) -> \"Constraints[CIncludes]\":\n \"\"\"Get constraints of the domain\n\n Args:\n includes: Constraint class or list of specific constraint classes to be returned. Defaults to Constraint.\n excludes: Constraint class or list of specific constraint classes to be excluded from the return. Defaults to None.\n exact: Boolean to distinguish if only the exact class listed in includes and no subclasses inherenting from this class shall be returned. Defaults to False.\n\n Returns:\n Constraints: constraints in the domain fitting to the passed requirements.\n \"\"\"\n return Constraints(\n constraints=filter_by_class(\n self.constraints,\n includes=includes,\n excludes=excludes,\n exact=exact,\n )\n )\n
"},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints.get_reps_df","title":"get_reps_df(self)
","text":"Provides a tabular overwiev of all constraints within the domain
Returns:
Type Descriptionpd.DataFrame
DataFrame listing all constraints of the domain with a description
Source code inbofire/data_models/domain/constraints.py
def get_reps_df(self):\n \"\"\"Provides a tabular overwiev of all constraints within the domain\n\n Returns:\n pd.DataFrame: DataFrame listing all constraints of the domain with a description\n \"\"\"\n df = pd.DataFrame(\n index=range(len(self.constraints)),\n columns=[\"Type\", \"Description\"],\n data={\n \"Type\": [feat.__class__.__name__ for feat in self.get(Constraint)],\n \"Description\": [\n constraint.__str__() for constraint in self.get(Constraint)\n ],\n },\n )\n return df\n
"},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints.is_fulfilled","title":"is_fulfilled(self, experiments, tol=1e-06)
","text":"Check if all constraints are fulfilled on all rows of the provided dataframe
Parameters:
experiments (pd.DataFrame): Dataframe with the data on which the constraint validity should be tested. Required.
tol (float): Tolerance parameter. A constraint is considered not fulfilled if the violation is larger than tol. Defaults to 1e-06.
Returns:
pd.Series: Boolean series that is True for every row in which all constraints are fulfilled, and False otherwise.
Source code in bofire/data_models/domain/constraints.py
def is_fulfilled(self, experiments: pd.DataFrame, tol: float = 1e-6) -> pd.Series:\n \"\"\"Check if all constraints are fulfilled on all rows of the provided dataframe\n\n Args:\n experiments (pd.DataFrame): Dataframe with data, the constraint validity should be tested on\n tol (float, optional): tolerance parameter. A constraint is considered as not fulfilled if\n the violation is larger than tol. Defaults to 0.\n\n Returns:\n Boolean: True if all constraints are fulfilled for all rows, false if not\n \"\"\"\n if len(self.constraints) == 0:\n return pd.Series([True] * len(experiments), index=experiments.index)\n return (\n pd.concat(\n [c.is_fulfilled(experiments, tol) for c in self.constraints], axis=1\n )\n .fillna(True)\n .all(axis=1)\n )\n
"},{"location":"ref-constraints/#bofire.data_models.domain.constraints.Constraints.jacobian","title":"jacobian(self, experiments)
","text":"Numerically evaluate the jacobians of all constraints
Parameters:
experiments (pd.DataFrame): data to evaluate the constraint jacobians on. Required.
Returns:
list: A list containing the jacobians as pd.DataFrames.
Source code in bofire/data_models/domain/constraints.py
def jacobian(self, experiments: pd.DataFrame) -> list:\n \"\"\"Numerically evaluate the jacobians of all constraints\n\n Args:\n experiments (pd.DataFrame): data to evaluate the constraint jacobians on\n\n Returns:\n list: A list containing the jacobians as pd.DataFrames\n \"\"\"\n return [c.jacobian(experiments) for c in self.constraints]\n
"},{"location":"ref-domain-util/","title":"Domain","text":""},{"location":"ref-domain-util/#bofire.utils.cheminformatics","title":"cheminformatics
","text":""},{"location":"ref-domain-util/#bofire.utils.cheminformatics.smiles2fingerprints","title":"smiles2fingerprints(smiles, bond_radius=5, n_bits=2048)
","text":"Transforms a list of smiles to an array of morgan fingerprints.
Parameters:
smiles (List[str]): List of smiles. Required.
bond_radius (int): Bond radius to use. Defaults to 5.
n_bits (int): Number of bits. Defaults to 2048.
Returns:
np.ndarray: Numpy array holding the fingerprints.
Source code in bofire/utils/cheminformatics.py
def smiles2fingerprints(\n smiles: List[str], bond_radius: int = 5, n_bits: int = 2048\n) -> np.ndarray:\n \"\"\"Transforms a list of smiles to an array of morgan fingerprints.\n\n Args:\n smiles (List[str]): List of smiles\n bond_radius (int, optional): Bond radius to use. Defaults to 5.\n n_bits (int, optional): Number of bits. Defaults to 2048.\n\n Returns:\n np.ndarray: Numpy array holding the fingerprints\n \"\"\"\n rdkit_mols = [smiles2mol(m) for m in smiles]\n fps = [\n AllChem.GetMorganFingerprintAsBitVect( # type: ignore\n mol, radius=bond_radius, nBits=n_bits\n )\n for mol in rdkit_mols\n ]\n\n return np.asarray(fps)\n
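A quick sketch of the call (requires rdkit, which this module imports):
from bofire.utils.cheminformatics import smiles2fingerprints

fps = smiles2fingerprints(["CO", "CCO"], bond_radius=3, n_bits=512)
print(fps.shape)  # (2, 512): one binary Morgan fingerprint per molecule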
"},{"location":"ref-domain-util/#bofire.utils.cheminformatics.smiles2fragments","title":"smiles2fragments(smiles, fragments_list=None)
","text":"Transforms smiles to an array of fragments.
Parameters:
smiles (List[str]): List of smiles. Required.
fragments_list (Optional[List[str]]): List of fragment descriptors to consider; if None, all RDKit fr_ fragment descriptors are used. Defaults to None.
Returns:
np.ndarray: Array holding the fragment information.
Source code in bofire/utils/cheminformatics.py
def smiles2fragments(\n smiles: List[str], fragments_list: Optional[List[str]] = None\n) -> np.ndarray:\n \"\"\"Transforms smiles to an array of fragments.\n\n Args:\n smiles (List[str]): List of smiles\n\n Returns:\n np.ndarray: Array holding the fragment information.\n \"\"\"\n rdkit_fragment_list = [\n item for item in Descriptors.descList if item[0].startswith(\"fr_\")\n ]\n if fragments_list is None:\n fragments = {d[0]: d[1] for d in rdkit_fragment_list}\n else:\n fragments = {d[0]: d[1] for d in rdkit_fragment_list if d[0] in fragments_list}\n\n frags = np.zeros((len(smiles), len(fragments)))\n for i, smi in enumerate(smiles):\n mol = smiles2mol(smi)\n features = [fragments[d](mol) for d in fragments]\n frags[i, :] = features\n\n return frags\n
"},{"location":"ref-domain-util/#bofire.utils.cheminformatics.smiles2mol","title":"smiles2mol(smiles)
","text":"Transforms a smiles string to an rdkit mol object.
Parameters:
smiles (str): Smiles string. Required.
Exceptions:
ValueError: If the string is not a valid smiles.
Returns:
rdkit.Mol: rdkit mol object.
Source code in bofire/utils/cheminformatics.py
def smiles2mol(smiles: str):\n \"\"\"Transforms a smiles string to an rdkit mol object.\n\n Args:\n smiles (str): Smiles string.\n\n Raises:\n ValueError: If string is not a valid smiles.\n\n Returns:\n rdkit.Mol: rdkit.mol object\n \"\"\"\n mol = MolFromSmiles(smiles)\n if mol is None:\n raise ValueError(f\"{smiles} is not a valid smiles string.\")\n return mol\n
"},{"location":"ref-domain-util/#bofire.utils.cheminformatics.smiles2mordred","title":"smiles2mordred(smiles, descriptors_list)
","text":"Transforms list of smiles to mordred moelcular descriptors.
Parameters:
smiles (List[str]): List of smiles. Required.
descriptors_list (List[str]): List of desired Mordred descriptors. Required.
Returns:
np.ndarray: Array holding the Mordred molecular descriptors.
Source code in bofire/utils/cheminformatics.py
def smiles2mordred(smiles: List[str], descriptors_list: List[str]) -> np.ndarray:\n \"\"\"Transforms list of smiles to mordred moelcular descriptors.\n\n Args:\n smiles (List[str]): List of smiles\n descriptors_list (List[str]): List of desired mordred descriptors\n\n Returns:\n np.ndarray: Array holding the mordred moelcular descriptors.\n \"\"\"\n mols = [smiles2mol(smi) for smi in smiles]\n\n calc = Calculator(descriptors, ignore_3D=True)\n calc.descriptors = [d for d in calc.descriptors if str(d) in descriptors_list]\n\n descriptors_df = calc.pandas(mols)\n nan_list = [\n pd.to_numeric(descriptors_df[col], errors=\"coerce\").isnull().values.any()\n for col in descriptors_df.columns\n ]\n if any(nan_list):\n raise ValueError(\n f\"Found NaN values in descriptors {list(descriptors_df.columns[nan_list])}\"\n )\n\n return descriptors_df.astype(float).values\n
"},{"location":"ref-domain-util/#bofire.utils.doe","title":"doe
","text":""},{"location":"ref-domain-util/#bofire.utils.doe.ff2n","title":"ff2n(n_factors)
","text":"Computes the full factorial design for a given number of factors.
Parameters:
n_factors (int): The number of factors. Required.
Returns:
ndarray: The full factorial design.
Source code in bofire/utils/doe.py
def ff2n(n_factors: int) -> np.ndarray:\n \"\"\"Computes the full factorial design for a given number of factors.\n\n Args:\n n_factors: The number of factors.\n\n Returns:\n The full factorial design.\n \"\"\"\n return np.array(list(itertools.product([-1, 1], repeat=n_factors)))\n
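For example, two factors yield the four corner points in standard order:
from bofire.utils.doe import ff2n

print(ff2n(2))
# [[-1 -1]
#  [-1  1]
#  [ 1 -1]
#  [ 1  1]]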
"},{"location":"ref-domain-util/#bofire.utils.doe.fracfact","title":"fracfact(gen)
","text":"Computes the fractional factorial design for a given generator.
Parameters:
gen: The generator. Required.
Returns:
ndarray: The fractional factorial design.
Source code in bofire/utils/doe.py
def fracfact(gen) -> np.ndarray:\n \"\"\"Computes the fractional factorial design for a given generator.\n\n Args:\n gen: The generator.\n\n Returns:\n The fractional factorial design.\n \"\"\"\n gen = validate_generator(n_factors=gen.count(\" \") + 1, generator=gen)\n\n generators = [item for item in re.split(r\"\\-|\\s|\\+\", gen) if item]\n lengthes = [len(i) for i in generators]\n\n # Indices of single letters (main factors)\n idx_main = [i for i, item in enumerate(lengthes) if item == 1]\n\n # Indices of letter combinations.\n idx_combi = [i for i, item in enumerate(generators) if item != 1]\n\n # Check if there are \"-\" operators in gen\n idx_negative = [\n i for i, item in enumerate(gen.split(\" \")) if item[0] == \"-\"\n ] # remove empty strings\n\n # Fill in design with two level factorial design\n H1 = ff2n(len(idx_main))\n H = np.zeros((H1.shape[0], len(lengthes)))\n H[:, idx_main] = H1\n\n # Recognize combinations and fill in the rest of matrix H2 with the proper\n # products\n for k in idx_combi:\n # For lowercase letters\n xx = np.array([ord(c) for c in generators[k]]) - 97\n\n H[:, k] = np.prod(H1[:, xx], axis=1)\n\n # Update design if gen includes \"-\" operator\n if len(idx_negative) > 0:\n H[:, idx_negative] *= -1\n\n # Return the fractional factorial design\n return H\n
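For example, the generator "a b ab" builds a 2^(3-1) design whose third column is the product of the first two:
from bofire.utils.doe import fracfact

print(fracfact("a b ab"))
# [[-1. -1.  1.]
#  [-1.  1. -1.]
#  [ 1. -1. -1.]
#  [ 1.  1.  1.]]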
"},{"location":"ref-domain-util/#bofire.utils.doe.get_alias_structure","title":"get_alias_structure(gen, order=4)
","text":"Computes the alias structure of the design matrix. Works only for generators with positive signs.
Parameters:
gen (str): The generator. Required.
order (int): The order up to which the alias structure should be calculated. Defaults to 4.
Returns:
List[str]: The alias structure of the design matrix.
Source code in bofire/utils/doe.py
def get_alias_structure(gen: str, order: int = 4) -> List[str]:\n \"\"\"Computes the alias structure of the design matrix. Works only for generators\n with positive signs.\n\n Args:\n gen: The generator.\n order: The order up to wich the alias structure should be calculated. Defaults to 4.\n\n Returns:\n The alias structure of the design matrix.\n \"\"\"\n design = fracfact(gen)\n\n n_experiments, n_factors = design.shape\n\n all_names = string.ascii_lowercase + \"I\"\n factors = range(n_factors)\n all_combinations = itertools.chain.from_iterable(\n (\n itertools.combinations(factors, n)\n for n in range(1, min(n_factors, order) + 1)\n )\n )\n aliases = {n_experiments * \"+\": [(26,)]} # 26 is mapped to I\n\n for combination in all_combinations:\n # positive sign\n contrast = np.prod(\n design[:, combination], axis=1\n ) # this is the product of the combination\n scontrast = \"\".join(np.where(contrast == 1, \"+\", \"-\").tolist())\n aliases[scontrast] = aliases.get(scontrast, [])\n aliases[scontrast].append(combination) # type: ignore\n\n aliases_list = []\n for alias in aliases.values():\n aliases_list.append(\n sorted(alias, key=lambda a: (len(a), a))\n ) # sort by length and then by the combination\n aliases_list = sorted(\n aliases_list, key=lambda list: ([len(a) for a in list], list)\n ) # sort by the length of the alias\n\n aliases_readable = []\n\n for alias in aliases_list:\n aliases_readable.append(\n \" = \".join([\"\".join([all_names[f] for f in a]) for a in alias])\n )\n\n return aliases_readable\n
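For the 2^(3-1) design from the fracfact example, each main factor is aliased with a two-factor interaction (expected output shown as a comment):
from bofire.utils.doe import get_alias_structure

print(get_alias_structure("a b ab"))
# ['a = bc', 'b = ac', 'c = ab', 'I = abc']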
"},{"location":"ref-domain-util/#bofire.utils.doe.get_confounding_matrix","title":"get_confounding_matrix(inputs, design, powers=None, interactions=None)
","text":"Analyzes the confounding of a design and returns the confounding matrix.
Only takes continuous features into account.
Parameters:
inputs (Inputs): Input features. Required.
design (pd.DataFrame): Design matrix. Required.
powers (List[int]): List of powers of the individual factors/features that should be considered. Integers have to be larger than 1. Defaults to None.
interactions (List[int]): List of interaction levels to be considered. Integers have to be larger than 1. Defaults to [2].
Returns:
pd.DataFrame: Correlation matrix of the scaled design, including the requested powers and interactions.
Source code in bofire/utils/doe.py
def get_confounding_matrix(\n inputs: Inputs,\n design: pd.DataFrame,\n powers: Optional[List[int]] = None,\n interactions: Optional[List[int]] = None,\n):\n \"\"\"Analyzes the confounding of a design and returns the confounding matrix.\n\n Only takes continuous features into account.\n\n Args:\n inputs (Inputs): Input features.\n design (pd.DataFrame): Design matrix.\n powers (List[int], optional): List of powers of the individual factors/features that should be considered.\n Integers has to be larger than 1. Defaults to [].\n interactions (List[int], optional): List with interaction levels to be considered.\n Integers has to be larger than 1. Defaults to [2].\n\n Returns:\n _type_: _description_\n \"\"\"\n from sklearn.preprocessing import MinMaxScaler\n\n if len(inputs.get(CategoricalInput)) > 0:\n warnings.warn(\"Categorical input features will be ignored.\")\n\n keys = inputs.get_keys(ContinuousInput)\n scaler = MinMaxScaler(feature_range=(-1, 1))\n scaled_design = pd.DataFrame(\n data=scaler.fit_transform(design[keys]),\n columns=keys,\n )\n\n # add powers\n if powers is not None:\n for p in powers:\n assert p > 1, \"Power has to be at least of degree two.\"\n for key in keys:\n scaled_design[f\"{key}**{p}\"] = scaled_design[key] ** p\n\n # add interactions\n if interactions is None:\n interactions = [2]\n\n for i in interactions:\n assert i > 1, \"Interaction has to be at least of degree two.\"\n assert i < len(keys) + 1, f\"Interaction has to be smaller than {len(keys)+1}.\"\n for combi in itertools.combinations(keys, i):\n scaled_design[\":\".join(combi)] = scaled_design[list(combi)].prod(axis=1)\n\n return scaled_design.corr()\n
"},{"location":"ref-domain-util/#bofire.utils.doe.get_generator","title":"get_generator(n_factors, n_generators)
","text":"Computes a generator for a given number of factors and generators.
Parameters:
n_factors (int): The number of factors. Required.
n_generators (int): The number of generators. Required.
Returns:
str: The generator.
Source code in bofire/utils/doe.py
def get_generator(n_factors: int, n_generators: int) -> str:\n \"\"\"Computes a generator for a given number of factors and generators.\n\n Args:\n n_factors: The number of factors.\n n_generators: The number of generators.\n\n Returns:\n The generator.\n \"\"\"\n if n_generators == 0:\n return \" \".join(list(string.ascii_lowercase[:n_factors]))\n n_base_factors = n_factors - n_generators\n if n_generators == 1:\n if n_base_factors == 1:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n return \" \".join(\n list(string.ascii_lowercase[:n_base_factors])\n + [string.ascii_lowercase[:n_base_factors]]\n )\n n_base_factors = n_factors - n_generators\n if n_base_factors - 1 < 2:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n generators = [\n \"\".join(i)\n for i in (\n itertools.combinations(\n string.ascii_lowercase[:n_base_factors], n_base_factors - 1\n )\n )\n ]\n if len(generators) > n_generators:\n generators = generators[:n_generators]\n elif (n_generators - len(generators) == 1) and (n_base_factors > 1):\n generators += [string.ascii_lowercase[:n_base_factors]]\n elif n_generators - len(generators) >= 1:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n return \" \".join(list(string.ascii_lowercase[:n_base_factors]) + generators)\n
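For example, five factors with one generator give four base factors plus one generated column:
from bofire.utils.doe import get_generator, validate_generator

gen = get_generator(n_factors=5, n_generators=1)
print(gen)  # 'a b c d abcd'
validate_generator(5, gen)  # passes; raises ValueError for invalid generators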
"},{"location":"ref-domain-util/#bofire.utils.doe.validate_generator","title":"validate_generator(n_factors, generator)
","text":"Validates the generator and thows an error if it is not valid.
Source code inbofire/utils/doe.py
def validate_generator(n_factors: int, generator: str) -> str:\n \"\"\"Validates the generator and thows an error if it is not valid.\"\"\"\n\n if len(generator.split(\" \")) != n_factors:\n raise ValueError(\"Generator does not match the number of factors.\")\n # clean it and transform it into a list\n generators = [item for item in re.split(r\"\\-|\\s|\\+\", generator) if item]\n lengthes = [len(i) for i in generators]\n\n # Indices of single letters (main factors)\n idx_main = [i for i, item in enumerate(lengthes) if item == 1]\n\n if len(idx_main) == 0:\n raise ValueError(\"At least one unconfounded main factor is needed.\")\n\n # Check that single letters (main factors) are unique\n if len(idx_main) != len({generators[i] for i in idx_main}):\n raise ValueError(\"Main factors are confounded with each other.\")\n\n # Check that single letters (main factors) follow the alphabet\n if (\n \"\".join(sorted([generators[i] for i in idx_main]))\n != string.ascii_lowercase[: len(idx_main)]\n ):\n raise ValueError(\n f'Use the letters `{\" \".join(string.ascii_lowercase[: len(idx_main)])}` for the main factors.'\n )\n\n # Indices of letter combinations.\n idx_combi = [i for i, item in enumerate(generators) if item != 1]\n\n # check that main factors come before combinations\n if min(idx_combi) > max(idx_main):\n raise ValueError(\"Main factors have to come before combinations.\")\n\n # Check that letter combinations are unique\n if len(idx_combi) != len({generators[i] for i in idx_combi}):\n raise ValueError(\"Generators are not unique.\")\n\n # Check that only letters are used in the combinations that are also single letters (main factors)\n if not all(\n set(item).issubset({generators[i] for i in idx_main})\n for item in [generators[i] for i in idx_combi]\n ):\n raise ValueError(\"Generators are not valid.\")\n\n return generator\n
"},{"location":"ref-domain-util/#bofire.utils.multiobjective","title":"multiobjective
","text":""},{"location":"ref-domain-util/#bofire.utils.multiobjective.get_ref_point_mask","title":"get_ref_point_mask(domain, output_feature_keys=None)
","text":"Method to get a mask for the reference points taking into account if we want to maximize or minimize an objective. In case it is maximize the value in the mask is 1, in case we want to minimize it is -1.
Parameters:
domain (Domain): Domain for which the mask should be generated. Required.
output_feature_keys (Optional[list]): Names of the output feature keys that should be considered in the mask. Defaults to None.
Returns:
np.ndarray: Mask with entry 1.0 for maximization objectives and -1.0 for minimization and close-to-target objectives.
Source code in bofire/utils/multiobjective.py
def get_ref_point_mask(\n domain: Domain, output_feature_keys: Optional[list] = None\n) -> np.ndarray:\n \"\"\"Method to get a mask for the reference points taking into account if we\n want to maximize or minimize an objective. In case it is maximize the value\n in the mask is 1, in case we want to minimize it is -1.\n\n Args:\n domain (Domain): Domain for which the mask should be generated.\n output_feature_keys (Optional[list], optional): Name of output feature keys\n that should be considered in the mask. Defaults to None.\n\n Returns:\n np.ndarray: _description_\n \"\"\"\n if output_feature_keys is None:\n output_feature_keys = domain.outputs.get_keys_by_objective(\n includes=[MaximizeObjective, MinimizeObjective, CloseToTargetObjective]\n )\n if len(output_feature_keys) < 2:\n raise ValueError(\"At least two output features have to be provided.\")\n mask = []\n for key in output_feature_keys:\n feat = domain.outputs.get_by_key(key)\n if isinstance(feat.objective, MaximizeObjective): # type: ignore\n mask.append(1.0)\n elif isinstance(feat.objective, MinimizeObjective): # type: ignore\n mask.append(-1.0)\n elif isinstance(feat.objective, CloseToTargetObjective): # type: ignore\n mask.append(-1.0)\n else:\n raise ValueError(\n \"Only `MaximizeObjective` and `MinimizeObjective` supported\"\n )\n return np.array(mask)\n
"},{"location":"ref-domain-util/#bofire.utils.naming_conventions","title":"naming_conventions
","text":""},{"location":"ref-domain-util/#bofire.utils.naming_conventions.get_column_names","title":"get_column_names(outputs)
","text":"Specifies column names for given Outputs type.
Parameters:
outputs (Outputs): The Outputs object containing the individual outputs. Required.
Returns:
Tuple[List[str], List[str]]: A tuple containing the prediction column names and the standard deviation column names.
Source code in bofire/utils/naming_conventions.py
def get_column_names(outputs: Outputs) -> Tuple[List[str], List[str]]:\n \"\"\"\n Specifies column names for given Outputs type.\n\n Args:\n outputs (Outputs): The Outputs object containing the individual outputs.\n\n Returns:\n Tuple[List[str], List[str]]: A tuple containing the prediction column names and the standard deviation column names\n \"\"\"\n pred_cols, sd_cols = [], []\n for featkey in outputs.get_keys(CategoricalOutput): # type: ignore\n pred_cols = pred_cols + [\n f\"{featkey}_{cat}_prob\"\n for cat in outputs.get_by_key(featkey).categories # type: ignore\n ]\n sd_cols = sd_cols + [\n f\"{featkey}_{cat}_sd\"\n for cat in outputs.get_by_key(featkey).categories # type: ignore\n ]\n for featkey in outputs.get_keys(ContinuousOutput): # type: ignore\n pred_cols = pred_cols + [f\"{featkey}_pred\"]\n sd_cols = sd_cols + [f\"{featkey}_sd\"]\n\n return pred_cols, sd_cols\n
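A small sketch; it is assumed here that ContinuousOutput can be constructed from just a key, falling back to its default objective:
from bofire.data_models.domain.api import Outputs
from bofire.data_models.features.api import ContinuousOutput
from bofire.utils.naming_conventions import get_column_names

# assumed: the default objective of ContinuousOutput is acceptable here
outputs = Outputs(features=[ContinuousOutput(key="yield")])
pred_cols, sd_cols = get_column_names(outputs)
print(pred_cols, sd_cols)  # ['yield_pred'] ['yield_sd']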
"},{"location":"ref-domain-util/#bofire.utils.naming_conventions.postprocess_categorical_predictions","title":"postprocess_categorical_predictions(predictions, outputs)
","text":"Postprocess categorical predictions by finding the maximum probability location
Parameters:
predictions (pd.DataFrame): The dataframe containing the predictions. Required.
outputs (Outputs): The Outputs object containing the individual outputs. Required.
Returns:
predictions (pd.DataFrame): The (potentially modified) original dataframe with categorical predictions added.
Source code in bofire/utils/naming_conventions.py
def postprocess_categorical_predictions(predictions: pd.DataFrame, outputs: Outputs) -> pd.DataFrame: # type: ignore\n \"\"\"\n Postprocess categorical predictions by finding the maximum probability location\n\n Args:\n predictions (pd.DataFrame): The dataframe containing the predictions.\n outputs (Outputs): The Outputs object containing the individual outputs.\n\n Returns:\n predictions (pd.DataFrame): The (potentially modified) original dataframe with categorical predictions added\n \"\"\"\n for feat in outputs.get():\n if isinstance(feat, CategoricalOutput): # type: ignore\n predictions.insert(\n loc=0,\n column=f\"{feat.key}_pred\",\n value=predictions.filter(regex=f\"{feat.key}(.*)_prob\")\n .idxmax(1)\n .str.replace(f\"{feat.key}_\", \"\")\n .str.replace(\"_prob\", \"\")\n .values,\n )\n predictions.insert(\n loc=1,\n column=f\"{feat.key}_sd\",\n value=0.0,\n )\n return predictions\n
"},{"location":"ref-domain-util/#bofire.utils.reduce","title":"reduce
","text":""},{"location":"ref-domain-util/#bofire.utils.reduce.AffineTransform","title":" AffineTransform
","text":"Class to switch back and forth from the reduced to the original domain.
Source code in bofire/utils/reduce.py
class AffineTransform:\n \"\"\"Class to switch back and forth from the reduced to the original domain.\"\"\"\n\n def __init__(self, equalities: List[Tuple[str, List[str], List[float]]]):\n \"\"\"Initializes a `AffineTransformation` object.\n\n Args:\n equalities (List[Tuple[str,List[str],List[float]]]): List of equalities. Every equality\n is defined as a tuple, in which the first entry is the key of the reduced feature, the second\n one is a list of feature keys that can be used to compute the feature and the third list of floats\n are the corresponding coefficients.\n \"\"\"\n self.equalities = equalities\n\n def augment_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Restore the eliminated features in a dataframe\n\n Args:\n data (pd.DataFrame): Dataframe that should be restored.\n\n Returns:\n pd.DataFrame: Restored dataframe\n \"\"\"\n if len(self.equalities) == 0:\n return data\n data = data.copy()\n for name_lhs, names_rhs, coeffs in self.equalities:\n data[name_lhs] = coeffs[-1]\n for i, name in enumerate(names_rhs):\n data[name_lhs] += coeffs[i] * data[name]\n return data\n\n def drop_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Drop eliminated features from a dataframe.\n\n Args:\n data (pd.DataFrame): Dataframe with features to be dropped.\n\n Returns:\n pd.DataFrame: Reduced dataframe.\n \"\"\"\n if len(self.equalities) == 0:\n return data\n drop = []\n for name_lhs, _, _ in self.equalities:\n if name_lhs in data.columns:\n drop.append(name_lhs)\n return data.drop(columns=drop)\n
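A usage sketch: if x3 was eliminated via x3 = 1 - x1 - x2, the equality tuple lists the coefficients of x1 and x2 followed by the constant term:
import pandas as pd

from bofire.utils.reduce import AffineTransform

trafo = AffineTransform(equalities=[("x3", ["x1", "x2"], [-1.0, -1.0, 1.0])])

reduced = pd.DataFrame({"x1": [0.2, 0.5], "x2": [0.3, 0.1]})
full = trafo.augment_data(reduced)  # adds the column x3 = 1 - x1 - x2
back = trafo.drop_data(full)        # removes x3 again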
"},{"location":"ref-domain-util/#bofire.utils.reduce.AffineTransform.__init__","title":"__init__(self, equalities)
special
","text":"Initializes a AffineTransformation
object.
Parameters:
equalities (List[Tuple[str, List[str], List[float]]]): List of equalities. Every equality is defined as a tuple, in which the first entry is the key of the reduced feature, the second one is a list of feature keys that can be used to compute the feature, and the third is a list of floats with the corresponding coefficients. Required.
Source code in bofire/utils/reduce.py
def __init__(self, equalities: List[Tuple[str, List[str], List[float]]]):\n \"\"\"Initializes a `AffineTransformation` object.\n\n Args:\n equalities (List[Tuple[str,List[str],List[float]]]): List of equalities. Every equality\n is defined as a tuple, in which the first entry is the key of the reduced feature, the second\n one is a list of feature keys that can be used to compute the feature and the third list of floats\n are the corresponding coefficients.\n \"\"\"\n self.equalities = equalities\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.AffineTransform.augment_data","title":"augment_data(self, data)
","text":"Restore the eliminated features in a dataframe
Parameters:
data (pd.DataFrame): Dataframe that should be restored. Required.
Returns:
pd.DataFrame: Restored dataframe.
Source code in bofire/utils/reduce.py
def augment_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Restore the eliminated features in a dataframe\n\n Args:\n data (pd.DataFrame): Dataframe that should be restored.\n\n Returns:\n pd.DataFrame: Restored dataframe\n \"\"\"\n if len(self.equalities) == 0:\n return data\n data = data.copy()\n for name_lhs, names_rhs, coeffs in self.equalities:\n data[name_lhs] = coeffs[-1]\n for i, name in enumerate(names_rhs):\n data[name_lhs] += coeffs[i] * data[name]\n return data\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.AffineTransform.drop_data","title":"drop_data(self, data)
","text":"Drop eliminated features from a dataframe.
Parameters:
data (pd.DataFrame): Dataframe with features to be dropped. Required.
Returns:
pd.DataFrame: Reduced dataframe.
Source code in bofire/utils/reduce.py
def drop_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Drop eliminated features from a dataframe.\n\n Args:\n data (pd.DataFrame): Dataframe with features to be dropped.\n\n Returns:\n pd.DataFrame: Reduced dataframe.\n \"\"\"\n if len(self.equalities) == 0:\n return data\n drop = []\n for name_lhs, _, _ in self.equalities:\n if name_lhs in data.columns:\n drop.append(name_lhs)\n return data.drop(columns=drop)\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.adjust_boundary","title":"adjust_boundary(feature, coef, rhs)
","text":"Adjusts the boundaries of a feature.
Parameters:
feature (ContinuousInput): Feature to be adjusted. Required.
coef (float): Coefficient. Required.
rhs (float): Right-hand side of the constraint. Required.
Source code in bofire/utils/reduce.py
def adjust_boundary(feature: ContinuousInput, coef: float, rhs: float):\n \"\"\"Adjusts the boundaries of a feature.\n\n Args:\n feature (ContinuousInput): Feature to be adjusted.\n coef (float): Coefficient.\n rhs (float): Right-hand-side of the constraint.\n \"\"\"\n boundary = rhs / coef\n if coef > 0:\n if boundary > feature.lower_bound:\n feature.bounds = (boundary, feature.upper_bound)\n else:\n if boundary < feature.upper_bound:\n feature.bounds = (feature.lower_bound, boundary)\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.check_domain_for_reduction","title":"check_domain_for_reduction(domain)
","text":"Check if the reduction can be applied or if a trivial case is present.
Parameters:
domain (Domain): Domain to be checked. Required.
Returns:
bool: True if reducible, else False.
Source code in bofire/utils/reduce.py
def check_domain_for_reduction(domain: Domain) -> bool:\n \"\"\"Check if the reduction can be applied or if a trivial case is present.\n\n Args:\n domain (Domain): Domain to be checked.\n Returns:\n bool: True if reducable, else False.\n \"\"\"\n # are there any constraints?\n if len(domain.constraints) == 0:\n return False\n\n # are there any linear equality constraints?\n linear_equalities = domain.constraints.get(LinearEqualityConstraint)\n if len(linear_equalities) == 0:\n return False\n\n # are there no NChooseKConstraint constraints?\n if len(domain.constraints.get([NChooseKConstraint])) > 0:\n return False\n\n # are there continuous inputs\n continuous_inputs = domain.inputs.get(ContinuousInput)\n if len(continuous_inputs) == 0:\n return False\n\n # check that equality constraints only contain continuous inputs\n for c in linear_equalities:\n assert isinstance(c, LinearConstraint)\n for feat in c.features:\n if feat not in domain.inputs.get_keys(ContinuousInput):\n return False\n return True\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.check_existence_of_solution","title":"check_existence_of_solution(A_aug)
","text":"Given an augmented coefficient matrix this function determines the existence (and uniqueness) of solution using the rank theorem.
Source code inbofire/utils/reduce.py
def check_existence_of_solution(A_aug):\n \"\"\"Given an augmented coefficient matrix this function determines the existence (and uniqueness) of solution using the rank theorem.\"\"\"\n A = A_aug[:, :-1]\n b = A_aug[:, -1]\n len_inputs = np.shape(A)[1]\n\n # catch special cases\n rk_A_aug = np.linalg.matrix_rank(A_aug)\n rk_A = np.linalg.matrix_rank(A)\n\n if rk_A == rk_A_aug:\n if rk_A < len_inputs:\n return # all good\n else:\n x = np.linalg.solve(A, b)\n raise Exception(\n f\"There is a unique solution x for the linear equality constraints: x={x}\"\n )\n elif rk_A < rk_A_aug:\n raise Exception(\n \"There is no solution fulfilling the linear equality constraints.\"\n )\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.reduce_domain","title":"reduce_domain(domain)
","text":"Reduce a domain with linear equality constraints to a subdomain where linear equality constraints are eliminated.
Parameters:
domain (Domain): Domain to be reduced. Required.
Returns:
Tuple[Domain, AffineTransform]: The reduced domain and the corresponding transformation to switch between the reduced and the original domain.
Source code in bofire/utils/reduce.py
def reduce_domain(domain: Domain) -> Tuple[Domain, AffineTransform]:\n \"\"\"Reduce a domain with linear equality constraints to a subdomain where linear equality constraints are eliminated.\n\n Args:\n domain (Domain): Domain to be reduced.\n\n Returns:\n Tuple[Domain, AffineTransform]: reduced domain and the according transformation to switch between the\n reduced and orginal domain.\n \"\"\"\n # check if the domain can be reduced\n if not check_domain_for_reduction(domain):\n return domain, AffineTransform([])\n\n # find linear equality constraints\n linear_equalities = domain.constraints.get(LinearEqualityConstraint)\n other_constraints = domain.constraints.get(\n Constraint, excludes=[LinearEqualityConstraint]\n )\n\n # only consider continuous inputs\n continuous_inputs = [\n cast(ContinuousInput, f) for f in domain.inputs.get(ContinuousInput)\n ]\n other_inputs = domain.inputs.get(Input, excludes=[ContinuousInput])\n\n # assemble Matrix A from equality constraints\n N = len(linear_equalities)\n M = len(continuous_inputs) + 1\n names = np.concatenate(([feat.key for feat in continuous_inputs], [\"rhs\"]))\n\n A_aug = pd.DataFrame(data=np.zeros(shape=(N, M)), columns=names)\n\n for i in range(len(linear_equalities)):\n c = linear_equalities[i]\n assert isinstance(c, LinearEqualityConstraint)\n A_aug.loc[i, c.features] = c.coefficients # type: ignore\n A_aug.loc[i, \"rhs\"] = c.rhs\n A_aug = A_aug.values\n\n # catch special cases\n check_existence_of_solution(A_aug)\n\n # bring A_aug to reduced row-echelon form\n A_aug_rref, pivots = rref(A_aug)\n pivots = np.array(pivots)\n A_aug_rref = np.array(A_aug_rref).astype(np.float64)\n\n # formulate box bounds as linear inequality constraints in matrix form\n B = np.zeros(shape=(2 * (M - 1), M))\n B[: M - 1, : M - 1] = np.eye(M - 1)\n B[M - 1 :, : M - 1] = -np.eye(M - 1)\n\n B[: M - 1, -1] = np.array([feat.upper_bound for feat in continuous_inputs])\n B[M - 1 :, -1] = -1.0 * np.array([feat.lower_bound for feat in continuous_inputs])\n\n # eliminate columns with pivot element\n for i in range(len(pivots)):\n p = pivots[i]\n B[p, :] -= A_aug_rref[i, :]\n B[p + M - 1, :] += A_aug_rref[i, :]\n\n # build up reduced domain\n _domain = Domain.model_construct(\n # _fields_set = {\"inputs\", \"outputs\", \"constraints\"}\n inputs=deepcopy(other_inputs),\n outputs=deepcopy(domain.outputs),\n constraints=deepcopy(other_constraints),\n )\n new_inputs = [\n deepcopy(feat) for i, feat in enumerate(continuous_inputs) if i not in pivots\n ]\n all_inputs = _domain.inputs + new_inputs\n assert isinstance(all_inputs, Inputs)\n _domain.inputs.features = all_inputs.features\n\n constraints: List[AnyConstraint] = []\n for i in pivots:\n # reduce equation system of upper bounds\n ind = np.where(B[i, :-1] != 0)[0]\n if len(ind) > 0 and B[i, -1] < np.inf:\n if len(list(names[ind])) > 1:\n c = LinearInequalityConstraint.from_greater_equal(\n features=list(names[ind]),\n coefficients=(-1.0 * B[i, ind]).tolist(),\n rhs=B[i, -1] * -1.0,\n )\n constraints.append(c)\n else:\n key = names[ind][0]\n feat = cast(ContinuousInput, _domain.inputs.get_by_key(key))\n adjust_boundary(feat, (-1.0 * B[i, ind])[0], B[i, -1] * -1.0)\n else:\n if B[i, -1] < -1e-16:\n raise Exception(\"There is no solution that fulfills the constraints.\")\n\n # reduce equation system of lower bounds\n ind = np.where(B[i + M - 1, :-1] != 0)[0]\n if len(ind) > 0 and B[i + M - 1, -1] < np.inf:\n if len(list(names[ind])) > 1:\n c = LinearInequalityConstraint.from_greater_equal(\n features=list(names[ind]),\n 
coefficients=(-1.0 * B[i + M - 1, ind]).tolist(),\n rhs=B[i + M - 1, -1] * -1.0,\n )\n constraints.append(c)\n else:\n key = names[ind][0]\n feat = cast(ContinuousInput, _domain.inputs.get_by_key(key))\n adjust_boundary(\n feat,\n (-1.0 * B[i + M - 1, ind])[0],\n B[i + M - 1, -1] * -1.0,\n )\n else:\n if B[i + M - 1, -1] < -1e-16:\n raise Exception(\"There is no solution that fulfills the constraints.\")\n\n if len(constraints) > 0:\n _domain.constraints.constraints = _domain.constraints.constraints + constraints # type: ignore\n\n # assemble equalities\n _equalities = []\n for i in range(len(pivots)):\n name_lhs = names[pivots[i]]\n names_rhs = []\n coeffs = []\n\n for j in range(len(names) - 1):\n if A_aug_rref[i, j] != 0 and j != pivots[i]:\n coeffs.append(-A_aug_rref[i, j])\n names_rhs.append(names[j])\n\n coeffs.append(A_aug_rref[i, -1])\n\n _equalities.append((name_lhs, names_rhs, coeffs))\n\n trafo = AffineTransform(_equalities)\n # remove remaining dependencies of eliminated inputs from the problem\n _domain = remove_eliminated_inputs(_domain, trafo)\n return _domain, trafo\n
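A sketch of a typical call; the api import paths and the Domain/feature constructor keywords are assumptions based on common BoFire usage:
from bofire.data_models.constraints.api import LinearEqualityConstraint
from bofire.data_models.domain.api import Constraints, Domain, Inputs, Outputs
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.reduce import reduce_domain

# mixture-type domain: x1 + x2 + x3 = 1 allows one input to be eliminated
domain = Domain(
    inputs=Inputs(
        features=[ContinuousInput(key=k, bounds=(0, 1)) for k in ["x1", "x2", "x3"]]
    ),
    outputs=Outputs(features=[ContinuousOutput(key="y")]),
    constraints=Constraints(
        constraints=[
            LinearEqualityConstraint(
                features=["x1", "x2", "x3"], coefficients=[1.0, 1.0, 1.0], rhs=1.0
            )
        ]
    ),
)

reduced_domain, trafo = reduce_domain(domain)
print(reduced_domain.inputs.get_keys())  # one continuous input eliminated
# trafo.augment_data(...) restores the eliminated column in candidate dataframes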
"},{"location":"ref-domain-util/#bofire.utils.reduce.remove_eliminated_inputs","title":"remove_eliminated_inputs(domain, transform)
","text":"Eliminates remaining occurences of eliminated inputs in linear constraints.
Parameters:
domain (Domain): Domain in which the linear constraints should be purged. Required.
transform (AffineTransform): Affine transformation object that defines the obsolete features. Required.
Exceptions:
ValueError: If a feature occurs in a constraint other than a linear one.
Returns:
Domain: Purged domain.
Source code in bofire/utils/reduce.py
def remove_eliminated_inputs(domain: Domain, transform: AffineTransform) -> Domain:\n \"\"\"Eliminates remaining occurences of eliminated inputs in linear constraints.\n\n Args:\n domain (Domain): Domain in which the linear constraints should be purged.\n transform (AffineTransform): Affine transformation object that defines the obsolete features.\n\n Raises:\n ValueError: If feature occurs in a constraint different from a linear one.\n\n Returns:\n Domain: Purged domain.\n \"\"\"\n inputs_names = domain.inputs.get_keys()\n M = len(inputs_names)\n\n # write the equalities for the backtransformation into one matrix\n inputs_dict = {inputs_names[i]: i for i in range(M)}\n\n # build up dict from domain.equalities e.g. {\"xi1\": [coeff(xj1), ..., coeff(xjn)], ... \"xik\":...}\n coeffs_dict = {}\n for e in transform.equalities:\n coeffs = np.zeros(M + 1)\n for j, name in enumerate(e[1]):\n coeffs[inputs_dict[name]] = e[2][j]\n coeffs[-1] = e[2][-1]\n coeffs_dict[e[0]] = coeffs\n\n constraints = []\n for c in domain.constraints.get():\n # Nonlinear constraints not supported\n if not isinstance(c, LinearConstraint):\n raise ValueError(\n \"Elimination of variables is only supported for LinearEquality and LinearInequality constraints.\"\n )\n\n # no changes, if the constraint does not contain eliminated inputs\n elif all(name in inputs_names for name in c.features):\n constraints.append(c)\n\n # remove inputs from the constraint that were eliminated from the inputs before\n else:\n totally_removed = False\n _features = np.array(inputs_names)\n _rhs = c.rhs\n\n # create new lhs and rhs from the old one and knowledge from problem._equalities\n _coefficients = np.zeros(M)\n for j, name in enumerate(c.features):\n if name in inputs_names:\n _coefficients[inputs_dict[name]] += c.coefficients[j]\n else:\n _coefficients += c.coefficients[j] * coeffs_dict[name][:-1]\n _rhs -= c.coefficients[j] * coeffs_dict[name][-1]\n\n _features = _features[np.abs(_coefficients) > 1e-16]\n _coefficients = _coefficients[np.abs(_coefficients) > 1e-16]\n _c = None\n if isinstance(c, LinearEqualityConstraint):\n if len(_features) > 1:\n _c = LinearEqualityConstraint(\n features=_features.tolist(),\n coefficients=_coefficients.tolist(),\n rhs=_rhs,\n )\n elif len(_features) == 0:\n totally_removed = True\n else:\n feat: ContinuousInput = ContinuousInput(\n **domain.inputs.get_by_key(_features[0]).model_dump()\n )\n feat.bounds = (_coefficients[0], _coefficients[0])\n totally_removed = True\n else:\n if len(_features) > 1:\n _c = LinearInequalityConstraint(\n features=_features.tolist(),\n coefficients=_coefficients.tolist(),\n rhs=_rhs,\n )\n elif len(_features) == 0:\n totally_removed = True\n else:\n feat = cast(ContinuousInput, domain.inputs.get_by_key(_features[0]))\n adjust_boundary(feat, _coefficients[0], _rhs)\n totally_removed = True\n\n # check if constraint is always fulfilled/not fulfilled\n if not totally_removed:\n assert _c is not None\n if len(_c.features) == 0 and _c.rhs >= 0:\n pass\n elif len(_c.features) == 0 and _c.rhs < 0:\n raise Exception(\"Linear constraints cannot be fulfilled.\")\n elif np.isinf(_c.rhs):\n pass\n else:\n constraints.append(_c)\n domain.constraints = Constraints(constraints=constraints)\n return domain\n
"},{"location":"ref-domain-util/#bofire.utils.reduce.rref","title":"rref(A, tol=1e-08)
","text":"Computes the reduced row echelon form of a Matrix
Parameters:
A (ndarray): 2d array representing a matrix. Required.
tol (float): Tolerance for rounding to 0. Defaults to 1e-8.
Returns:
Tuple[numpy.ndarray, List[int]]: (A_rref, pivots), where A_rref is the reduced row echelon form of A and pivots is a list containing the pivot columns of A_rref.
Source code in bofire/utils/reduce.py
def rref(A: np.ndarray, tol: float = 1e-8) -> Tuple[np.ndarray, List[int]]:\n \"\"\"Computes the reduced row echelon form of a Matrix\n\n Args:\n A (ndarray): 2d array representing a matrix.\n tol (float, optional): tolerance for rounding to 0. Defaults to 1e-8.\n\n Returns:\n (A_rref, pivots), where A_rref is the reduced row echelon form of A and pivots\n is a numpy array containing the pivot columns of A_rref\n \"\"\"\n A = np.array(A, dtype=np.float64)\n n, m = np.shape(A)\n\n col = 0\n row = 0\n pivots = []\n\n for col in range(m):\n # does a pivot element exist?\n if all(np.abs(A[row:, col]) < tol):\n pass\n # if yes: start elimination\n else:\n pivots.append(col)\n max_row = np.argmax(np.abs(A[row:, col])) + row\n # switch to most stable row\n A[[row, max_row], :] = A[[max_row, row], :] # type: ignore\n # normalize row\n A[row, :] /= A[row, col]\n # eliminate other elements from column\n for r in range(n):\n if r != row:\n A[r, :] -= A[r, col] / A[row, col] * A[row, :]\n row += 1\n\n prec = int(-np.log10(tol))\n return np.round(A, prec), pivots\n
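A quick numerical check:
import numpy as np

from bofire.utils.reduce import rref

A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
A_rref, pivots = rref(A)
print(A_rref)   # [[ 1.  0. -1.]
                #  [ 0.  1.  2.]]
print(pivots)   # [0, 1]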
"},{"location":"ref-domain-util/#bofire.utils.subdomain","title":"subdomain
","text":""},{"location":"ref-domain-util/#bofire.utils.subdomain.get_subdomain","title":"get_subdomain(domain, feature_keys)
","text":"removes all features not defined as argument creating a subdomain of the provided domain
Parameters:
domain (Domain): the original domain from which a subdomain should be created. Required.
feature_keys (List): List of features that shall be included in the subdomain. Required.
Exceptions:
Assert: when in total less than 2 features are provided
ValueError: when a provided feature key is not present in the provided domain
Assert: when no output feature is provided
Assert: when no input feature is provided
ValueError: when a removed input feature is still used in a constraint
Returns:
Domain: A new domain containing only parts of the original domain.
Source code in bofire/utils/subdomain.py
def get_subdomain(\n domain: Domain,\n feature_keys: List,\n) -> Domain:\n \"\"\"removes all features not defined as argument creating a subdomain of the provided domain\n\n Args:\n domain (Domain): the original domain wherefrom a subdomain should be created\n feature_keys (List): List of features that shall be included in the subdomain\n\n Raises:\n Assert: when in total less than 2 features are provided\n ValueError: when a provided feature key is not present in the provided domain\n Assert: when no output feature is provided\n Assert: when no input feature is provided\n ValueError: _description_\n\n Returns:\n Domain: A new domain containing only parts of the original domain\n \"\"\"\n assert len(feature_keys) >= 2, \"At least two features have to be provided.\"\n outputs = []\n inputs = []\n for key in feature_keys:\n try:\n feat = (domain.inputs + domain.outputs).get_by_key(key)\n except KeyError:\n raise ValueError(f\"Feature {key} not present in domain.\")\n if isinstance(feat, Input):\n inputs.append(feat)\n else:\n outputs.append(feat)\n assert len(outputs) > 0, \"At least one output feature has to be provided.\"\n assert len(inputs) > 0, \"At least one input feature has to be provided.\"\n inputs = Inputs(features=inputs)\n outputs = Outputs(features=outputs)\n # loop over constraints and make sure that all features used in constraints are in the input_feature_keys\n for c in domain.constraints:\n for key in c.features: # type: ignore\n if key not in inputs.get_keys():\n raise ValueError(\n f\"Removed input feature {key} is used in a constraint.\"\n )\n subdomain = deepcopy(domain)\n subdomain.inputs = inputs\n subdomain.outputs = outputs\n return subdomain\n
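A sketch under the same construction assumptions as in the reduce_domain example (a constraint-free domain, since dropped inputs must not appear in constraints):
from bofire.data_models.domain.api import Domain, Inputs, Outputs
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.subdomain import get_subdomain

domain = Domain(
    inputs=Inputs(
        features=[ContinuousInput(key=k, bounds=(0, 1)) for k in ["x1", "x2", "x3"]]
    ),
    outputs=Outputs(features=[ContinuousOutput(key="y")]),
)

sub = get_subdomain(domain, feature_keys=["x1", "x2", "y"])
print(sub.inputs.get_keys())  # ['x1', 'x2']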
"},{"location":"ref-domain-util/#bofire.utils.torch_tools","title":"torch_tools
","text":""},{"location":"ref-domain-util/#bofire.utils.torch_tools.constrained_objective2botorch","title":"constrained_objective2botorch(idx, objective, eps=1e-08)
","text":"Create a callable that can be used by botorch.utils.objective.apply_constraints
to setup ouput constrained optimizations.
Parameters:
idx (int): Index of the constraint objective in the list of outputs. Required.
objective (ConstrainedObjective): The objective that should be transformed. Required.
Returns:
Tuple[List[Callable[[Tensor], Tensor]], List[float], int]: List of callables that can be used by botorch for setting up the constrained objective, list of the corresponding botorch eta values, and the final index used by the method (to track categorical variables).
Source code in bofire/utils/torch_tools.py
def constrained_objective2botorch(\n idx: int, objective: ConstrainedObjective, eps: float = 1e-8\n) -> Tuple[List[Callable[[Tensor], Tensor]], List[float], int]:\n \"\"\"Create a callable that can be used by `botorch.utils.objective.apply_constraints`\n to setup ouput constrained optimizations.\n\n Args:\n idx (int): Index of the constraint objective in the list of outputs.\n objective (BotorchConstrainedObjective): The objective that should be transformed.\n\n Returns:\n Tuple[List[Callable[[Tensor], Tensor]], List[float], int]: List of callables that can be used by botorch for setting up the constrained objective,\n list of the corresponding botorch eta values, final index used by the method (to track for categorical variables)\n \"\"\"\n assert isinstance(\n objective, ConstrainedObjective\n ), \"Objective is not a `ConstrainedObjective`.\"\n if isinstance(objective, MaximizeSigmoidObjective):\n return (\n [lambda Z: (Z[..., idx] - objective.tp) * -1.0],\n [1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, MinimizeSigmoidObjective):\n return (\n [lambda Z: (Z[..., idx] - objective.tp)],\n [1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, TargetObjective):\n return (\n [\n lambda Z: (Z[..., idx] - (objective.target_value - objective.tolerance))\n * -1.0,\n lambda Z: (\n Z[..., idx] - (objective.target_value + objective.tolerance)\n ),\n ],\n [1.0 / objective.steepness, 1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, ConstrainedCategoricalObjective):\n # The output of a categorical objective has final dim `c` where `c` is number of classes\n # Pass in the expected acceptance probability and perform an inverse sigmoid to atain the original probabilities\n return (\n [\n lambda Z: torch.log(\n 1\n / torch.clamp(\n (\n Z[..., idx : idx + len(objective.desirability)]\n * torch.tensor(objective.desirability).to(**tkwargs)\n ).sum(-1),\n min=eps,\n max=1 - eps,\n )\n - 1,\n )\n ],\n [1.0],\n idx + len(objective.desirability),\n )\n else:\n raise ValueError(f\"Objective {objective.__class__.__name__} not known.\")\n
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_initial_conditions_generator","title":"get_initial_conditions_generator(strategy, transform_specs, ask_options=None, sequential=True)
","text":"Takes a strategy object and returns a callable which uses this strategy to return a generator callable which can be used in botorchs
gen_batch_initial_conditions` to generate samples.
Parameters:
strategy (Strategy): Strategy that should be used to generate samples. Required.
transform_specs (Dict): Dictionary indicating how the samples should be transformed. Required.
ask_options (Dict): Dictionary of keyword arguments that are passed to the ask method of the strategy. Defaults to {}.
sequential (bool): If True, samples for every q-batch are generated independently from each other. If False, the n x q samples are generated at once. Defaults to True.
Returns:
Callable[[int, int, int], Tensor]: Callable that can be passed to batch_initial_conditions.
Source code in bofire/utils/torch_tools.py
def get_initial_conditions_generator(\n strategy: Strategy,\n transform_specs: Dict,\n ask_options: Optional[Dict] = None,\n sequential: bool = True,\n) -> Callable[[int, int, int], Tensor]:\n \"\"\"Takes a strategy object and returns a callable which uses this\n strategy to return a generator callable which can be used in botorch`s\n `gen_batch_initial_conditions` to generate samples.\n\n Args:\n strategy (Strategy): Strategy that should be used to generate samples.\n transform_specs (Dict): Dictionary indicating how the samples should be\n transformed.\n ask_options (Dict, optional): Dictionary of keyword arguments that are\n passed to the `ask` method of the strategy. Defaults to {}.\n sequential (bool, optional): If True, samples for every q-batch are\n generate indepenent from each other. If False, the `n x q` samples\n are generated at once.\n\n Returns:\n Callable[[int, int, int], Tensor]: Callable that can be passed to\n `batch_initial_conditions`.\n \"\"\"\n if ask_options is None:\n ask_options = {}\n\n def generator(n: int, q: int, seed: int) -> Tensor:\n if sequential:\n initial_conditions = []\n for _ in range(n):\n candidates = strategy.ask(q, **ask_options)\n # transform it\n transformed_candidates = strategy.domain.inputs.transform(\n candidates, transform_specs\n )\n # transform to tensor\n initial_conditions.append(\n torch.from_numpy(transformed_candidates.values).to(**tkwargs)\n )\n return torch.stack(initial_conditions, dim=0)\n else:\n candidates = strategy.ask(n * q, **ask_options)\n # transform it\n transformed_candidates = strategy.domain.inputs.transform(\n candidates, transform_specs\n )\n return (\n torch.from_numpy(transformed_candidates.values)\n .to(**tkwargs)\n .reshape(n, q, transformed_candidates.shape[1])\n )\n\n return generator\n
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_interpoint_constraints","title":"get_interpoint_constraints(domain, n_candidates)
","text":"Converts interpoint equality constraints to linear equality constraints, that can be processed by botorch. For more information, see the docstring of optimize_acqf
in botorch (https://github.com/pytorch/botorch/blob/main/botorch/optim/optimize.py).
Parameters:
domain (Domain): Optimization problem definition. Required.
n_candidates (int): Number of candidates that should be requested. Required.
Returns:
List[Tuple[Tensor, Tensor, float]]: List of tuples; each tuple consists of a tensor with the feature indices, the coefficients, and a float for the rhs.
Source code in bofire/utils/torch_tools.py
def get_interpoint_constraints(\n domain: Domain, n_candidates: int\n) -> List[Tuple[Tensor, Tensor, float]]:\n \"\"\"Converts interpoint equality constraints to linear equality constraints,\n that can be processed by botorch. For more information, see the docstring\n of `optimize_acqf` in botorch\n (https://github.com/pytorch/botorch/blob/main/botorch/optim/optimize.py).\n\n Args:\n domain (Domain): Optimization problem definition.\n n_candidates (int): Number of candidates that should be requested.\n\n Returns:\n List[Tuple[Tensor, Tensor, float]]: List of tuples, each tuple consists\n of a tensor with the feature indices, coefficients and a float for the rhs.\n \"\"\"\n constraints = []\n for constraint in domain.constraints.get(InterpointEqualityConstraint):\n assert isinstance(constraint, InterpointEqualityConstraint)\n coefficients = torch.tensor([1.0, -1.0]).to(**tkwargs)\n feat_idx = domain.inputs.get_keys(Input).index(constraint.feature)\n feat = domain.inputs.get_by_key(constraint.feature)\n assert isinstance(feat, ContinuousInput)\n if feat.is_fixed():\n continue\n multiplicity = constraint.multiplicity or n_candidates\n for i in range(math.ceil(n_candidates / multiplicity)):\n all_indices = torch.arange(\n i * multiplicity, min((i + 1) * multiplicity, n_candidates)\n )\n for k in range(len(all_indices) - 1):\n indices = torch.tensor(\n [[all_indices[0], feat_idx], [all_indices[k + 1], feat_idx]],\n dtype=torch.int64,\n )\n constraints.append((indices, coefficients, 0.0))\n return constraints\n
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_linear_constraints","title":"get_linear_constraints(domain, constraint, unit_scaled=False)
","text":"Converts linear constraints to the form required by BoTorch.
Parameters:
domain (Domain): Optimization problem definition. Required.
constraint (Union[Type[LinearEqualityConstraint], Type[LinearInequalityConstraint]]): Type of constraint that should be converted. Required.
unit_scaled (bool): If True, transforms constraints by assuming that the bounds of the continuous features are [0, 1]. Defaults to False.
Returns:
List[Tuple[Tensor, Tensor, float]]: List of tuples; each tuple consists of a tensor with the feature indices, the coefficients, and a float for the rhs.
Source code in bofire/utils/torch_tools.py
def get_linear_constraints(\n domain: Domain,\n constraint: Union[Type[LinearEqualityConstraint], Type[LinearInequalityConstraint]],\n unit_scaled: bool = False,\n) -> List[Tuple[Tensor, Tensor, float]]:\n \"\"\"Converts linear constraints to the form required by BoTorch.\n\n Args:\n domain: Optimization problem definition.\n constraint: Type of constraint that should be converted.\n unit_scaled: If True, transforms constraints by assuming that the bound for the continuous features are [0,1]. Defaults to False.\n\n Returns:\n List[Tuple[Tensor, Tensor, float]]: List of tuples, each tuple consists of a tensor with the feature indices, coefficients and a float for the rhs.\n \"\"\"\n constraints = []\n for c in domain.constraints.get(constraint):\n indices = []\n coefficients = []\n lower = []\n upper = []\n rhs = 0.0\n for i, featkey in enumerate(c.features): # type: ignore\n idx = domain.inputs.get_keys(Input).index(featkey)\n feat = domain.inputs.get_by_key(featkey)\n if feat.is_fixed(): # type: ignore\n rhs -= feat.fixed_value()[0] * c.coefficients[i] # type: ignore\n else:\n lower.append(feat.lower_bound) # type: ignore\n upper.append(feat.upper_bound) # type: ignore\n indices.append(idx)\n coefficients.append(\n c.coefficients[i] # type: ignore\n ) # if unit_scaled == False else c_scaled.coefficients[i])\n if unit_scaled:\n lower = np.array(lower)\n upper = np.array(upper)\n s = upper - lower\n scaled_coefficients = s * np.array(coefficients)\n constraints.append(\n (\n torch.tensor(indices),\n -torch.tensor(scaled_coefficients).to(**tkwargs),\n -(rhs + c.rhs - np.sum(np.array(coefficients) * lower)), # type: ignore\n )\n )\n else:\n constraints.append(\n (\n torch.tensor(indices),\n -torch.tensor(coefficients).to(**tkwargs),\n -(rhs + c.rhs), # type: ignore\n )\n )\n return constraints\n
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_multiobjective_objective","title":"get_multiobjective_objective(outputs)
","text":"Returns
Parameters:
outputs (Outputs): Output features whose objectives should be evaluated. Required.
Returns:
Callable[[Tensor], Tensor]: Callable that maps a tensor of samples to the stacked objective values.
Source code in bofire/utils/torch_tools.py
def get_multiobjective_objective(\n outputs: Outputs,\n) -> Callable[[Tensor, Optional[Tensor]], Tensor]:\n \"\"\"Returns\n\n Args:\n outputs (Outputs): _description_\n\n Returns:\n Callable[[Tensor], Tensor]: _description_\n \"\"\"\n callables = [\n get_objective_callable(idx=i, objective=feat.objective) # type: ignore\n for i, feat in enumerate(outputs.get())\n if feat.objective is not None # type: ignore\n and isinstance(\n feat.objective, # type: ignore\n (MaximizeObjective, MinimizeObjective, CloseToTargetObjective),\n )\n ]\n\n def objective(samples: Tensor, X: Optional[Tensor] = None) -> Tensor:\n return torch.stack([c(samples, None) for c in callables], dim=-1)\n\n return objective\n
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_nchoosek_constraints","title":"get_nchoosek_constraints(domain)
","text":"Transforms NChooseK constraints into a list of non-linear inequality constraint callables that can be parsed by pydantic. For this purpose the NChooseK constraint is continuously relaxed by countig the number of zeros in a candidate by a sum of narrow gaussians centered at zero.
Parameters:
domain (Domain): Optimization problem definition. Required.
Returns:
List[Callable[[Tensor], float]]: List of callables that can be used as nonlinear inequality constraints in botorch.
Source code in bofire/utils/torch_tools.py
def get_nchoosek_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"Transforms NChooseK constraints into a list of non-linear inequality constraint callables\n that can be parsed by pydantic. For this purpose the NChooseK constraint is continuously\n relaxed by countig the number of zeros in a candidate by a sum of narrow gaussians centered\n at zero.\n\n Args:\n domain (Domain): Optimization problem definition.\n\n Returns:\n List[Callable[[Tensor], float]]: List of callables that can be used\n as nonlinear equality constraints in botorch.\n \"\"\"\n\n def narrow_gaussian(x, ell=1e-3):\n return torch.exp(-0.5 * (x / ell) ** 2)\n\n def max_constraint(indices: Tensor, num_features: int, max_count: int):\n return lambda x: narrow_gaussian(x=x[..., indices]).sum(dim=-1) - (\n num_features - max_count\n )\n\n def min_constraint(indices: Tensor, num_features: int, min_count: int):\n return lambda x: -narrow_gaussian(x=x[..., indices]).sum(dim=-1) + (\n num_features - min_count\n )\n\n constraints = []\n # ignore none also valid for the start\n for c in domain.constraints.get(NChooseKConstraint):\n assert isinstance(c, NChooseKConstraint)\n indices = torch.tensor(\n [domain.inputs.get_keys(ContinuousInput).index(key) for key in c.features],\n dtype=torch.int64,\n )\n if c.max_count != len(c.features):\n constraints.append(\n max_constraint(\n indices=indices, num_features=len(c.features), max_count=c.max_count\n )\n )\n if c.min_count > 0:\n constraints.append(\n min_constraint(\n indices=indices, num_features=len(c.features), min_count=c.min_count\n )\n )\n return constraints\n
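A sketch of the relaxation in action, under the same construction assumptions as above: with max_count=2 out of four features, a candidate with exactly two (near-)zero entries evaluates to roughly 0, and botorch treats values >= 0 as feasible:
import torch

from bofire.data_models.constraints.api import NChooseKConstraint
from bofire.data_models.domain.api import Constraints, Domain, Inputs, Outputs
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.utils.torch_tools import get_nchoosek_constraints

domain = Domain(
    inputs=Inputs(
        features=[ContinuousInput(key=f"x{i}", bounds=(0, 1)) for i in range(4)]
    ),
    outputs=Outputs(features=[ContinuousOutput(key="y")]),
    constraints=Constraints(
        constraints=[
            NChooseKConstraint(
                features=[f"x{i}" for i in range(4)],
                min_count=0,
                max_count=2,
                none_also_valid=True,
            )
        ]
    ),
)

callables = get_nchoosek_constraints(domain)
x = torch.tensor([[0.0, 0.0, 0.7, 0.3]])
print(callables[0](x))  # ~0.0: exactly two active features, constraint satisfied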
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_nonlinear_constraints","title":"get_nonlinear_constraints(domain)
","text":"Returns a list of callable functions that represent the nonlinear constraints for the given domain that can be processed by botorch.
Parameters:
Name Type Description Defaultdomain
Domain
The domain for which to generate the nonlinear constraints.
requiredReturns:
Type DescriptionList[Callable[[Tensor], float]]
A list of callable functions that take a tensor as input and return a float value representing the constraint evaluation.
Source code inbofire/utils/torch_tools.py
def get_nonlinear_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"\n Returns a list of callable functions that represent the nonlinear constraints\n for the given domain that can be processed by botorch.\n\n Parameters:\n domain (Domain): The domain for which to generate the nonlinear constraints.\n\n Returns:\n List[Callable[[Tensor], float]]: A list of callable functions that take a tensor\n as input and return a float value representing the constraint evaluation.\n \"\"\"\n return get_nchoosek_constraints(domain) + get_product_constraints(domain)\n
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_output_constraints","title":"get_output_constraints(outputs)
","text":"Method to translate output constraint objectives into a list of callables and list of etas for use in botorch.
Parameters:
Name Type Description Defaultoutputs
Outputs
Output feature object that should be processed.
requiredReturns:
Type DescriptionTuple[List[Callable[[Tensor], Tensor]], List[float]]
List of constraint callables, list of associated etas.
Source code inbofire/utils/torch_tools.py
def get_output_constraints(\n outputs: Outputs,\n) -> Tuple[List[Callable[[Tensor], Tensor]], List[float]]:\n \"\"\"Method to translate output constraint objectives into a list of\n callables and list of etas for use in botorch.\n\n Args:\n outputs (Outputs): Output feature object that should\n be processed.\n\n Returns:\n Tuple[List[Callable[[Tensor], Tensor]], List[float]]: List of constraint callables,\n list of associated etas.\n \"\"\"\n constraints = []\n etas = []\n idx = 0\n for feat in outputs.get():\n if isinstance(feat.objective, ConstrainedObjective): # type: ignore\n iconstraints, ietas, idx = constrained_objective2botorch(\n idx,\n objective=feat.objective, # type: ignore\n )\n constraints += iconstraints\n etas += ietas\n else:\n idx += 1\n return constraints, etas\n
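A usage sketch (hypothetical setup; MaximizeSigmoidObjective is one of the ConstrainedObjective subtypes this translation applies to):

from bofire.data_models.domain.api import Outputs
from bofire.data_models.features.api import ContinuousOutput
from bofire.data_models.objectives.api import MaximizeSigmoidObjective
from bofire.utils.torch_tools import get_output_constraints

# one output whose objective acts as an output constraint
outputs = Outputs(
    features=[
        ContinuousOutput(
            key="purity",
            objective=MaximizeSigmoidObjective(w=1.0, tp=0.9, steepness=100.0),
        )
    ]
)
constraints, etas = get_output_constraints(outputs)
# the two lists line up index-by-index and can be handed to BoTorch's
# constrained acquisition functions as their `constraints`/`eta` arguments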
"},{"location":"ref-domain-util/#bofire.utils.torch_tools.get_product_constraints","title":"get_product_constraints(domain)
","text":"Returns a list of nonlinear constraint functions that can be processed by botorch based on the given domain.
Parameters:
Name Type Description Defaultdomain
Domain
The domain object containing the constraints.
requiredReturns:
Type DescriptionList[Callable[[Tensor], float]]
A list of product constraint functions.
Source code inbofire/utils/torch_tools.py
def get_product_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"\n Returns a list of nonlinear constraint functions that can be processed by botorch\n based on the given domain.\n\n Args:\n domain (Domain): The domain object containing the constraints.\n\n Returns:\n List[Callable[[Tensor], float]]: A list of product constraint functions.\n\n \"\"\"\n\n def product_constraint(indices: Tensor, exponents: Tensor, rhs: float, sign: int):\n return lambda x: -1.0 * sign * (x[..., indices] ** exponents).prod(dim=-1) + rhs\n\n constraints = []\n for c in domain.constraints.get(ProductInequalityConstraint):\n assert isinstance(c, ProductInequalityConstraint)\n indices = torch.tensor(\n [domain.inputs.get_keys(ContinuousInput).index(key) for key in c.features],\n dtype=torch.int64,\n )\n constraints.append(\n product_constraint(indices, torch.tensor(c.exponents), c.rhs, c.sign)\n )\n return constraints\n
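A sketch of the translation (hypothetical keys; reading the source above, each callable evaluates rhs - sign * prod(x_i ** e_i), which BoTorch treats as feasible when >= 0):

import torch
from bofire.data_models.constraints.api import ProductInequalityConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput
from bofire.utils.torch_tools import get_product_constraints

domain = Domain.from_lists(
    inputs=[
        ContinuousInput(key="pressure", bounds=(1, 10)),
        ContinuousInput(key="volume", bounds=(1, 10)),
    ],
    constraints=[
        ProductInequalityConstraint(
            features=["pressure", "volume"], exponents=[1.0, 1.0], rhs=10.0, sign=1
        )
    ],
)
constraint = get_product_constraints(domain)[0]
print(constraint(torch.tensor([2.0, 3.0])))  # 10 - 1 * (2 * 3) = 4.0 -> feasible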
"},{"location":"ref-domain/","title":"Domain","text":""},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain","title":" Domain (BaseModel)
","text":"Source code in bofire/data_models/domain/domain.py
class Domain(BaseModel):\n type: Literal[\"Domain\"] = \"Domain\"\n\n inputs: Inputs = Field(default_factory=lambda: Inputs())\n outputs: Outputs = Field(default_factory=lambda: Outputs())\n constraints: Constraints = Field(default_factory=lambda: Constraints())\n\n \"\"\"Representation of the optimization problem/domain\n\n Attributes:\n inputs (List[Input], optional): List of input features. Defaults to [].\n outputs (List[Output], optional): List of output features. Defaults to [].\n constraints (List[Constraint], optional): List of constraints. Defaults to [].\n \"\"\"\n\n @classmethod\n def from_lists(\n cls,\n inputs: Optional[Sequence[AnyInput]] = None,\n outputs: Optional[Sequence[AnyOutput]] = None,\n constraints: Optional[Sequence[AnyConstraint]] = None,\n ):\n inputs = [] if inputs is None else inputs\n outputs = [] if outputs is None else outputs\n constraints = [] if constraints is None else constraints\n return cls(\n inputs=Inputs(features=inputs),\n outputs=Outputs(features=outputs),\n constraints=Constraints(constraints=constraints),\n )\n\n @field_validator(\"inputs\", mode=\"before\")\n @classmethod\n def validate_inputs_list(cls, v):\n if isinstance(v, collections.abc.Sequence):\n v = Inputs(features=v)\n return v\n if isinstance_or_union(v, AnyInput):\n return Inputs(features=[v])\n else:\n return v\n\n @field_validator(\"outputs\", mode=\"before\")\n @classmethod\n def validate_outputs_list(cls, v):\n if isinstance(v, collections.abc.Sequence):\n return Outputs(features=v)\n if isinstance_or_union(v, AnyOutput):\n return Outputs(features=[v])\n else:\n return v\n\n @field_validator(\"constraints\", mode=\"before\")\n @classmethod\n def validate_constraints_list(cls, v):\n if isinstance(v, list):\n return Constraints(constraints=v)\n if isinstance_or_union(v, AnyConstraint):\n return Constraints(constraints=[v])\n else:\n return v\n\n @model_validator(mode=\"after\")\n def validate_unique_feature_keys(self):\n \"\"\"Validates if provided input and output feature keys are unique\n\n Args:\n v (Outputs): List of all output features of the domain.\n value (Dict[str, Inputs]): Dict containing a list of input features as single entry.\n\n Raises:\n ValueError: Feature keys are not unique.\n\n Returns:\n Outputs: Keeps output features as given.\n \"\"\"\n\n keys = self.outputs.get_keys() + self.inputs.get_keys()\n if len(set(keys)) != len(keys):\n raise ValueError(\"Feature keys are not unique\")\n return self\n\n @model_validator(mode=\"after\")\n def validate_constraints(self):\n \"\"\"Validate if all features included in the constraints are also defined as features for the domain.\n\n Args:\n v (List[Constraint]): List of constraints or empty if no constraints are defined\n values (List[Input]): List of input features of the domain\n\n Raises:\n ValueError: Feature key in constraint is unknown.\n\n Returns:\n List[Constraint]: List of constraints defined for the domain\n \"\"\"\n\n keys = self.inputs.get_keys()\n for c in self.constraints.get(\n [LinearConstraint, NChooseKConstraint, ProductConstraint]\n ):\n for f in c.features: # type: ignore\n if f not in keys:\n raise ValueError(f\"feature {f} in constraint unknown ({keys})\")\n return self\n\n @model_validator(mode=\"after\")\n def validate_linear_constraints_and_nchoosek(self):\n \"\"\"Validate if all features included in linear constraints are continuous ones.\n\n Args:\n v (List[Constraint]): List of constraints or empty if no constraints are defined\n values (List[Input]): List of input features of the domain\n\n 
Raises:\n ValueError: when a feature used in a linear or NChooseK constraint is not a continuous input\n\n Returns:\n List[Constraint]: List of constraints defined for the domain\n \"\"\"\n keys = self.inputs.get_keys(ContinuousInput)\n\n # check if non continuous input features appear in linear constraints\n for c in self.constraints.get(includes=[LinearConstraint, NChooseKConstraint]):\n for f in c.features: # type: ignore\n assert f in keys, f\"{f} must be continuous.\"\n return self\n\n # TODO: tidy this up\n def get_nchoosek_combinations(self, exhaustive: bool = False): # noqa: C901\n \"\"\"get all possible NChooseK combinations\n\n Args:\n exhaustive (bool, optional): if True all combinations are returned. Defaults to False.\n\n Returns:\n Tuple(used_features_list, unused_features_list): used_features_list is a list of lists containing features used in each NChooseK combination.\n unused_features_list is a list of lists containing features unused in each NChooseK combination.\n \"\"\"\n\n if len(self.constraints.get(NChooseKConstraint)) == 0:\n used_continuous_features = self.inputs.get_keys(ContinuousInput)\n return used_continuous_features, []\n\n used_features_list_all = []\n\n # loops through each NChooseK constraint\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n used_features_list = []\n\n if exhaustive:\n for n in range(con.min_count, con.max_count + 1):\n used_features_list.extend(itertools.combinations(con.features, n))\n\n if con.none_also_valid:\n used_features_list.append(())\n else:\n used_features_list.extend(\n itertools.combinations(con.features, con.max_count)\n )\n\n used_features_list_all.append(used_features_list)\n\n used_features_list_all = list(\n itertools.product(*used_features_list_all)\n ) # product between NChooseK constraints\n\n # format into a list of used features\n used_features_list_formatted = []\n for used_features_list in used_features_list_all:\n used_features_list_flattened = [\n item for sublist in used_features_list for item in sublist\n ]\n used_features_list_formatted.append(list(set(used_features_list_flattened)))\n\n # sort lists\n used_features_list_sorted = []\n for used_features in used_features_list_formatted:\n used_features_list_sorted.append(sorted(used_features))\n\n # drop duplicates\n used_features_list_no_dup = []\n for used_features in used_features_list_sorted:\n if used_features not in used_features_list_no_dup:\n used_features_list_no_dup.append(used_features)\n\n # print(f\"duplicates dropped: {len(used_features_list_sorted)-len(used_features_list_no_dup)}\")\n\n # remove combinations not fulfilling constraints\n used_features_list_final = []\n for combo in used_features_list_no_dup:\n fulfil_constraints = (\n []\n ) # list of bools tracking if constraints are fulfilled\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n count = 0 # count of features in combo that are in con.features\n for f in combo:\n if f in con.features:\n count += 1\n if count >= con.min_count and count <= con.max_count:\n fulfil_constraints.append(True)\n elif count == 0 and con.none_also_valid:\n fulfil_constraints.append(True)\n else:\n fulfil_constraints.append(False)\n if np.all(fulfil_constraints):\n used_features_list_final.append(combo)\n\n # print(f\"violators dropped: {len(used_features_list_no_dup)-len(used_features_list_final)}\")\n\n # features unused\n features_in_cc = []\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n 
features_in_cc.extend(con.features)\n features_in_cc = list(set(features_in_cc))\n features_in_cc.sort()\n unused_features_list = []\n for used_features in used_features_list_final:\n unused_features_list.append(\n [f_key for f_key in features_in_cc if f_key not in used_features]\n )\n\n # postprocess\n # used_features_list_final2 = []\n # unused_features_list2 = []\n # for used, unused in zip(used_features_list_final,unused_features_list):\n # if len(used) == 3:\n # used_features_list_final2.append(used), unused_features_list2.append(unused)\n\n return used_features_list_final, unused_features_list\n\n def coerce_invalids(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Coerces all invalid output measurements to np.nan\n\n Args:\n experiments (pd.DataFrame): Dataframe containing experimental data\n\n Returns:\n pd.DataFrame: coerced dataframe\n \"\"\"\n # coerce invalid to nan\n for feat in self.outputs.get_keys(Output):\n experiments.loc[experiments[f\"valid_{feat}\"] == 0, feat] = np.nan\n return experiments\n\n def aggregate_by_duplicates(\n self,\n experiments: pd.DataFrame,\n prec: int,\n delimiter: str = \"-\",\n method: Literal[\"mean\", \"median\"] = \"mean\",\n ) -> Tuple[pd.DataFrame, list]:\n \"\"\"Aggregate the dataframe by duplicate experiments\n\n Duplicates are identified based on the experiments with the same input features. Continuous input features\n are rounded before identifying the duplicates. Aggregation is performed by taking the average of the\n involved output features.\n\n Args:\n experiments (pd.DataFrame): Dataframe containing experimental data\n prec (int): Precision of the rounding of the continuous input features\n delimiter (str, optional): Delimiter used when combining the orig. labcodes to a new one. Defaults to \"-\".\n\n Returns:\n Tuple[pd.DataFrame, list]: Dataframe holding the aggregated experiments, list of lists holding the labcodes of the duplicates\n \"\"\"\n # prepare the parent frame\n if method not in [\"mean\", \"median\"]:\n raise ValueError(f\"Unknown aggregation type provided: {method}\")\n\n preprocessed = self.outputs.preprocess_experiments_any_valid_output(experiments)\n assert preprocessed is not None\n experiments = preprocessed.copy()\n if \"labcode\" not in experiments.columns:\n experiments[\"labcode\"] = [\n str(i + 1).zfill(int(np.ceil(np.log10(experiments.shape[0]))))\n for i in range(experiments.shape[0])\n ]\n\n # round it if continuous inputs are present\n if len(self.inputs.get(ContinuousInput)) > 0:\n experiments[self.inputs.get_keys(ContinuousInput)] = experiments[\n self.inputs.get_keys(ContinuousInput)\n ].round(prec)\n\n # coerce invalid to nan\n experiments = self.coerce_invalids(experiments)\n\n # group and aggregate\n agg: Dict[str, Any] = {\n feat: method for feat in self.outputs.get_keys(ContinuousOutput)\n }\n agg[\"labcode\"] = lambda x: delimiter.join(sorted(x.tolist()))\n for feat in self.outputs.get_keys(Output):\n agg[f\"valid_{feat}\"] = lambda x: 1\n\n grouped = experiments.groupby(self.inputs.get_keys(Input))\n duplicated_labcodes = [\n sorted(group.labcode.to_numpy().tolist())\n for _, group in grouped\n if group.shape[0] > 1\n ]\n\n experiments = grouped.aggregate(agg).reset_index(drop=False)\n for feat in self.outputs.get_keys(Output):\n experiments.loc[experiments[feat].isna(), f\"valid_{feat}\"] = 0\n\n experiments = experiments.sort_values(by=\"labcode\")\n experiments = experiments.reset_index(drop=True)\n return experiments, sorted(duplicated_labcodes)\n\n def validate_experiments(\n 
self,\n experiments: pd.DataFrame,\n strict: bool = False,\n ) -> pd.DataFrame:\n \"\"\"Checks the experimental data for validity\n\n Args:\n experiments (pd.DataFrame): Dataframe with experimental data\n\n Raises:\n ValueError: empty dataframe\n ValueError: the column for a specific feature is missing in the provided data\n ValueError: there are labcodes with null value\n ValueError: there are labcodes with nan value\n ValueError: labcodes are not unique\n ValueError: the provided columns do not match the defined domain\n ValueError: the provided columns do not match the defined domain\n ValueError: Input with null values\n ValueError: Input with nan values\n\n Returns:\n pd.DataFrame: The provided dataframe with experimental data\n \"\"\"\n\n if len(experiments) == 0:\n raise ValueError(\"no experiments provided (empty dataframe)\")\n # we allow here for a column named labcode used to identify experiments\n if \"labcode\" in experiments.columns:\n # test that labcodes are not na\n if experiments.labcode.isnull().to_numpy().any():\n raise ValueError(\"there are labcodes with null value\")\n if experiments.labcode.isna().to_numpy().any():\n raise ValueError(\"there are labcodes with nan value\")\n # test that labcodes are distinct\n if (\n len(set(experiments.labcode.to_numpy().tolist()))\n != experiments.shape[0]\n ):\n raise ValueError(\"labcodes are not unique\")\n # run the individual validators\n experiments = self.inputs.validate_experiments(\n experiments=experiments, strict=strict\n )\n experiments = self.outputs.validate_experiments(experiments=experiments)\n return experiments\n\n def describe_experiments(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Method to get a tabular overview of how many measurements and how many valid entries are included in the input data for each output feature\n\n Args:\n experiments (pd.DataFrame): Dataframe with experimental data\n\n Returns:\n pd.DataFrame: Dataframe with counts of how many measurements and how many valid entries are included in the input data for each output feature\n \"\"\"\n data = {}\n for feat in self.outputs.get_keys(Output):\n data[feat] = [\n experiments.loc[experiments[feat].notna()].shape[0],\n experiments.loc[experiments[feat].notna(), \"valid_%s\" % feat].sum(),\n ]\n preprocessed = self.outputs.preprocess_experiments_all_valid_outputs(\n experiments\n )\n assert preprocessed is not None\n data[\"all\"] = [\n experiments.shape[0],\n preprocessed.shape[0],\n ]\n return pd.DataFrame.from_dict(\n data, orient=\"index\", columns=[\"measured\", \"valid\"]\n )\n\n def validate_candidates(\n self,\n candidates: pd.DataFrame,\n only_inputs: bool = False,\n tol: float = 1e-5,\n raise_validation_error: bool = True,\n ) -> pd.DataFrame:\n \"\"\"Method to check the validity of proposed candidates\n\n Args:\n candidates (pd.DataFrame): Dataframe with suggested new experiments (candidates)\n only_inputs (bool,optional): If True, only the input columns are validated. Defaults to False.\n tol (float,optional): tolerance parameter for constraints. A constraint is considered as not fulfilled if the violation\n is larger than tol. Defaults to 1e-5.\n raise_validation_error (bool, optional): If true an error will be raised if candidates violate constraints,\n otherwise only a warning will be displayed. 
Defaults to True.\n\n Raises:\n ValueError: when a column is missing for a defined input feature\n ValueError: when a column is missing for a defined output feature\n ValueError: when a non-numerical value is proposed\n ValueError: when an additional column is found\n ConstraintNotFulfilledError: when the constraints are not fulfilled and `raise_validation_error = True`\n\n Returns:\n pd.DataFrame: dataframe with suggested experiments (candidates)\n \"\"\"\n # check that each input feature has a col and is valid in itself\n assert isinstance(self.inputs, Inputs)\n candidates = self.inputs.validate_candidates(candidates)\n # check if all constraints are fulfilled\n if not self.constraints.is_fulfilled(candidates, tol=tol).all():\n if raise_validation_error:\n raise ConstraintNotFulfilledError(\n f\"Constraints not fulfilled: {candidates}\"\n )\n warnings.warn(\"Not all constraints are fulfilled.\")\n # for each continuous output feature with an attached objective object\n if not only_inputs:\n assert isinstance(self.outputs, Outputs)\n candidates = self.outputs.validate_candidates(candidates=candidates)\n return candidates\n\n @property\n def experiment_column_names(self):\n \"\"\"the columns in the experimental dataframe\n\n Returns:\n List[str]: List of columns in the experiment dataframe (output feature keys + valid_output feature keys)\n \"\"\"\n return (self.inputs + self.outputs).get_keys() + [\n f\"valid_{output_feature_key}\"\n for output_feature_key in self.outputs.get_keys(Output)\n ]\n\n @property\n def candidate_column_names(self):\n \"\"\"the columns in the candidate dataframe\n\n Returns:\n List[str]: List of columns in the candidate dataframe (input feature keys + input feature keys_pred, input feature keys_sd, input feature keys_des)\n \"\"\"\n assert isinstance(self.outputs, Outputs)\n return (\n self.inputs.get_keys(Input)\n + [\n f\"{output_feature_key}_pred\"\n for output_feature_key in self.outputs.get_keys_by_objective(Objective)\n ]\n + [\n f\"{output_feature_key}_sd\"\n for output_feature_key in self.outputs.get_keys_by_objective(Objective)\n ]\n + [\n f\"{output_feature_key}_des\"\n for output_feature_key in self.outputs.get_keys_by_objective(Objective)\n ]\n )\n
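A minimal construction sketch (hypothetical keys), showing that the field validators above coerce plain lists into the Inputs/Outputs/Constraints containers:

from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput

domain = Domain(
    inputs=[ContinuousInput(key=f"x{i}", bounds=(0, 1)) for i in range(2)],
    outputs=[ContinuousOutput(key="y")],
)
# the model validators then enforce unique feature keys and known constraint features
print(domain.experiment_column_names)  # ['x0', 'x1', 'y', 'valid_y']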
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.candidate_column_names","title":"candidate_column_names
property
readonly
","text":"the columns in the candidate dataframe
Returns:
Type DescriptionList[str]
List of columns in the candidate dataframe (input feature keys + input feature keys_pred, input feature keys_sd, input feature keys_des)
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.experiment_column_names","title":"experiment_column_names
property
readonly
","text":"the columns in the experimental dataframe
Returns:
Type DescriptionList[str]
List of columns in the experiment dataframe (output feature keys + valid_output feature keys)
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.aggregate_by_duplicates","title":"aggregate_by_duplicates(self, experiments, prec, delimiter='-', method='mean')
","text":"Aggregate the dataframe by duplicate experiments
Duplicates are identified based on the experiments with the same input features. Continuous input features are rounded before identifying the duplicates. Aggregation is performed by taking the average of the involved output features.
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
Dataframe containing experimental data
requiredprec
int
Precision of the rounding of the continuous input features
requireddelimiter
str
Delimiter used when combining the orig. labcodes to a new one. Defaults to \"-\".
'-'
Returns:
Type DescriptionTuple[pd.DataFrame, list]
Dataframe holding the aggregated experiments, list of lists holding the labcodes of the duplicates
Source code inbofire/data_models/domain/domain.py
def aggregate_by_duplicates(\n self,\n experiments: pd.DataFrame,\n prec: int,\n delimiter: str = \"-\",\n method: Literal[\"mean\", \"median\"] = \"mean\",\n) -> Tuple[pd.DataFrame, list]:\n \"\"\"Aggregate the dataframe by duplicate experiments\n\n Duplicates are identified based on the experiments with the same input features. Continuous input features\n are rounded before identifying the duplicates. Aggregation is performed by taking the average of the\n involved output features.\n\n Args:\n experiments (pd.DataFrame): Dataframe containing experimental data\n prec (int): Precision of the rounding of the continuous input features\n delimiter (str, optional): Delimiter used when combining the orig. labcodes to a new one. Defaults to \"-\".\n\n Returns:\n Tuple[pd.DataFrame, list]: Dataframe holding the aggregated experiments, list of lists holding the labcodes of the duplicates\n \"\"\"\n # prepare the parent frame\n if method not in [\"mean\", \"median\"]:\n raise ValueError(f\"Unknown aggregation type provided: {method}\")\n\n preprocessed = self.outputs.preprocess_experiments_any_valid_output(experiments)\n assert preprocessed is not None\n experiments = preprocessed.copy()\n if \"labcode\" not in experiments.columns:\n experiments[\"labcode\"] = [\n str(i + 1).zfill(int(np.ceil(np.log10(experiments.shape[0]))))\n for i in range(experiments.shape[0])\n ]\n\n # round it if continuous inputs are present\n if len(self.inputs.get(ContinuousInput)) > 0:\n experiments[self.inputs.get_keys(ContinuousInput)] = experiments[\n self.inputs.get_keys(ContinuousInput)\n ].round(prec)\n\n # coerce invalid to nan\n experiments = self.coerce_invalids(experiments)\n\n # group and aggregate\n agg: Dict[str, Any] = {\n feat: method for feat in self.outputs.get_keys(ContinuousOutput)\n }\n agg[\"labcode\"] = lambda x: delimiter.join(sorted(x.tolist()))\n for feat in self.outputs.get_keys(Output):\n agg[f\"valid_{feat}\"] = lambda x: 1\n\n grouped = experiments.groupby(self.inputs.get_keys(Input))\n duplicated_labcodes = [\n sorted(group.labcode.to_numpy().tolist())\n for _, group in grouped\n if group.shape[0] > 1\n ]\n\n experiments = grouped.aggregate(agg).reset_index(drop=False)\n for feat in self.outputs.get_keys(Output):\n experiments.loc[experiments[feat].isna(), f\"valid_{feat}\"] = 0\n\n experiments = experiments.sort_values(by=\"labcode\")\n experiments = experiments.reset_index(drop=True)\n return experiments, sorted(duplicated_labcodes)\n
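A usage sketch for the aggregation (hypothetical single-input domain; note the valid_y column required by the experiment format):

import pandas as pd
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput

domain = Domain.from_lists(
    inputs=[ContinuousInput(key="x0", bounds=(0, 1))],
    outputs=[ContinuousOutput(key="y")],
)
experiments = pd.DataFrame(
    {"x0": [0.101, 0.099, 0.5], "y": [1.0, 2.0, 3.0], "valid_y": [1, 1, 1]}
)
aggregated, duplicates = domain.aggregate_by_duplicates(experiments, prec=1)
# the first two rows collide at x0 == 0.1 after rounding, so their outputs are
# averaged and the generated labcodes of the merged rows are reported,
# e.g. duplicates == [['1', '2']]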
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.coerce_invalids","title":"coerce_invalids(self, experiments)
","text":"Coerces all invalid output measurements to np.nan
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
Dataframe containing experimental data
requiredReturns:
Type Descriptionpd.DataFrame
coerced dataframe
Source code inbofire/data_models/domain/domain.py
def coerce_invalids(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Coerces all invalid output measurements to np.nan\n\n Args:\n experiments (pd.DataFrame): Dataframe containing experimental data\n\n Returns:\n pd.DataFrame: coerced dataframe\n \"\"\"\n # coerce invalid to nan\n for feat in self.outputs.get_keys(Output):\n experiments.loc[experiments[f\"valid_{feat}\"] == 0, feat] = np.nan\n return experiments\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.describe_experiments","title":"describe_experiments(self, experiments)
","text":"Method to get a tabular overview of how many measurements and how many valid entries are included in the input data for each output feature
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
Dataframe with experimental data
requiredReturns:
Type Descriptionpd.DataFrame
Dataframe with counts of how many measurements and how many valid entries are included in the input data for each output feature
Source code inbofire/data_models/domain/domain.py
def describe_experiments(self, experiments: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Method to get a tabular overview of how many measurements and how many valid entries are included in the input data for each output feature\n\n Args:\n experiments (pd.DataFrame): Dataframe with experimental data\n\n Returns:\n pd.DataFrame: Dataframe with counts of how many measurements and how many valid entries are included in the input data for each output feature\n \"\"\"\n data = {}\n for feat in self.outputs.get_keys(Output):\n data[feat] = [\n experiments.loc[experiments[feat].notna()].shape[0],\n experiments.loc[experiments[feat].notna(), \"valid_%s\" % feat].sum(),\n ]\n preprocessed = self.outputs.preprocess_experiments_all_valid_outputs(\n experiments\n )\n assert preprocessed is not None\n data[\"all\"] = [\n experiments.shape[0],\n preprocessed.shape[0],\n ]\n return pd.DataFrame.from_dict(\n data, orient=\"index\", columns=[\"measured\", \"valid\"]\n )\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.get_nchoosek_combinations","title":"get_nchoosek_combinations(self, exhaustive=False)
","text":"get all possible NChooseK combinations
Parameters:
Name Type Description Defaultexhaustive
bool
if True all combinations are returned. Defaults to False.
False
Returns:
Type DescriptionTuple(used_features_list, unused_features_list)
used_features_list is a list of lists containing features used in each NChooseK combination. unused_features_list is a list of lists containing features unused in each NChooseK combination.
Source code inbofire/data_models/domain/domain.py
def get_nchoosek_combinations(self, exhaustive: bool = False): # noqa: C901\n \"\"\"get all possible NChooseK combinations\n\n Args:\n exhaustive (bool, optional): if True all combinations are returned. Defaults to False.\n\n Returns:\n Tuple(used_features_list, unused_features_list): used_features_list is a list of lists containing features used in each NChooseK combination.\n unused_features_list is a list of lists containing features unused in each NChooseK combination.\n \"\"\"\n\n if len(self.constraints.get(NChooseKConstraint)) == 0:\n used_continuous_features = self.inputs.get_keys(ContinuousInput)\n return used_continuous_features, []\n\n used_features_list_all = []\n\n # loops through each NChooseK constraint\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n used_features_list = []\n\n if exhaustive:\n for n in range(con.min_count, con.max_count + 1):\n used_features_list.extend(itertools.combinations(con.features, n))\n\n if con.none_also_valid:\n used_features_list.append(())\n else:\n used_features_list.extend(\n itertools.combinations(con.features, con.max_count)\n )\n\n used_features_list_all.append(used_features_list)\n\n used_features_list_all = list(\n itertools.product(*used_features_list_all)\n ) # product between NChooseK constraints\n\n # format into a list of used features\n used_features_list_formatted = []\n for used_features_list in used_features_list_all:\n used_features_list_flattened = [\n item for sublist in used_features_list for item in sublist\n ]\n used_features_list_formatted.append(list(set(used_features_list_flattened)))\n\n # sort lists\n used_features_list_sorted = []\n for used_features in used_features_list_formatted:\n used_features_list_sorted.append(sorted(used_features))\n\n # drop duplicates\n used_features_list_no_dup = []\n for used_features in used_features_list_sorted:\n if used_features not in used_features_list_no_dup:\n used_features_list_no_dup.append(used_features)\n\n # print(f\"duplicates dropped: {len(used_features_list_sorted)-len(used_features_list_no_dup)}\")\n\n # remove combinations not fulfilling constraints\n used_features_list_final = []\n for combo in used_features_list_no_dup:\n fulfil_constraints = (\n []\n ) # list of bools tracking if constraints are fulfilled\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n count = 0 # count of features in combo that are in con.features\n for f in combo:\n if f in con.features:\n count += 1\n if count >= con.min_count and count <= con.max_count:\n fulfil_constraints.append(True)\n elif count == 0 and con.none_also_valid:\n fulfil_constraints.append(True)\n else:\n fulfil_constraints.append(False)\n if np.all(fulfil_constraints):\n used_features_list_final.append(combo)\n\n # print(f\"violators dropped: {len(used_features_list_no_dup)-len(used_features_list_final)}\")\n\n # features unused\n features_in_cc = []\n for con in self.constraints.get(NChooseKConstraint):\n assert isinstance(con, NChooseKConstraint)\n features_in_cc.extend(con.features)\n features_in_cc = list(set(features_in_cc))\n features_in_cc.sort()\n unused_features_list = []\n for used_features in used_features_list_final:\n unused_features_list.append(\n [f_key for f_key in features_in_cc if f_key not in used_features]\n )\n\n # postprocess\n # used_features_list_final2 = []\n # unused_features_list2 = []\n # for used, unused in zip(used_features_list_final,unused_features_list):\n # if len(used) == 3:\n # 
used_features_list_final2.append(used), unused_features_list2.append(unused)\n\n return used_features_list_final, unused_features_list\n
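A small usage sketch (hypothetical keys):

from bofire.data_models.constraints.api import NChooseKConstraint
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput

domain = Domain.from_lists(
    inputs=[ContinuousInput(key=f"x{i}", bounds=(0, 1)) for i in range(3)],
    constraints=[
        NChooseKConstraint(
            features=["x0", "x1", "x2"],
            min_count=1,
            max_count=2,
            none_also_valid=False,
        )
    ],
)
used, unused = domain.get_nchoosek_combinations(exhaustive=True)
# used: all six subsets of size 1 or 2; unused: the complementary feature
# lists, e.g. used entry ['x0'] pairs with unused entry ['x1', 'x2']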
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.validate_candidates","title":"validate_candidates(self, candidates, only_inputs=False, tol=1e-05, raise_validation_error=True)
","text":"Method to check the validty of porposed candidates
Parameters:
Name Type Description Defaultcandidates
pd.DataFrame
Dataframe with suggested new experiments (candidates)
requiredonly_inputs
bool,optional
If True, only the input columns are validated. Defaults to False.
False
tol
float,optional
tolerance parameter for constraints. A constraint is considered as not fulfilled if the violation is larger than tol. Defaults to 1e-5.
1e-05
raise_validation_error
bool
If true an error will be raised if candidates violate constraints, otherwise only a warning will be displayed. Defaults to True.
True
Exceptions:
Type DescriptionValueError
when a column is missing for a defined input feature
ValueError
when a column is missing for a defined output feature
ValueError
when a non-numerical value is proposed
ValueError
when an additional column is found
ConstraintNotFulfilledError
when the constraints are not fulfilled and raise_validation_error = True
Returns:
Type Descriptionpd.DataFrame
dataframe with suggested experiments (candidates)
Source code inbofire/data_models/domain/domain.py
def validate_candidates(\n self,\n candidates: pd.DataFrame,\n only_inputs: bool = False,\n tol: float = 1e-5,\n raise_validation_error: bool = True,\n) -> pd.DataFrame:\n \"\"\"Method to check the validity of proposed candidates\n\n Args:\n candidates (pd.DataFrame): Dataframe with suggested new experiments (candidates)\n only_inputs (bool,optional): If True, only the input columns are validated. Defaults to False.\n tol (float,optional): tolerance parameter for constraints. A constraint is considered as not fulfilled if the violation\n is larger than tol. Defaults to 1e-5.\n raise_validation_error (bool, optional): If true an error will be raised if candidates violate constraints,\n otherwise only a warning will be displayed. Defaults to True.\n\n Raises:\n ValueError: when a column is missing for a defined input feature\n ValueError: when a column is missing for a defined output feature\n ValueError: when a non-numerical value is proposed\n ValueError: when an additional column is found\n ConstraintNotFulfilledError: when the constraints are not fulfilled and `raise_validation_error = True`\n\n Returns:\n pd.DataFrame: dataframe with suggested experiments (candidates)\n \"\"\"\n # check that each input feature has a col and is valid in itself\n assert isinstance(self.inputs, Inputs)\n candidates = self.inputs.validate_candidates(candidates)\n # check if all constraints are fulfilled\n if not self.constraints.is_fulfilled(candidates, tol=tol).all():\n if raise_validation_error:\n raise ConstraintNotFulfilledError(\n f\"Constraints not fulfilled: {candidates}\"\n )\n warnings.warn(\"Not all constraints are fulfilled.\")\n # for each continuous output feature with an attached objective object\n if not only_inputs:\n assert isinstance(self.outputs, Outputs)\n candidates = self.outputs.validate_candidates(candidates=candidates)\n return candidates\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.validate_constraints","title":"validate_constraints(self)
","text":"Validate if all features included in the constraints are also defined as features for the domain.
Parameters:
Name Type Description Defaultv
List[Constraint]
List of constraints or empty if no constraints are defined
requiredvalues
List[Input]
List of input features of the domain
requiredExceptions:
Type DescriptionValueError
Feature key in constraint is unknown.
Returns:
Type DescriptionList[Constraint]
List of constraints defined for the domain
Source code inbofire/data_models/domain/domain.py
@model_validator(mode=\"after\")\ndef validate_constraints(self):\n \"\"\"Validate if all features included in the constraints are also defined as features for the domain.\n\n Args:\n v (List[Constraint]): List of constraints or empty if no constraints are defined\n values (List[Input]): List of input features of the domain\n\n Raises:\n ValueError: Feature key in constraint is unknown.\n\n Returns:\n List[Constraint]: List of constraints defined for the domain\n \"\"\"\n\n keys = self.inputs.get_keys()\n for c in self.constraints.get(\n [LinearConstraint, NChooseKConstraint, ProductConstraint]\n ):\n for f in c.features: # type: ignore\n if f not in keys:\n raise ValueError(f\"feature {f} in constraint unknown ({keys})\")\n return self\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.validate_experiments","title":"validate_experiments(self, experiments, strict=False)
","text":"checks the experimental data on validity
Parameters:
Name Type Description Defaultexperiments
pd.DataFrame
Dataframe with experimental data
requiredExceptions:
Type DescriptionValueError
empty dataframe
ValueError
the column for a specific feature is missing in the provided data
ValueError
there are labcodes with null value
ValueError
there are labcodes with nan value
ValueError
labcodes are not unique
ValueError
the provided columns do not match the defined domain
ValueError
the provided columns do not match the defined domain
ValueError
Input with null values
ValueError
Input with nan values
Returns:
Type Descriptionpd.DataFrame
The provided dataframe with experimental data
Source code inbofire/data_models/domain/domain.py
def validate_experiments(\n self,\n experiments: pd.DataFrame,\n strict: bool = False,\n) -> pd.DataFrame:\n \"\"\"Checks the experimental data for validity\n\n Args:\n experiments (pd.DataFrame): Dataframe with experimental data\n\n Raises:\n ValueError: empty dataframe\n ValueError: the column for a specific feature is missing in the provided data\n ValueError: there are labcodes with null value\n ValueError: there are labcodes with nan value\n ValueError: labcodes are not unique\n ValueError: the provided columns do not match the defined domain\n ValueError: the provided columns do not match the defined domain\n ValueError: Input with null values\n ValueError: Input with nan values\n\n Returns:\n pd.DataFrame: The provided dataframe with experimental data\n \"\"\"\n\n if len(experiments) == 0:\n raise ValueError(\"no experiments provided (empty dataframe)\")\n # we allow here for a column named labcode used to identify experiments\n if \"labcode\" in experiments.columns:\n # test that labcodes are not na\n if experiments.labcode.isnull().to_numpy().any():\n raise ValueError(\"there are labcodes with null value\")\n if experiments.labcode.isna().to_numpy().any():\n raise ValueError(\"there are labcodes with nan value\")\n # test that labcodes are distinct\n if (\n len(set(experiments.labcode.to_numpy().tolist()))\n != experiments.shape[0]\n ):\n raise ValueError(\"labcodes are not unique\")\n # run the individual validators\n experiments = self.inputs.validate_experiments(\n experiments=experiments, strict=strict\n )\n experiments = self.outputs.validate_experiments(experiments=experiments)\n return experiments\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.validate_linear_constraints_and_nchoosek","title":"validate_linear_constraints_and_nchoosek(self)
","text":"Validate if all features included in linear constraints are continuous ones.
Parameters:
Name Type Description Defaultv
List[Constraint]
List of constraints or empty if no constraints are defined
requiredvalues
List[Input]
List of input features of the domain
requiredExceptions:
Type DescriptionValueError
description
Returns:
Type DescriptionList[Constraint]
List of constraints defined for the domain
Source code inbofire/data_models/domain/domain.py
@model_validator(mode=\"after\")\ndef validate_linear_constraints_and_nchoosek(self):\n \"\"\"Validate if all features included in linear constraints are continuous ones.\n\n Args:\n v (List[Constraint]): List of constraints or empty if no constraints are defined\n values (List[Input]): List of input features of the domain\n\n Raises:\n ValueError: when a feature used in a linear or NChooseK constraint is not a continuous input\n\n Returns:\n List[Constraint]: List of constraints defined for the domain\n \"\"\"\n keys = self.inputs.get_keys(ContinuousInput)\n\n # check if non continuous input features appear in linear constraints\n for c in self.constraints.get(includes=[LinearConstraint, NChooseKConstraint]):\n for f in c.features: # type: ignore\n assert f in keys, f\"{f} must be continuous.\"\n return self\n
"},{"location":"ref-domain/#bofire.data_models.domain.domain.Domain.validate_unique_feature_keys","title":"validate_unique_feature_keys(self)
","text":"Validates if provided input and output feature keys are unique
Parameters:
Name Type Description Defaultv
Outputs
List of all output features of the domain.
requiredvalue
Dict[str, Inputs]
Dict containing a list of input features as single entry.
requiredExceptions:
Type DescriptionValueError
Feature keys are not unique.
Returns:
Type DescriptionOutputs
Keeps output features as given.
Source code inbofire/data_models/domain/domain.py
@model_validator(mode=\"after\")\ndef validate_unique_feature_keys(self):\n \"\"\"Validates if provided input and output feature keys are unique\n\n Args:\n v (Outputs): List of all output features of the domain.\n value (Dict[str, Inputs]): Dict containing a list of input features as single entry.\n\n Raises:\n ValueError: Feature keys are not unique.\n\n Returns:\n Outputs: Keeps output features as given.\n \"\"\"\n\n keys = self.outputs.get_keys() + self.inputs.get_keys()\n if len(set(keys)) != len(keys):\n raise ValueError(\"Feature keys are not unique\")\n return self\n
"},{"location":"ref-features/","title":"Domain","text":""},{"location":"ref-features/#bofire.data_models.features.categorical","title":"categorical
","text":""},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput","title":" CategoricalInput (Input)
","text":"Base class for all categorical input features.
Attributes:
Name Type Descriptioncategories
List[str]
Names of the categories.
allowed
List[bool]
List of bools indicating if a category is allowed within the optimization.
Source code inbofire/data_models/features/categorical.py
class CategoricalInput(Input):\n \"\"\"Base class for all categorical input features.\n\n Attributes:\n categories (List[str]): Names of the categories.\n allowed (List[bool]): List of bools indicating if a category is allowed within the optimization.\n \"\"\"\n\n type: Literal[\"CategoricalInput\"] = \"CategoricalInput\"\n # order_id: ClassVar[int] = 5\n order_id: ClassVar[int] = 7\n\n categories: CategoryVals\n allowed: Optional[Annotated[List[bool], Field(min_length=2)]] = Field(\n default=None, validate_default=True\n )\n\n @field_validator(\"allowed\")\n @classmethod\n def generate_allowed(cls, allowed, info):\n \"\"\"Generates the list of allowed categories if not provided.\"\"\"\n if allowed is None and \"categories\" in info.data.keys():\n return [True for _ in range(len(info.data[\"categories\"]))]\n return allowed\n\n @model_validator(mode=\"after\")\n def validate_categories_fitting_allowed(self):\n if len(self.allowed) != len(self.categories): # type: ignore\n raise ValueError(\"allowed must have same length as categories\")\n if sum(self.allowed) == 0: # type: ignore\n raise ValueError(\"no category is allowed\")\n return self\n\n @staticmethod\n def valid_transform_types() -> List[CategoricalEncodingEnum]:\n return [\n CategoricalEncodingEnum.ONE_HOT,\n CategoricalEncodingEnum.DUMMY,\n CategoricalEncodingEnum.ORDINAL,\n ]\n\n def is_fixed(self) -> bool:\n \"\"\"Returns True if there is only one allowed category.\n\n Returns:\n [bool]: True if there is only one allowed category\n \"\"\"\n if self.allowed is None:\n return False\n return sum(self.allowed) == 1\n\n def fixed_value(\n self, transform_type: Optional[TTransform] = None\n ) -> Union[List[str], List[float], None]:\n \"\"\"Returns the categories to which the feature is fixed, None if the feature is not fixed\n\n Returns:\n List[str]: List of categories or None\n \"\"\"\n if self.is_fixed():\n val = self.get_allowed_categories()[0]\n if transform_type is None:\n return [val]\n elif transform_type == CategoricalEncodingEnum.ONE_HOT:\n return self.to_onehot_encoding(pd.Series([val])).values[0].tolist()\n elif transform_type == CategoricalEncodingEnum.DUMMY:\n return self.to_dummy_encoding(pd.Series([val])).values[0].tolist()\n elif transform_type == CategoricalEncodingEnum.ORDINAL:\n return self.to_ordinal_encoding(pd.Series([val])).tolist()\n else:\n raise ValueError(\n f\"Unknown transform type {transform_type} for categorical input {self.key}\"\n )\n else:\n return None\n\n def get_allowed_categories(self):\n \"\"\"Returns the allowed categories.\n\n Returns:\n list of str: The allowed categories\n \"\"\"\n if self.allowed is None:\n return []\n return [c for c, a in zip(self.categories, self.allowed) if a]\n\n def validate_experimental(\n self, values: pd.Series, strict: bool = False\n ) -> pd.Series:\n \"\"\"Method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. 
Defaults to False.\n\n Raises:\n ValueError: when an entry is not in the list of allowed categories\n ValueError: when there is no variation in a feature provided by the experimental data\n\n Returns:\n pd.Series: A dataFrame with experiments\n \"\"\"\n values = values.map(str)\n if sum(values.isin(self.categories)) != len(values):\n raise ValueError(\n f\"invalid values for `{self.key}`, allowed are: `{self.categories}`\"\n )\n if strict:\n possible_categories = self.get_possible_categories(values)\n if len(possible_categories) != len(self.categories):\n raise ValueError(\n f\"Categories {list(set(self.categories)-set(possible_categories))} of feature {self.key} not used. Remove them.\"\n )\n return values\n\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the suggested candidates\n\n Args:\n values (pd.Series): A dataFrame with candidates\n\n Raises:\n ValueError: when not all values for a feature are one of the allowed categories\n\n Returns:\n pd.Series: The passed dataFrame with candidates\n \"\"\"\n values = values.map(str)\n if sum(values.isin(self.get_allowed_categories())) != len(values):\n raise ValueError(\n f\"not all values of input feature `{self.key}` are a valid allowed category from {self.get_allowed_categories()}\"\n )\n return values\n\n def get_forbidden_categories(self):\n \"\"\"Returns the non-allowed categories\n\n Returns:\n List[str]: List of the non-allowed categories\n \"\"\"\n return list(set(self.categories) - set(self.get_allowed_categories()))\n\n def get_possible_categories(self, values: pd.Series) -> list:\n \"\"\"Return the superset of categories that have been used in the experimental dataset and\n that can be used in the optimization\n\n Args:\n values (pd.Series): Series with the values for this feature\n\n Returns:\n list: list of possible categories\n \"\"\"\n return sorted(set(list(set(values.tolist())) + self.get_allowed_categories()))\n\n def to_onehot_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to a one-hot encoding.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.DataFrame: One-hot transformed data frame.\n \"\"\"\n return pd.DataFrame(\n {get_encoded_name(self.key, c): values == c for c in self.categories},\n dtype=float,\n index=values.index,\n )\n\n def from_onehot_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Converts values back from one-hot encoding.\n\n Args:\n values (pd.DataFrame): One-hot encoded values.\n\n Raises:\n ValueError: If one-hot columns not present in `values`.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, c) for c in self.categories]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = values[cat_cols].idxmax(1).str[(len(self.key) + 1) :]\n s.name = self.key\n return s\n\n def to_dummy_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to a dummy-hot encoding, dropping the first categorical level.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.DataFrame: Dummy-hot transformed data frame.\n \"\"\"\n return pd.DataFrame(\n {get_encoded_name(self.key, c): values == c for c in self.categories[1:]},\n dtype=float,\n index=values.index,\n )\n\n def 
from_dummy_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Convert points back from dummy encoding.\n\n Args:\n values (pd.DataFrame): Dummy-hot encoded values.\n\n Raises:\n ValueError: If one-hot columns not present in `values`.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, c) for c in self.categories]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols[1:]]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols[1:]}.\"\n )\n values = values.copy()\n values[cat_cols[0]] = 1 - values[cat_cols[1:]].sum(axis=1)\n s = values[cat_cols].idxmax(1).str[(len(self.key) + 1) :]\n s.name = self.key\n return s\n\n def to_ordinal_encoding(self, values: pd.Series) -> pd.Series:\n \"\"\"Converts values to an ordinal integer based encoding.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.Series: Ordinal encoded values.\n \"\"\"\n enc = pd.Series(range(len(self.categories)), index=list(self.categories))\n s = enc[values]\n s.index = values.index\n s.name = self.key\n return s\n\n def from_ordinal_encoding(self, values: pd.Series) -> pd.Series:\n \"\"\"Converts values back from ordinal encoding.\n\n Args:\n values (pd.Series): Ordinal encoded series.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n enc = np.array(self.categories)\n return pd.Series(enc[values], index=values.index, name=self.key)\n\n def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key,\n data=np.random.default_rng(seed=seed).choice(\n self.get_allowed_categories(), n\n ),\n )\n\n def get_bounds(\n self,\n transform_type: TTransform,\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n ) -> Tuple[List[float], List[float]]:\n assert isinstance(transform_type, CategoricalEncodingEnum)\n if transform_type == CategoricalEncodingEnum.ORDINAL:\n return [0], [len(self.categories) - 1]\n if transform_type == CategoricalEncodingEnum.ONE_HOT:\n # in the case that values are None, we return the bounds\n # based on the optimization bounds, else we return the true\n # bounds as this is for model fitting.\n if values is None:\n lower = [0.0 for _ in self.categories]\n upper = [\n 1.0 if self.allowed[i] is True else 0.0 # type: ignore\n for i, _ in enumerate(self.categories)\n ]\n else:\n lower = [0.0 for _ in self.categories]\n upper = [1.0 for _ in self.categories]\n return lower, upper\n if transform_type == CategoricalEncodingEnum.DUMMY:\n lower = [0.0 for _ in range(len(self.categories) - 1)]\n upper = [1.0 for _ in range(len(self.categories) - 1)]\n return lower, upper\n if transform_type == CategoricalEncodingEnum.DESCRIPTOR:\n raise ValueError(\n f\"Invalid descriptor transform for categorical {self.key}.\"\n )\n else:\n raise ValueError(\n f\"Invalid transform_type {transform_type} provided for categorical {self.key}.\"\n )\n\n def __str__(self) -> str:\n \"\"\"Returns the number of categories as str\n\n Returns:\n str: Number of categories\n \"\"\"\n return f\"{len(self.categories)} categories\"\n
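A round-trip sketch for the encodings (hypothetical categories; encoded column names follow get_encoded_name, i.e. the feature key joined to the category with an underscore):

import pandas as pd
from bofire.data_models.features.api import CategoricalInput

solvent = CategoricalInput(key="solvent", categories=["EtOH", "MeOH", "water"])
raw = pd.Series(["water", "EtOH"])
onehot = solvent.to_onehot_encoding(raw)    # columns solvent_EtOH, solvent_MeOH, solvent_water
assert (solvent.from_onehot_encoding(onehot) == raw).all()
ordinal = solvent.to_ordinal_encoding(raw)  # 2, 0
assert (solvent.from_ordinal_encoding(ordinal) == raw).all()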
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.__str__","title":"__str__(self)
special
","text":"Returns the number of categories as str
Returns:
Type Descriptionstr
Number of categories
Source code inbofire/data_models/features/categorical.py
def __str__(self) -> str:\n \"\"\"Returns the number of categories as str\n\n Returns:\n str: Number of categories\n \"\"\"\n return f\"{len(self.categories)} categories\"\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.fixed_value","title":"fixed_value(self, transform_type=None)
","text":"Returns the categories to which the feature is fixed, None if the feature is not fixed
Returns:
Type DescriptionList[str]
List of categories or None
Source code inbofire/data_models/features/categorical.py
def fixed_value(\n self, transform_type: Optional[TTransform] = None\n) -> Union[List[str], List[float], None]:\n \"\"\"Returns the categories to which the feature is fixed, None if the feature is not fixed\n\n Returns:\n List[str]: List of categories or None\n \"\"\"\n if self.is_fixed():\n val = self.get_allowed_categories()[0]\n if transform_type is None:\n return [val]\n elif transform_type == CategoricalEncodingEnum.ONE_HOT:\n return self.to_onehot_encoding(pd.Series([val])).values[0].tolist()\n elif transform_type == CategoricalEncodingEnum.DUMMY:\n return self.to_dummy_encoding(pd.Series([val])).values[0].tolist()\n elif transform_type == CategoricalEncodingEnum.ORDINAL:\n return self.to_ordinal_encoding(pd.Series([val])).tolist()\n else:\n raise ValueError(\n f\"Unknown transform type {transform_type} for categorical input {self.key}\"\n )\n else:\n return None\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.from_dummy_encoding","title":"from_dummy_encoding(self, values)
","text":"Convert points back from dummy encoding.
Parameters:
Name Type Description Defaultvalues
pd.DataFrame
Dummy-hot encoded values.
requiredExceptions:
Type DescriptionValueError
If one-hot columns not present in values
.
Returns:
Type Descriptionpd.Series
Series with categorical values.
Source code inbofire/data_models/features/categorical.py
def from_dummy_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Convert points back from dummy encoding.\n\n Args:\n values (pd.DataFrame): Dummy-hot encoded values.\n\n Raises:\n ValueError: If one-hot columns not present in `values`.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, c) for c in self.categories]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols[1:]]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols[1:]}.\"\n )\n values = values.copy()\n values[cat_cols[0]] = 1 - values[cat_cols[1:]].sum(axis=1)\n s = values[cat_cols].idxmax(1).str[(len(self.key) + 1) :]\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.from_onehot_encoding","title":"from_onehot_encoding(self, values)
","text":"Converts values back from one-hot encoding.
Parameters:
Name Type Description Defaultvalues
pd.DataFrame
One-hot encoded values.
requiredExceptions:
Type DescriptionValueError
If one-hot columns not present in values
.
Returns:
Type Descriptionpd.Series
Series with categorical values.
Source code inbofire/data_models/features/categorical.py
def from_onehot_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Converts values back from one-hot encoding.\n\n Args:\n values (pd.DataFrame): One-hot encoded values.\n\n Raises:\n ValueError: If one-hot columns not present in `values`.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, c) for c in self.categories]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = values[cat_cols].idxmax(1).str[(len(self.key) + 1) :]\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.from_ordinal_encoding","title":"from_ordinal_encoding(self, values)
","text":"Convertes values back from ordinal encoding.
Parameters:
Name Type Description Defaultvalues
pd.Series
Ordinal encoded series.
requiredReturns:
Type Descriptionpd.Series
Series with categorical values.
Source code inbofire/data_models/features/categorical.py
def from_ordinal_encoding(self, values: pd.Series) -> pd.Series:\n \"\"\"Converts values back from ordinal encoding.\n\n Args:\n values (pd.Series): Ordinal encoded series.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n enc = np.array(self.categories)\n return pd.Series(enc[values], index=values.index, name=self.key)\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.generate_allowed","title":"generate_allowed(allowed, info)
classmethod
","text":"Generates the list of allowed categories if not provided.
Source code inbofire/data_models/features/categorical.py
@field_validator(\"allowed\")\n@classmethod\ndef generate_allowed(cls, allowed, info):\n \"\"\"Generates the list of allowed categories if not provided.\"\"\"\n if allowed is None and \"categories\" in info.data.keys():\n return [True for _ in range(len(info.data[\"categories\"]))]\n return allowed\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.get_allowed_categories","title":"get_allowed_categories(self)
","text":"Returns the allowed categories.
Returns:
Type Descriptionlist of str
The allowed categories
Source code inbofire/data_models/features/categorical.py
def get_allowed_categories(self):\n \"\"\"Returns the allowed categories.\n\n Returns:\n list of str: The allowed categories\n \"\"\"\n if self.allowed is None:\n return []\n return [c for c, a in zip(self.categories, self.allowed) if a]\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.get_bounds","title":"get_bounds(self, transform_type, values=None, reference_value=None)
","text":"Returns the bounds of an input feature depending on the requested transform type.
Parameters:
Name Type Description Defaulttransform_type
Optional[TTransform]
The requested transform type. Defaults to None.
requiredvalues
Optional[pd.Series]
If values are provided the bounds are returned taking the most extreme values for the feature into account. Defaults to None.
None
reference_value
Optional[float]
If a reference value is provided, then the local bounds based on a local search region are provided. Currently only supported for continuous inputs. For more details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
List of lower bound values, list of upper bound values.
Source code inbofire/data_models/features/categorical.py
def get_bounds(\n self,\n transform_type: TTransform,\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n) -> Tuple[List[float], List[float]]:\n assert isinstance(transform_type, CategoricalEncodingEnum)\n if transform_type == CategoricalEncodingEnum.ORDINAL:\n return [0], [len(self.categories) - 1]\n if transform_type == CategoricalEncodingEnum.ONE_HOT:\n # in the case that values are None, we return the bounds\n # based on the optimization bounds, else we return the true\n # bounds as this is for model fitting.\n if values is None:\n lower = [0.0 for _ in self.categories]\n upper = [\n 1.0 if self.allowed[i] is True else 0.0 # type: ignore\n for i, _ in enumerate(self.categories)\n ]\n else:\n lower = [0.0 for _ in self.categories]\n upper = [1.0 for _ in self.categories]\n return lower, upper\n if transform_type == CategoricalEncodingEnum.DUMMY:\n lower = [0.0 for _ in range(len(self.categories) - 1)]\n upper = [1.0 for _ in range(len(self.categories) - 1)]\n return lower, upper\n if transform_type == CategoricalEncodingEnum.DESCRIPTOR:\n raise ValueError(\n f\"Invalid descriptor transform for categorical {self.key}.\"\n )\n else:\n raise ValueError(\n f\"Invalid transform_type {transform_type} provided for categorical {self.key}.\"\n )\n
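To illustrate how the bounds depend on the encoding, here is a sketch with a hypothetical feature where one category is not allowed (assuming `CategoricalEncodingEnum` is importable from `bofire.data_models.enum`):
```python
from bofire.data_models.enum import CategoricalEncodingEnum
from bofire.data_models.features.categorical import CategoricalInput

feat = CategoricalInput(
    key="solvent", categories=["water", "EtOH", "MeOH"], allowed=[True, True, False]
)
# ordinal: a single dimension spanning all category indices
print(feat.get_bounds(CategoricalEncodingEnum.ORDINAL))  # ([0], [2])
# one-hot without values: forbidden categories get an upper bound of 0.0
print(feat.get_bounds(CategoricalEncodingEnum.ONE_HOT))  # ([0.0, 0.0, 0.0], [1.0, 1.0, 0.0])
```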
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.get_forbidden_categories","title":"get_forbidden_categories(self)
","text":"Returns the non-allowed categories
Returns:
Type DescriptionList[str]
List of the non-allowed categories
Source code inbofire/data_models/features/categorical.py
def get_forbidden_categories(self):\n \"\"\"Returns the non-allowed categories\n\n Returns:\n List[str]: List of the non-allowed categories\n \"\"\"\n return list(set(self.categories) - set(self.get_allowed_categories()))\n
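A minimal sketch of the allowed/forbidden split (hypothetical feature):
```python
from bofire.data_models.features.categorical import CategoricalInput

feat = CategoricalInput(
    key="base", categories=["NaOH", "KOH", "DBU"], allowed=[True, True, False]
)
print(feat.get_allowed_categories())    # ['NaOH', 'KOH']
print(feat.get_forbidden_categories())  # ['DBU']
print(feat.is_fixed())                  # False, two categories remain allowed
```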
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.get_possible_categories","title":"get_possible_categories(self, values)
","text":"Return the superset of categories that have been used in the experimental dataset and that can be used in the optimization
Parameters:
Name Type Description Defaultvalues
pd.Series
Series with the values for this feature
requiredReturns:
Type Descriptionlist
list of possible categories
Source code inbofire/data_models/features/categorical.py
def get_possible_categories(self, values: pd.Series) -> list:\n \"\"\"Return the superset of categories that have been used in the experimental dataset and\n that can be used in the optimization\n\n Args:\n values (pd.Series): Series with the values for this feature\n\n Returns:\n list: list of possible categories\n \"\"\"\n return sorted(set(list(set(values.tolist())) + self.get_allowed_categories()))\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.is_fixed","title":"is_fixed(self)
","text":"Returns True if there is only one allowed category.
Returns:
Type Description[bool]
True if there is only one allowed category
Source code inbofire/data_models/features/categorical.py
def is_fixed(self) -> bool:\n \"\"\"Returns True if there is only one allowed category.\n\n Returns:\n [bool]: True if there is only one allowed category\n \"\"\"\n if self.allowed is None:\n return False\n return sum(self.allowed) == 1\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.sample","title":"sample(self, n, seed=None)
","text":"Draw random samples from the feature.
Parameters:
Name Type Description Defaultn
int
number of samples.
requiredReturns:
Type Descriptionpd.Series
drawn samples.
Source code inbofire/data_models/features/categorical.py
def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key,\n data=np.random.default_rng(seed=seed).choice(\n self.get_allowed_categories(), n\n ),\n )\n
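Sampling draws only from the allowed categories; a seed makes the draw reproducible (hypothetical feature, output indicative only):
```python
from bofire.data_models.features.categorical import CategoricalInput

feat = CategoricalInput(
    key="ligand", categories=["L1", "L2", "L3"], allowed=[True, True, False]
)
samples = feat.sample(5, seed=42)  # pd.Series with 5 draws from {"L1", "L2"}
print(samples.tolist())
```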
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.to_dummy_encoding","title":"to_dummy_encoding(self, values)
","text":"Converts values to a dummy-hot encoding, dropping the first categorical level.
Parameters:
Name Type Description Defaultvalues
pd.Series
Series to be transformed.
requiredReturns:
Type Descriptionpd.DataFrame
Dummy-hot transformed data frame.
Source code inbofire/data_models/features/categorical.py
def to_dummy_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to a dummy-hot encoding, dropping the first categorical level.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.DataFrame: Dummy-hot transformed data frame.\n \"\"\"\n return pd.DataFrame(\n {get_encoded_name(self.key, c): values == c for c in self.categories[1:]},\n dtype=float,\n index=values.index,\n )\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.to_onehot_encoding","title":"to_onehot_encoding(self, values)
","text":"Converts values to a one-hot encoding.
Parameters:
Name Type Description Defaultvalues
pd.Series
Series to be transformed.
requiredReturns:
Type Descriptionpd.DataFrame
One-hot transformed data frame.
Source code inbofire/data_models/features/categorical.py
def to_onehot_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to a one-hot encoding.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.DataFrame: One-hot transformed data frame.\n \"\"\"\n return pd.DataFrame(\n {get_encoded_name(self.key, c): values == c for c in self.categories},\n dtype=float,\n index=values.index,\n )\n
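And the corresponding one-hot round trip (hypothetical feature):
```python
import pandas as pd

from bofire.data_models.features.categorical import CategoricalInput

feat = CategoricalInput(key="solvent", categories=["water", "EtOH", "MeOH"])

s = pd.Series(["EtOH", "water"])
onehot = feat.to_onehot_encoding(s)  # one float column per category
assert (feat.from_onehot_encoding(onehot) == s).all()
```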
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.to_ordinal_encoding","title":"to_ordinal_encoding(self, values)
","text":"Converts values to an ordinal integer based encoding.
Parameters:
Name Type Description Defaultvalues
pd.Series
Series to be transformed.
requiredReturns:
Type Descriptionpd.Series
Ordinal encoded values.
Source code inbofire/data_models/features/categorical.py
def to_ordinal_encoding(self, values: pd.Series) -> pd.Series:\n \"\"\"Converts values to an ordinal integer based encoding.\n\n Args:\n values (pd.Series): Series to be transformed.\n\n Returns:\n pd.Series: Ordinal encoded values.\n \"\"\"\n enc = pd.Series(range(len(self.categories)), index=list(self.categories))\n s = enc[values]\n s.index = values.index\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.validate_candidental","title":"validate_candidental(self, values)
","text":"Method to validate the suggested candidates
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with candidates
requiredExceptions:
Type DescriptionValueError
when not all values for a feature are one of the allowed categories
Returns:
Type Descriptionpd.Series
The passed dataFrame with candidates
Source code inbofire/data_models/features/categorical.py
def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the suggested candidates\n\n Args:\n values (pd.Series): A dataFrame with candidates\n\n Raises:\n ValueError: when not all values for a feature are one of the allowed categories\n\n Returns:\n pd.Series: The passed dataFrame with candidates\n \"\"\"\n values = values.map(str)\n if sum(values.isin(self.get_allowed_categories())) != len(values):\n raise ValueError(\n f\"not all values of input feature `{self.key}` are a valid allowed category from {self.get_allowed_categories()}\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalInput.validate_experimental","title":"validate_experimental(self, values, strict=False)
","text":"Method to validate the experimental dataFrame
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with experiments
requiredstrict
bool
Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.
False
Exceptions:
Type DescriptionValueError
when an entry is not in the list of allowed categories
ValueError
when there is no variation in a feature provided by the experimental data
Returns:
Type Descriptionpd.Series
A dataFrame with experiments
Source code inbofire/data_models/features/categorical.py
def validate_experimental(\n self, values: pd.Series, strict: bool = False\n) -> pd.Series:\n \"\"\"Method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.\n\n Raises:\n ValueError: when an entry is not in the list of allowed categories\n ValueError: when there is no variation in a feature provided by the experimental data\n\n Returns:\n pd.Series: A dataFrame with experiments\n \"\"\"\n values = values.map(str)\n if sum(values.isin(self.categories)) != len(values):\n raise ValueError(\n f\"invalid values for `{self.key}`, allowed are: `{self.categories}`\"\n )\n if strict:\n possible_categories = self.get_possible_categories(values)\n if len(possible_categories) != len(self.categories):\n raise ValueError(\n f\"Categories {list(set(self.categories)-set(possible_categories))} of feature {self.key} not used. Remove them.\"\n )\n return values\n
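The effect of `strict` in a sketch (hypothetical feature): membership is always checked against `categories`, while `strict=True` additionally requires that every category is either allowed or present in the data:
```python
import pandas as pd

from bofire.data_models.features.categorical import CategoricalInput

feat = CategoricalInput(key="solvent", categories=["water", "EtOH"], allowed=[True, False])

feat.validate_experimental(pd.Series(["water", "EtOH"]), strict=True)  # passes
try:
    feat.validate_experimental(pd.Series(["water", "water"]), strict=True)
except ValueError as err:
    print(err)  # "EtOH" is neither allowed nor present in the data -> remove it
```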
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalOutput","title":" CategoricalOutput (Output)
","text":"Source code in bofire/data_models/features/categorical.py
class CategoricalOutput(Output):\n type: Literal[\"CategoricalOutput\"] = \"CategoricalOutput\"\n order_id: ClassVar[int] = 10\n\n categories: CategoryVals\n objective: AnyCategoricalObjective\n\n @model_validator(mode=\"after\")\n def validate_objective_categories(self):\n \"\"\"validates that objective categories match the output categories\n\n Raises:\n ValueError: when categories do not match objective categories\n\n Returns:\n self\n \"\"\"\n if self.objective.categories != self.categories: # type: ignore\n raise ValueError(\"categories must match to objective categories\")\n return self\n\n def __call__(self, values: pd.Series) -> pd.Series:\n if self.objective is None:\n return pd.Series(\n data=[np.nan for _ in range(len(values))],\n index=values.index,\n name=values.name,\n )\n return self.objective(values) # type: ignore\n\n def validate_experimental(self, values: pd.Series) -> pd.Series:\n values = values.map(str)\n if sum(values.isin(self.categories)) != len(values):\n raise ValueError(\n f\"invalid values for `{self.key}`, allowed are: `{self.categories}`\"\n )\n return values\n\n def __str__(self) -> str:\n return \"CategoricalOutputFeature\"\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalOutput.validate_experimental","title":"validate_experimental(self, values)
","text":"Abstract method to validate the experimental Series
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with values for the outcome
requiredReturns:
Type Descriptionpd.Series
The passed dataFrame with experiments
Source code inbofire/data_models/features/categorical.py
def validate_experimental(self, values: pd.Series) -> pd.Series:\n values = values.map(str)\n if sum(values.isin(self.categories)) != len(values):\n raise ValueError(\n f\"invalid values for `{self.key}`, allowed are: `{self.categories}`\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.categorical.CategoricalOutput.validate_objective_categories","title":"validate_objective_categories(self)
","text":"validates that objective categories match the output categories
Exceptions:
Type DescriptionValueError
when categories do not match objective categories
Returns:
Type Descriptionself
Source code inbofire/data_models/features/categorical.py
@model_validator(mode=\"after\")\ndef validate_objective_categories(self):\n \"\"\"validates that objective categories match the output categories\n\n Raises:\n ValueError: when categories do not match objective categories\n\n Returns:\n self\n \"\"\"\n if self.objective.categories != self.categories: # type: ignore\n raise ValueError(\"categories must match to objective categories\")\n return self\n
"},{"location":"ref-features/#bofire.data_models.features.continuous","title":"continuous
","text":""},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput","title":" ContinuousInput (NumericalInput)
","text":"Base class for all continuous input features.
Attributes:
Name Type Descriptionbounds
Tuple[float, float]
A tuple that stores the lower and upper bound of the feature.
stepsize
float
Float indicating the allowed stepsize between lower and upper. Defaults to None.
local_relative_bounds
Tuple[float, float]
A tuple that stores the lower and upper bounds relative to a reference value. Defaults to None.
Source code inbofire/data_models/features/continuous.py
class ContinuousInput(NumericalInput):\n \"\"\"Base class for all continuous input features.\n\n Attributes:\n bounds (Tuple[float, float]): A tuple that stores the lower and upper bound of the feature.\n stepsize (float, optional): Float indicating the allowed stepsize between lower and upper. Defaults to None.\n local_relative_bounds (Tuple[float, float], optional): A tuple that stores the lower and upper bounds relative to a reference value.\n Defaults to None.\n \"\"\"\n\n type: Literal[\"ContinuousInput\"] = \"ContinuousInput\"\n order_id: ClassVar[int] = 1\n\n bounds: Tuple[float, float]\n local_relative_bounds: Optional[\n Tuple[Annotated[float, Field(gt=0)], Annotated[float, Field(gt=0)]]\n ] = None\n stepsize: Optional[float] = None\n\n @property\n def lower_bound(self) -> float:\n return self.bounds[0]\n\n @property\n def upper_bound(self) -> float:\n return self.bounds[1]\n\n @model_validator(mode=\"after\")\n def validate_step_size(self):\n if self.stepsize is None:\n return self\n lower, upper = self.bounds\n if lower == upper and self.stepsize is not None:\n raise ValueError(\n \"Stepsize cannot be provided for a fixed continuous input.\"\n )\n range = upper - lower\n if np.arange(lower, upper + self.stepsize, self.stepsize)[-1] != upper:\n raise ValueError(\n f\"Stepsize of {self.stepsize} does not match the provided interval [{lower},{upper}].\"\n )\n if range // self.stepsize == 1:\n raise ValueError(\"Stepsize is too big, only one value allowed.\")\n return self\n\n def round(self, values: pd.Series) -> pd.Series:\n \"\"\"Round values to the stepsize of the feature. If no stepsize is provided return the\n provided values.\n\n Args:\n values (pd.Series): The values that should be rounded.\n\n Returns:\n pd.Series: The rounded values\n \"\"\"\n if self.stepsize is None:\n return values\n self.validate_candidental(values=values)\n allowed_values = np.arange(\n self.lower_bound, self.upper_bound + self.stepsize, self.stepsize\n )\n idx = abs(values.values.reshape([len(values), 1]) - allowed_values).argmin( # type: ignore\n axis=1\n )\n return pd.Series(\n data=self.lower_bound + idx * self.stepsize, index=values.index\n )\n\n @field_validator(\"bounds\")\n @classmethod\n def validate_lower_upper(cls, bounds):\n \"\"\"Validates that the lower bound is lower than the upper bound\n\n Args:\n values (Dict): Dictionary with attributes key, lower and upper bound\n\n Raises:\n ValueError: when the lower bound is higher than the upper bound\n\n Returns:\n Dict: The attributes as dictionary\n \"\"\"\n if bounds[0] > bounds[1]:\n raise ValueError(\n f\"lower bound must be <= upper bound, got {bounds[0]} > {bounds[1]}\"\n )\n return bounds\n\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the suggested candidates\n\n Args:\n values (pd.Series): A dataFrame with candidates\n\n Raises:\n ValueError: when non numerical values are passed\n ValueError: when values are larger than the upper bound of the feature\n ValueError: when values are lower than the lower bound of the feature\n\n Returns:\n pd.Series: The passed dataFrame with candidates\n \"\"\"\n\n noise = 10e-6\n values = super().validate_candidental(values)\n if (values < self.lower_bound - noise).any():\n raise ValueError(\n f\"not all values of input feature `{self.key}` are larger than lower bound `{self.lower_bound}` \"\n )\n if (values > self.upper_bound + noise).any():\n raise ValueError(\n f\"not all values of input feature `{self.key}` are smaller than upper bound `{self.upper_bound}` \"\n )\n return values\n\n def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key,\n data=np.random.default_rng(seed=seed).uniform(\n self.lower_bound, self.upper_bound, n\n ),\n )\n\n def get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[float] = None,\n ) -> Tuple[List[float], List[float]]:\n assert transform_type is None\n if reference_value is not None and values is not None:\n raise ValueError(\"Only one can be used, `reference_value` or `values`.\")\n if values is None:\n if reference_value is None or self.is_fixed():\n return [self.lower_bound], [self.upper_bound]\n else:\n local_relative_bounds = self.local_relative_bounds or (\n math.inf,\n math.inf,\n )\n return [\n max(\n reference_value - local_relative_bounds[0],\n self.lower_bound,\n )\n ], [\n min(\n reference_value + local_relative_bounds[1],\n self.upper_bound,\n )\n ]\n lower = min(self.lower_bound, values.min()) # type: ignore\n upper = max(self.upper_bound, values.max()) # type: ignore\n return [lower], [upper]\n\n def __str__(self) -> str:\n \"\"\"Method to return a string of lower and upper bound\n\n Returns:\n str: String of a list with lower and upper bound\n \"\"\"\n return f\"[{self.lower_bound},{self.upper_bound}]\"\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.__str__","title":"__str__(self)
special
","text":"Method to return a string of lower and upper bound
Returns:
Type Descriptionstr
String of a list with lower and upper bound
Source code inbofire/data_models/features/continuous.py
def __str__(self) -> str:\n \"\"\"Method to return a string of lower and upper bound\n\n Returns:\n str: String of a list with lower and upper bound\n \"\"\"\n return f\"[{self.lower_bound},{self.upper_bound}]\"\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.get_bounds","title":"get_bounds(self, transform_type=None, values=None, reference_value=None)
","text":"Returns the bounds of an input feature depending on the requested transform type.
Parameters:
Name Type Description Defaulttransform_type
Optional[TTransform]
The requested transform type. Defaults to None.
None
values
Optional[pd.Series]
If values are provided the bounds are returned taking the most extreme values for the feature into account. Defaults to None.
None
reference_value
Optional[float]
If a reference value is provided, then the local bounds based on a local search region are provided. Currently only supported for continuous inputs. For more details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
List of lower bound values, list of upper bound values.
Source code inbofire/data_models/features/continuous.py
def get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[float] = None,\n) -> Tuple[List[float], List[float]]:\n assert transform_type is None\n if reference_value is not None and values is not None:\n raise ValueError(\"Only one can be used, `reference_value` or `values`.\")\n if values is None:\n if reference_value is None or self.is_fixed():\n return [self.lower_bound], [self.upper_bound]\n else:\n local_relative_bounds = self.local_relative_bounds or (\n math.inf,\n math.inf,\n )\n return [\n max(\n reference_value - local_relative_bounds[0],\n self.lower_bound,\n )\n ], [\n min(\n reference_value + local_relative_bounds[1],\n self.upper_bound,\n )\n ]\n lower = min(self.lower_bound, values.min()) # type: ignore\n upper = max(self.upper_bound, values.max()) # type: ignore\n return [lower], [upper]\n
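A sketch of the local search region behaviour (hypothetical feature): without a reference value the global bounds are returned; with one, the bounds are clipped to `reference_value ± local_relative_bounds`:
```python
from bofire.data_models.features.continuous import ContinuousInput

feat = ContinuousInput(
    key="temperature", bounds=(0, 100), local_relative_bounds=(5.0, 5.0)
)
print(feat.get_bounds())                      # ([0.0], [100.0])
print(feat.get_bounds(reference_value=50.0))  # ([45.0], [55.0])
```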
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.round","title":"round(self, values)
","text":"Round values to the stepsize of the feature. If no stepsize is provided return the provided values.
Parameters:
Name Type Description Defaultvalues
pd.Series
The values that should be rounded.
requiredReturns:
Type Descriptionpd.Series
The rounded values
Source code inbofire/data_models/features/continuous.py
def round(self, values: pd.Series) -> pd.Series:\n \"\"\"Round values to the stepsize of the feature. If no stepsize is provided return the\n provided values.\n\n Args:\n values (pd.Series): The values that should be rounded.\n\n Returns:\n pd.Series: The rounded values\n \"\"\"\n if self.stepsize is None:\n return values\n self.validate_candidental(values=values)\n allowed_values = np.arange(\n self.lower_bound, self.upper_bound + self.stepsize, self.stepsize\n )\n idx = abs(values.values.reshape([len(values), 1]) - allowed_values).argmin( # type: ignore\n axis=1\n )\n return pd.Series(\n data=self.lower_bound + idx * self.stepsize, index=values.index\n )\n
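For example (hypothetical feature): with `bounds=(0, 10)` and `stepsize=2.5` the allowed grid is 0, 2.5, 5, 7.5, 10, and values snap to the nearest grid point:
```python
import pandas as pd

from bofire.data_models.features.continuous import ContinuousInput

feat = ContinuousInput(key="temperature", bounds=(0, 10), stepsize=2.5)
print(feat.round(pd.Series([1.3, 7.6])).tolist())  # [2.5, 7.5]
```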
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.sample","title":"sample(self, n, seed=None)
","text":"Draw random samples from the feature.
Parameters:
Name Type Description Defaultn
int
number of samples.
requiredReturns:
Type Descriptionpd.Series
drawn samples.
Source code inbofire/data_models/features/continuous.py
def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key,\n data=np.random.default_rng(seed=seed).uniform(\n self.lower_bound, self.upper_bound, n\n ),\n )\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.validate_candidental","title":"validate_candidental(self, values)
","text":"Method to validate the suggested candidates
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with candidates
requiredExceptions:
Type DescriptionValueError
when non numerical values are passed
ValueError
when values are larger than the upper bound of the feature
ValueError
when values are lower than the lower bound of the feature
Returns:
Type Descriptionpd.Series
The passed dataFrame with candidates
Source code inbofire/data_models/features/continuous.py
def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the suggested candidates\n\n Args:\n values (pd.Series): A dataFrame with candidates\n\n Raises:\n ValueError: when non numerical values are passed\n ValueError: when values are larger than the upper bound of the feature\n ValueError: when values are lower than the lower bound of the feature\n\n Returns:\n pd.Series: The passed dataFrame with candidates\n \"\"\"\n\n noise = 10e-6\n values = super().validate_candidental(values)\n if (values < self.lower_bound - noise).any():\n raise ValueError(\n f\"not all values of input feature `{self.key}` are larger than lower bound `{self.lower_bound}` \"\n )\n if (values > self.upper_bound + noise).any():\n raise ValueError(\n f\"not all values of input feature `{self.key}` are smaller than upper bound `{self.upper_bound}` \"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousInput.validate_lower_upper","title":"validate_lower_upper(bounds)
classmethod
","text":"Validates that the lower bound is lower than the upper bound
Parameters:
Name Type Description Defaultvalues
Dict
Dictionary with attributes key, lower and upper bound
requiredExceptions:
Type DescriptionValueError
when the lower bound is higher than the upper bound
Returns:
Type DescriptionDict
The attributes as dictionary
Source code inbofire/data_models/features/continuous.py
@field_validator(\"bounds\")\n@classmethod\ndef validate_lower_upper(cls, bounds):\n \"\"\"Validates that the lower bound is lower than the upper bound\n\n Args:\n values (Dict): Dictionary with attributes key, lower and upper bound\n\n Raises:\n ValueError: when the lower bound is higher than the upper bound\n\n Returns:\n Dict: The attributes as dictionary\n \"\"\"\n if bounds[0] > bounds[1]:\n raise ValueError(\n f\"lower bound must be <= upper bound, got {bounds[0]} > {bounds[1]}\"\n )\n return bounds\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousOutput","title":" ContinuousOutput (Output)
","text":"The base class for a continuous output feature
Attributes:
Name Type Descriptionobjective
objective
objective of the feature indicating in which direction it should be optimized. Defaults to MaximizeObjective
.
Source code inbofire/data_models/features/continuous.py
class ContinuousOutput(Output):\n \"\"\"The base class for a continuous output feature\n\n Attributes:\n objective (objective, optional): objective of the feature indicating in which direction it should be optimized. Defaults to `MaximizeObjective`.\n \"\"\"\n\n type: Literal[\"ContinuousOutput\"] = \"ContinuousOutput\"\n order_id: ClassVar[int] = 9\n unit: Optional[str] = None\n\n objective: Optional[AnyObjective] = Field(\n default_factory=lambda: MaximizeObjective(w=1.0)\n )\n\n def __call__(self, values: pd.Series) -> pd.Series:\n if self.objective is None:\n return pd.Series(\n data=[np.nan for _ in range(len(values))],\n index=values.index,\n name=values.name,\n )\n return self.objective(values) # type: ignore\n\n def validate_experimental(self, values: pd.Series) -> pd.Series:\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n return values\n\n def __str__(self) -> str:\n return \"ContinuousOutputFeature\"\n
"},{"location":"ref-features/#bofire.data_models.features.continuous.ContinuousOutput.validate_experimental","title":"validate_experimental(self, values)
","text":"Abstract method to validate the experimental Series
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with values for the outcome
requiredReturns:
Type Descriptionpd.Series
The passed dataFrame with experiments
Source code inbofire/data_models/features/continuous.py
def validate_experimental(self, values: pd.Series) -> pd.Series:\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor","title":"descriptor
","text":""},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput","title":" CategoricalDescriptorInput (CategoricalInput)
","text":"Class for categorical input features with descriptors
Attributes:
Name Type Descriptioncategories
List[str]
Names of the categories.
allowed
List[bool]
List of bools indicating if a category is allowed within the optimization.
descriptors
List[str]
List of strings representing the names of the descriptors.
values
List[List[float]]
List of lists representing the descriptor values.
Source code inbofire/data_models/features/descriptor.py
class CategoricalDescriptorInput(CategoricalInput):\n \"\"\"Class for categorical input features with descriptors\n\n Attributes:\n categories (List[str]): Names of the categories.\n allowed (List[bool]): List of bools indicating if a category is allowed within the optimization.\n descriptors (List[str]): List of strings representing the names of the descriptors.\n values (List[List[float]]): List of lists representing the descriptor values.\n \"\"\"\n\n type: Literal[\"CategoricalDescriptorInput\"] = \"CategoricalDescriptorInput\"\n order_id: ClassVar[int] = 6\n\n descriptors: Descriptors\n values: Annotated[\n List[List[float]],\n Field(min_length=1),\n ]\n\n @field_validator(\"values\")\n @classmethod\n def validate_values(cls, v, info):\n \"\"\"validates the compatibility of passed values for the descriptors and the defined categories\n\n Args:\n v (List[List[float]]): Nested list with descriptor values\n values (Dict): Dictionary with attributes\n\n Raises:\n ValueError: when values have different length than categories\n ValueError: when rows in values have different length than descriptors\n ValueError: when a descriptor shows no variance in the data\n\n Returns:\n List[List[float]]: Nested list with descriptor values\n \"\"\"\n if len(v) != len(info.data[\"categories\"]):\n raise ValueError(\"values must have same length as categories\")\n for row in v:\n if len(row) != len(info.data[\"descriptors\"]):\n raise ValueError(\"rows in values must have same length as descriptors\")\n a = np.array(v)\n for i, d in enumerate(info.data[\"descriptors\"]):\n if len(set(a[:, i])) == 1:\n raise ValueError(f\"No variation for descriptor {d}.\")\n return v\n\n @staticmethod\n def valid_transform_types() -> List[CategoricalEncodingEnum]:\n return [\n CategoricalEncodingEnum.ONE_HOT,\n CategoricalEncodingEnum.DUMMY,\n CategoricalEncodingEnum.ORDINAL,\n CategoricalEncodingEnum.DESCRIPTOR,\n ]\n\n def to_df(self):\n \"\"\"tabular overview of the feature as DataFrame\n\n Returns:\n pd.DataFrame: tabular overview of the feature as DataFrame\n \"\"\"\n data = dict(zip(self.categories, self.values))\n return pd.DataFrame.from_dict(data, orient=\"index\", columns=self.descriptors)\n\n def fixed_value(\n self, transform_type: Optional[TTransform] = None\n ) -> Union[List[str], List[float], None]:\n \"\"\"Returns the categories to which the feature is fixed, None if the feature is not fixed\n\n Returns:\n List[str]: List of categories or None\n \"\"\"\n if transform_type != CategoricalEncodingEnum.DESCRIPTOR:\n return super().fixed_value(transform_type)\n else:\n val = self.get_allowed_categories()[0]\n return self.to_descriptor_encoding(pd.Series([val])).values[0].tolist()\n\n def get_bounds(\n self,\n transform_type: TTransform,\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n ) -> Tuple[List[float], List[float]]:\n if transform_type != CategoricalEncodingEnum.DESCRIPTOR:\n return super().get_bounds(transform_type, values)\n else:\n # in case that values is None, we return the optimization bounds\n # else we return the complete bounds\n if values is None:\n df = self.to_df().loc[self.get_allowed_categories()]\n else:\n df = self.to_df()\n lower = df.min().values.tolist() # type: ignore\n upper = df.max().values.tolist() # type: ignore\n return lower, upper\n\n def validate_experimental(\n self, values: pd.Series, strict: bool = False\n ) -> pd.Series:\n \"\"\"Method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.\n\n Raises:\n ValueError: when an entry is not in the list of allowed categories\n ValueError: when there is no variation in a feature provided by the experimental data\n ValueError: when no variation is present or planned for a given descriptor\n\n Returns:\n pd.Series: A dataFrame with experiments\n \"\"\"\n values = super().validate_experimental(values, strict)\n if strict:\n lower, upper = self.get_bounds(\n transform_type=CategoricalEncodingEnum.DESCRIPTOR, values=values\n )\n for i, desc in enumerate(self.descriptors):\n if lower[i] == upper[i]:\n raise ValueError(\n f\"No variation present or planned for descriptor {desc} for feature {self.key}. Remove the descriptor.\"\n )\n return values\n\n @classmethod\n def from_df(cls, key: str, df: pd.DataFrame):\n \"\"\"Creates a feature from a dataframe\n\n Args:\n key (str): The name of the feature\n df (pd.DataFrame): Categories as rows and descriptors as columns\n\n Returns:\n CategoricalDescriptorInput: The feature created from the dataframe.\n \"\"\"\n return cls(\n key=key,\n categories=list(df.index),\n allowed=[True for _ in range(len(df))],\n descriptors=list(df.columns),\n values=df.values.tolist(),\n )\n\n def to_descriptor_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to descriptor encoding.\n\n Args:\n values (pd.Series): Values to transform.\n\n Returns:\n pd.DataFrame: Descriptor encoded dataframe.\n \"\"\"\n return pd.DataFrame(\n data=values.map(dict(zip(self.categories, self.values))).values.tolist(), # type: ignore\n columns=[get_encoded_name(self.key, d) for d in self.descriptors],\n index=values.index,\n )\n\n def from_descriptor_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Converts values back from descriptor encoding.\n\n Args:\n values (pd.DataFrame): Descriptor encoded dataframe.\n\n Raises:\n ValueError: If descriptor columns not found in the dataframe.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, d) for d in self.descriptors]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = pd.DataFrame(\n data=np.sqrt(\n np.sum(\n (\n values[cat_cols].to_numpy()[:, np.newaxis, :]\n - self.to_df().iloc[self.allowed].to_numpy()\n )\n ** 2,\n axis=2,\n )\n ),\n columns=self.get_allowed_categories(),\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.fixed_value","title":"fixed_value(self, transform_type=None)
","text":"Returns the categories to which the feature is fixed, None if the feature is not fixed
Returns:
Type DescriptionList[str]
List of categories or None
Source code inbofire/data_models/features/descriptor.py
def fixed_value(\n self, transform_type: Optional[TTransform] = None\n) -> Union[List[str], List[float], None]:\n \"\"\"Returns the categories to which the feature is fixed, None if the feature is not fixed\n\n Returns:\n List[str]: List of categories or None\n \"\"\"\n if transform_type != CategoricalEncodingEnum.DESCRIPTOR:\n return super().fixed_value(transform_type)\n else:\n val = self.get_allowed_categories()[0]\n return self.to_descriptor_encoding(pd.Series([val])).values[0].tolist()\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.from_descriptor_encoding","title":"from_descriptor_encoding(self, values)
","text":"Converts values back from descriptor encoding.
Parameters:
Name Type Description Defaultvalues
pd.DataFrame
Descriptor encoded dataframe.
requiredExceptions:
Type DescriptionValueError
If descriptor columns not found in the dataframe.
Returns:
Type Descriptionpd.Series
Series with categorical values.
Source code inbofire/data_models/features/descriptor.py
def from_descriptor_encoding(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Converts values back from descriptor encoding.\n\n Args:\n values (pd.DataFrame): Descriptor encoded dataframe.\n\n Raises:\n ValueError: If descriptor columns not found in the dataframe.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n cat_cols = [get_encoded_name(self.key, d) for d in self.descriptors]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = pd.DataFrame(\n data=np.sqrt(\n np.sum(\n (\n values[cat_cols].to_numpy()[:, np.newaxis, :]\n - self.to_df().iloc[self.allowed].to_numpy()\n )\n ** 2,\n axis=2,\n )\n ),\n columns=self.get_allowed_categories(),\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.from_df","title":"from_df(key, df)
classmethod
","text":"Creates a feature from a dataframe
Parameters:
Name Type Description Defaultkey
str
The name of the feature
requireddf
pd.DataFrame
Categories as rows and descriptors as columns
requiredReturns:
Type Description_type_
description
Source code inbofire/data_models/features/descriptor.py
@classmethod\ndef from_df(cls, key: str, df: pd.DataFrame):\n \"\"\"Creates a feature from a dataframe\n\n Args:\n key (str): The name of the feature\n df (pd.DataFrame): Categories as rows and descriptors as columns\n\n Returns:\n CategoricalDescriptorInput: The feature created from the dataframe.\n \"\"\"\n return cls(\n key=key,\n categories=list(df.index),\n allowed=[True for _ in range(len(df))],\n descriptors=list(df.columns),\n values=df.values.tolist(),\n )\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.get_bounds","title":"get_bounds(self, transform_type, values=None, reference_value=None)
","text":"Returns the bounds of an input feature depending on the requested transform type.
Parameters:
Name Type Description Defaulttransform_type
Optional[TTransform]
The requested transform type. Defaults to None.
requiredvalues
Optional[pd.Series]
If values are provided the bounds are returned taking the most extreme values for the feature into account. Defaults to None.
None
reference_value
Optional[float]
If a reference value is provided, then the local bounds based on a local search region are provided. Currently only supported for continuous inputs. For more details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
List of lower bound values, list of upper bound values.
Source code inbofire/data_models/features/descriptor.py
def get_bounds(\n self,\n transform_type: TTransform,\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n) -> Tuple[List[float], List[float]]:\n if transform_type != CategoricalEncodingEnum.DESCRIPTOR:\n return super().get_bounds(transform_type, values)\n else:\n # in case that values is None, we return the optimization bounds\n # else we return the complete bounds\n if values is None:\n df = self.to_df().loc[self.get_allowed_categories()]\n else:\n df = self.to_df()\n lower = df.min().values.tolist() # type: ignore\n upper = df.max().values.tolist() # type: ignore\n return lower, upper\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.to_descriptor_encoding","title":"to_descriptor_encoding(self, values)
","text":"Converts values to descriptor encoding.
Parameters:
Name Type Description Defaultvalues
pd.Series
Values to transform.
requiredReturns:
Type Descriptionpd.DataFrame
Descriptor encoded dataframe.
Source code inbofire/data_models/features/descriptor.py
def to_descriptor_encoding(self, values: pd.Series) -> pd.DataFrame:\n \"\"\"Converts values to descriptor encoding.\n\n Args:\n values (pd.Series): Values to transform.\n\n Returns:\n pd.DataFrame: Descriptor encoded dataframe.\n \"\"\"\n return pd.DataFrame(\n data=values.map(dict(zip(self.categories, self.values))).values.tolist(), # type: ignore\n columns=[get_encoded_name(self.key, d) for d in self.descriptors],\n index=values.index,\n )\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.to_df","title":"to_df(self)
","text":"tabular overview of the feature as DataFrame
Returns:
Type Descriptionpd.DataFrame
tabular overview of the feature as DataFrame
Source code inbofire/data_models/features/descriptor.py
def to_df(self):\n \"\"\"tabular overview of the feature as DataFrame\n\n Returns:\n pd.DataFrame: tabular overview of the feature as DataFrame\n \"\"\"\n data = dict(zip(self.categories, self.values))\n return pd.DataFrame.from_dict(data, orient=\"index\", columns=self.descriptors)\n
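A compact sketch tying `from_df`, `to_df` and the descriptor encoding together (the solvent descriptors are hypothetical):
```python
import pandas as pd

from bofire.data_models.features.descriptor import CategoricalDescriptorInput

df = pd.DataFrame(
    {"polarity": [1.0, 0.5], "boiling_point": [100.0, 78.0]},
    index=["water", "EtOH"],
)
feat = CategoricalDescriptorInput.from_df("solvent", df)
print(feat.to_df())  # categories as rows, descriptors as columns

encoded = feat.to_descriptor_encoding(pd.Series(["EtOH", "water"]))
print(feat.from_descriptor_encoding(encoded).tolist())  # ['EtOH', 'water']
```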
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.validate_experimental","title":"validate_experimental(self, values, strict=False)
","text":"Method to validate the experimental dataFrame
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with experiments
requiredstrict
bool
Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.
False
Exceptions:
Type DescriptionValueError
when an entry is not in the list of allowed categories
ValueError
when there is no variation in a feature provided by the experimental data
ValueError
when no variation is present or planned for a given descriptor
Returns:
Type Descriptionpd.Series
A dataFrame with experiments
Source code inbofire/data_models/features/descriptor.py
def validate_experimental(\n self, values: pd.Series, strict: bool = False\n) -> pd.Series:\n \"\"\"Method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.\n\n Raises:\n ValueError: when an entry is not in the list of allowed categories\n ValueError: when there is no variation in a feature provided by the experimental data\n ValueError: when no variation is present or planned for a given descriptor\n\n Returns:\n pd.Series: A dataFrame with experiments\n \"\"\"\n values = super().validate_experimental(values, strict)\n if strict:\n lower, upper = self.get_bounds(\n transform_type=CategoricalEncodingEnum.DESCRIPTOR, values=values\n )\n for i, desc in enumerate(self.descriptors):\n if lower[i] == upper[i]:\n raise ValueError(\n f\"No variation present or planned for descriptor {desc} for feature {self.key}. Remove the descriptor.\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.CategoricalDescriptorInput.validate_values","title":"validate_values(v, info)
classmethod
","text":"validates the compatability of passed values for the descriptors and the defined categories
Parameters:
Name Type Description Defaultv
List[List[float]]
Nested list with descriptor values
requiredvalues
Dict
Dictionary with attributes
requiredExceptions:
Type DescriptionValueError
when values have different length than categories
ValueError
when rows in values have different length than descriptors
ValueError
when a descriptor shows no variance in the data
Returns:
Type DescriptionList[List[float]]
Nested list with descriptor values
Source code inbofire/data_models/features/descriptor.py
@field_validator(\"values\")\n@classmethod\ndef validate_values(cls, v, info):\n \"\"\"validates the compatability of passed values for the descriptors and the defined categories\n\n Args:\n v (List[List[float]]): Nested list with descriptor values\n values (Dict): Dictionary with attributes\n\n Raises:\n ValueError: when values have different length than categories\n ValueError: when rows in values have different length than descriptors\n ValueError: when a descriptor shows no variance in the data\n\n Returns:\n List[List[float]]: Nested list with descriptor values\n \"\"\"\n if len(v) != len(info.data[\"categories\"]):\n raise ValueError(\"values must have same length as categories\")\n for row in v:\n if len(row) != len(info.data[\"descriptors\"]):\n raise ValueError(\"rows in values must have same length as descriptors\")\n a = np.array(v)\n for i, d in enumerate(info.data[\"descriptors\"]):\n if len(set(a[:, i])) == 1:\n raise ValueError(f\"No variation for descriptor {d}.\")\n return v\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.ContinuousDescriptorInput","title":" ContinuousDescriptorInput (ContinuousInput)
","text":"Class for continuous input features with descriptors
Attributes:
Name Type Descriptionlower_bound
float
Lower bound of the feature in the optimization.
upper_bound
float
Upper bound of the feature in the optimization.
descriptors
List[str]
Names of the descriptors.
values
List[float]
Values of the descriptors.
Source code inbofire/data_models/features/descriptor.py
class ContinuousDescriptorInput(ContinuousInput):\n \"\"\"Class for continuous input features with descriptors\n\n Attributes:\n lower_bound (float): Lower bound of the feature in the optimization.\n upper_bound (float): Upper bound of the feature in the optimization.\n descriptors (List[str]): Names of the descriptors.\n values (List[float]): Values of the descriptors.\n \"\"\"\n\n type: Literal[\"ContinuousDescriptorInput\"] = \"ContinuousDescriptorInput\"\n order_id: ClassVar[int] = 2\n\n descriptors: Descriptors\n values: DiscreteVals\n\n @model_validator(mode=\"after\")\n def validate_list_lengths(self):\n \"\"\"compares the length of the defined descriptors list with the provided values\n\n Args:\n values (Dict): Dictionary with all attributes\n\n Raises:\n ValueError: when the number of descriptors does not match the number of provided values\n\n Returns:\n Dict: Dict with the attributes\n \"\"\"\n if len(self.descriptors) != len(self.values):\n raise ValueError(\n f\"must provide same number of descriptors and values, got {len(self.descriptors)} != {len(self.values)}\"\n )\n return self\n\n def to_df(self) -> pd.DataFrame:\n \"\"\"tabular overview of the feature as DataFrame\n\n Returns:\n pd.DataFrame: tabular overview of the feature as DataFrame\n \"\"\"\n return pd.DataFrame(\n data=[self.values], index=[self.key], columns=self.descriptors\n )\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.ContinuousDescriptorInput.to_df","title":"to_df(self)
","text":"tabular overview of the feature as DataFrame
Returns:
Type Descriptionpd.DataFrame
tabular overview of the feature as DataFrame
Source code inbofire/data_models/features/descriptor.py
def to_df(self) -> pd.DataFrame:\n \"\"\"tabular overview of the feature as DataFrame\n\n Returns:\n pd.DataFrame: tabular overview of the feature as DataFrame\n \"\"\"\n return pd.DataFrame(\n data=[self.values], index=[self.key], columns=self.descriptors\n )\n
"},{"location":"ref-features/#bofire.data_models.features.descriptor.ContinuousDescriptorInput.validate_list_lengths","title":"validate_list_lengths(self)
","text":"compares the length of the defined descriptors list with the provided values
Parameters:
Name Type Description Defaultvalues
Dict
Dictionary with all attributes
requiredExceptions:
Type DescriptionValueError
when the number of descriptors does not match the number of provided values
Returns:
Type DescriptionDict
Dict with the attributes
Source code inbofire/data_models/features/descriptor.py
@model_validator(mode=\"after\")\ndef validate_list_lengths(self):\n \"\"\"compares the length of the defined descriptors list with the provided values\n\n Args:\n values (Dict): Dictionary with all attribues\n\n Raises:\n ValueError: when the number of descriptors does not math the number of provided values\n\n Returns:\n Dict: Dict with the attributes\n \"\"\"\n if len(self.descriptors) != len(self.values):\n raise ValueError(\n 'must provide same number of descriptors and values, got {len(values[\"descriptors\"])} != {len(values[\"values\"])}'\n )\n return self\n
"},{"location":"ref-features/#bofire.data_models.features.discrete","title":"discrete
","text":""},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput","title":" DiscreteInput (NumericalInput)
","text":"Feature with discretized ordinal values allowed in the optimization.
Attributes:
Name Type Descriptionkey(str)
key of the feature.
values(List[float])
the discretized allowed values during the optimization.
Source code inbofire/data_models/features/discrete.py
class DiscreteInput(NumericalInput):\n \"\"\"Feature with discretized ordinal values allowed in the optimization.\n\n Attributes:\n key(str): key of the feature.\n values(List[float]): the discretized allowed values during the optimization.\n \"\"\"\n\n type: Literal[\"DiscreteInput\"] = \"DiscreteInput\"\n order_id: ClassVar[int] = 3\n\n values: DiscreteVals\n\n @field_validator(\"values\")\n @classmethod\n def validate_values_unique(cls, values):\n \"\"\"Validates that provided values are unique.\n\n Args:\n values (List[float]): List of values\n\n Raises:\n ValueError: when values are non-unique.\n ValueError: when values contains only one entry.\n ValueError: when values is empty.\n\n Returns:\n List[values]: Sorted list of values\n \"\"\"\n if len(values) != len(set(values)):\n raise ValueError(\"Discrete values must be unique\")\n if len(values) == 1:\n raise ValueError(\n \"Fixed discrete inputs are not supported. Please use a fixed continuous input.\"\n )\n if len(values) == 0:\n raise ValueError(\"No values defined.\")\n return sorted(values)\n\n @property\n def lower_bound(self) -> float:\n \"\"\"Lower bound of the set of allowed values\"\"\"\n return min(self.values)\n\n @property\n def upper_bound(self) -> float:\n \"\"\"Upper bound of the set of allowed values\"\"\"\n return max(self.values)\n\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the provided candidates.\n\n Args:\n values (pd.Series): suggested candidates for the feature\n\n Raises:\n ValueError: Raises error when one of the provided values is not contained in the list of allowed values.\n\n Returns:\n pd.Series: Suggested candidates for the feature\n \"\"\"\n values = super().validate_candidental(values)\n if not np.isin(values.to_numpy(), np.array(self.values)).all():\n raise ValueError(\n f\"Not allowed values in candidates for feature {self.key}.\"\n )\n return values\n\n def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key, data=np.random.default_rng(seed=seed).choice(self.values, n)\n )\n\n def from_continuous(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Rounds continuous values to the closest discrete ones.\n\n Args:\n values (pd.DataFrame): Dataframe with continuous entries.\n\n Returns:\n pd.Series: Series with discrete values.\n \"\"\"\n\n s = pd.DataFrame(\n data=np.abs(\n (values[self.key].to_numpy()[:, np.newaxis] - np.array(self.values))\n ),\n columns=self.values,\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n\n def get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[float] = None,\n ) -> Tuple[List[float], List[float]]:\n assert transform_type is None\n if values is None:\n return [self.lower_bound], [self.upper_bound] # type: ignore\n lower = min(self.lower_bound, values.min()) # type: ignore\n upper = max(self.upper_bound, values.max()) # type: ignore\n return [lower], [upper] # type: ignore\n
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.lower_bound","title":"lower_bound: float
property
readonly
","text":"Lower bound of the set of allowed values
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.upper_bound","title":"upper_bound: float
property
readonly
","text":"Upper bound of the set of allowed values
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.from_continuous","title":"from_continuous(self, values)
","text":"Rounds continuous values to the closest discrete ones.
Parameters:
Name Type Description Defaultvalues
pd.DataFrame
Dataframe with continuous entries.
requiredReturns:
Type Descriptionpd.Series
Series with discrete values.
Source code inbofire/data_models/features/discrete.py
def from_continuous(self, values: pd.DataFrame) -> pd.Series:\n \"\"\"Rounds continuous values to the closest discrete ones.\n\n Args:\n values (pd.DataFrame): Dataframe with continuous entries.\n\n Returns:\n pd.Series: Series with discrete values.\n \"\"\"\n\n s = pd.DataFrame(\n data=np.abs(\n (values[self.key].to_numpy()[:, np.newaxis] - np.array(self.values))\n ),\n columns=self.values,\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n
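A sketch of the snapping behaviour (hypothetical feature):
```python
import pandas as pd

from bofire.data_models.features.discrete import DiscreteInput

feat = DiscreteInput(key="n_layers", values=[1.0, 2.0, 4.0])
candidates = pd.DataFrame({"n_layers": [1.2, 3.4]})
print(feat.from_continuous(candidates).tolist())  # [1.0, 4.0]
```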
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.get_bounds","title":"get_bounds(self, transform_type=None, values=None, reference_value=None)
","text":"Returns the bounds of an input feature depending on the requested transform type.
Parameters:
Name Type Description Defaulttransform_type
Optional[TTransform]
The requested transform type. Defaults to None.
None
values
Optional[pd.Series]
If values are provided the bounds are returned taking the most extreme values for the feature into account. Defaults to None.
None
reference_value
Optional[float]
If a reference value is provided, then the local bounds based on a local search region are provided. Currently only supported for continuous inputs. For more details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
List of lower bound values, list of upper bound values.
Source code inbofire/data_models/features/discrete.py
def get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[float] = None,\n) -> Tuple[List[float], List[float]]:\n assert transform_type is None\n if values is None:\n return [self.lower_bound], [self.upper_bound] # type: ignore\n lower = min(self.lower_bound, values.min()) # type: ignore\n upper = max(self.upper_bound, values.max()) # type: ignore\n return [lower], [upper] # type: ignore\n
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.sample","title":"sample(self, n, seed=None)
","text":"Draw random samples from the feature.
Parameters:
Name Type Description Defaultn
int
number of samples.
requiredReturns:
Type Descriptionpd.Series
drawn samples.
Source code inbofire/data_models/features/discrete.py
def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Draw random samples from the feature.\n\n Args:\n n (int): number of samples.\n\n Returns:\n pd.Series: drawn samples.\n \"\"\"\n return pd.Series(\n name=self.key, data=np.random.default_rng(seed=seed).choice(self.values, n)\n )\n
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.validate_candidental","title":"validate_candidental(self, values)
","text":"Method to validate the provided candidates.
Parameters:
Name Type Description Defaultvalues
pd.Series
suggested candidates for the feature
requiredExceptions:
Type DescriptionValueError
Raises error when one of the provided values is not contained in the list of allowed values.
Returns:
Type Descriptionpd.Series
Suggested candidates for the feature
Source code inbofire/data_models/features/discrete.py
def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Method to validate the provided candidates.\n\n Args:\n values (pd.Series): suggested candidates for the feature\n\n Raises:\n ValueError: Raises error when one of the provided values is not contained in the list of allowed values.\n\n Returns:\n pd.Series: suggested candidates for the feature\n \"\"\"\n values = super().validate_candidental(values)\n if not np.isin(values.to_numpy(), np.array(self.values)).all():\n raise ValueError(\n f\"Not allowed values in candidates for feature {self.key}.\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.discrete.DiscreteInput.validate_values_unique","title":"validate_values_unique(values)
classmethod
","text":"Validates that provided values are unique.
Parameters:
Name Type Description Defaultvalues
List[float]
List of values
requiredExceptions:
Type DescriptionValueError
when values are non-unique.
ValueError
when values contains only one entry.
ValueError
when values is empty.
Returns:
Type DescriptionList[values]
Sorted list of values
Source code inbofire/data_models/features/discrete.py
@field_validator(\"values\")\n@classmethod\ndef validate_values_unique(cls, values):\n \"\"\"Validates that provided values are unique.\n\n Args:\n values (List[float]): List of values\n\n Raises:\n ValueError: when values are non-unique.\n ValueError: when values contains only one entry.\n ValueError: when values is empty.\n\n Returns:\n List[values]: Sorted list of values\n \"\"\"\n if len(values) != len(set(values)):\n raise ValueError(\"Discrete values must be unique\")\n if len(values) == 1:\n raise ValueError(\n \"Fixed discrete inputs are not supported. Please use a fixed continuous input.\"\n )\n if len(values) == 0:\n raise ValueError(\"No values defined.\")\n return sorted(values)\n
"},{"location":"ref-features/#bofire.data_models.features.feature","title":"feature
","text":""},{"location":"ref-features/#bofire.data_models.features.feature.Feature","title":" Feature (BaseModel)
","text":"The base class for all features.
Source code inbofire/data_models/features/feature.py
class Feature(BaseModel):\n \"\"\"The base class for all features.\"\"\"\n\n type: str\n key: str\n order_id: ClassVar[int] = -1\n\n def __lt__(self, other) -> bool:\n \"\"\"\n Method to compare two models to get them in the desired order.\n Return True if other is larger than self, else False. (see FEATURE_ORDER)\n\n Args:\n other: The other class to compare to self\n\n Returns:\n bool: True if the other class is larger than self, else False\n \"\"\"\n order_self = self.order_id\n order_other = other.order_id\n if order_self == order_other:\n return self.key < other.key\n else:\n return order_self < order_other\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Feature.__lt__","title":"__lt__(self, other)
special
","text":"Method to compare two models to get them in the desired order. Return True if other is larger than self, else False. (see FEATURE_ORDER)
Parameters:
Name Type Description Defaultother
The other class to compare to self
requiredReturns:
Type Descriptionbool
True if the other class is larger than self, else False
Source code inbofire/data_models/features/feature.py
def __lt__(self, other) -> bool:\n \"\"\"\n Method to compare two models to get them in the desired order.\n Return True if other is larger than self, else False. (see FEATURE_ORDER)\n\n Args:\n other: The other class to compare to self\n\n Returns:\n bool: True if the other class is larger than self, else False\n \"\"\"\n order_self = self.order_id\n order_other = other.order_id\n if order_self == order_other:\n return self.key < other.key\n else:\n return order_self < order_other\n
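The effect of `__lt__` is easiest to see with a standalone mock: features sort first by `order_id` and only then alphabetically by `key` (a sketch, not the real class):

```python
class MockFeature:
    def __init__(self, key, order_id):
        self.key, self.order_id = key, order_id

    def __lt__(self, other):
        if self.order_id == other.order_id:
            return self.key < other.key
        return self.order_id < other.order_id

feats = [MockFeature("b", 2), MockFeature("a", 2), MockFeature("z", 1)]
print([f.key for f in sorted(feats)])  # ['z', 'a', 'b']
```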
"},{"location":"ref-features/#bofire.data_models.features.feature.Input","title":" Input (Feature)
","text":"Base class for all input features.
Source code inbofire/data_models/features/feature.py
class Input(Feature):\n \"\"\"Base class for all input features.\"\"\"\n\n @staticmethod\n @abstractmethod\n def valid_transform_types() -> List[Union[CategoricalEncodingEnum, AnyMolFeatures]]:\n pass\n\n @abstractmethod\n def is_fixed(self) -> bool:\n \"\"\"Indicates if a variable is set to a fixed value.\n\n Returns:\n bool: True if fixed, else False.\n \"\"\"\n pass\n\n @abstractmethod\n def fixed_value(\n self, transform_type: Optional[TTransform] = None\n ) -> Union[None, List[str], List[float]]:\n \"\"\"Method to return the fixed value in case of a fixed feature.\n\n Returns:\n Union[None,str,float]: None in case the feature is not fixed, else the fixed value.\n \"\"\"\n pass\n\n @abstractmethod\n def validate_experimental(\n self, values: pd.Series, strict: bool = False\n ) -> pd.Series:\n \"\"\"Abstract method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.\n\n Returns:\n pd.Series: The passed dataFrame with experiments\n \"\"\"\n pass\n\n @abstractmethod\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Abstract method to validate the suggested candidates\n\n Args:\n values (pd.Series): A dataFrame with candidates\n\n Returns:\n pd.Series: The passed dataFrame with candidates\n \"\"\"\n pass\n\n @abstractmethod\n def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Sample a series of allowed values.\n\n Args:\n n (int): Number of samples\n\n Returns:\n pd.Series: Sampled values.\n \"\"\"\n pass\n\n @abstractmethod\n def get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[Union[float, str]] = None,\n ) -> Tuple[List[float], List[float]]:\n \"\"\"Returns the bounds of an input feature depending on the requested transform type.\n\n Args:\n transform_type (Optional[TTransform], optional): The requested transform type. Defaults to None.\n values (Optional[pd.Series], optional): If values are provided the bounds are returned taking\n the most extreme values for the feature into account. Defaults to None.\n reference_value (Optional[float], optional): If a reference value is provided, then the local bounds based\n on a local search region are provided. Currently only supported for continuous inputs. For more\n details, see https://www.merl.com/publications/docs/TR2023-057.pdf.\n Returns:\n Tuple[List[float], List[float]]: List of lower bound values, list of upper bound values.\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.fixed_value","title":"fixed_value(self, transform_type=None)
","text":"Method to return the fixed value in case of a fixed feature.
Returns:
Type DescriptionUnion[None,str,float]
None in case the feature is not fixed, else the fixed value.
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef fixed_value(\n self, transform_type: Optional[TTransform] = None\n) -> Union[None, List[str], List[float]]:\n \"\"\"Method to return the fixed value in case of a fixed feature.\n\n Returns:\n Union[None,str,float]: None in case the feature is not fixed, else the fixed value.\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.get_bounds","title":"get_bounds(self, transform_type=None, values=None, reference_value=None)
","text":"Returns the bounds of an input feature depending on the requested transform type.
Parameters:
Name Type Description Defaulttransform_type
Optional[TTransform]
The requested transform type. Defaults to None.
None
values
Optional[pd.Series]
If values are provided the bounds are returned taking the most extreme values for the feature into account. Defaults to None.
None
reference_value
Optional[float]
If a reference value is provided, then the local bounds based on a local search region are provided. Currently only supported for continuous inputs. For more details, see https://www.merl.com/publications/docs/TR2023-057.pdf.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
List of lower bound values, list of upper bound values.
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef get_bounds(\n self,\n transform_type: Optional[TTransform] = None,\n values: Optional[pd.Series] = None,\n reference_value: Optional[Union[float, str]] = None,\n) -> Tuple[List[float], List[float]]:\n \"\"\"Returns the bounds of an input feature depending on the requested transform type.\n\n Args:\n transform_type (Optional[TTransform], optional): The requested transform type. Defaults to None.\n values (Optional[pd.Series], optional): If values are provided the bounds are returned taking\n the most extreme values for the feature into account. Defaults to None.\n reference_value (Optional[float], optional): If a reference value is provided, then the local bounds based\n on a local search region are provided. Currently only supported for continuous inputs. For more\n details, see https://www.merl.com/publications/docs/TR2023-057.pdf.\n Returns:\n Tuple[List[float], List[float]]: List of lower bound values, list of upper bound values.\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.is_fixed","title":"is_fixed(self)
","text":"Indicates if a variable is set to a fixed value.
Returns:
Type Descriptionbool
True if fixed, else False.
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef is_fixed(self) -> bool:\n \"\"\"Indicates if a variable is set to a fixed value.\n\n Returns:\n bool: True if fixed, else False.\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.sample","title":"sample(self, n, seed=None)
","text":"Sample a series of allowed values.
Parameters:
Name Type Description Defaultn
int
Number of samples
requiredReturns:
Type Descriptionpd.Series
Sampled values.
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n \"\"\"Sample a series of allowed values.\n\n Args:\n n (int): Number of samples\n\n Returns:\n pd.Series: Sampled values.\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.validate_candidental","title":"validate_candidental(self, values)
","text":"Abstract method to validate the suggested candidates
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with candidates
requiredReturns:
Type Descriptionpd.Series
The passed dataFrame with candidates
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Abstract method to validate the suggested candidates\n\n Args:\n values (pd.Series): A dataFrame with candidates\n\n Returns:\n pd.Series: The passed dataFrame with candidates\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Input.validate_experimental","title":"validate_experimental(self, values, strict=False)
","text":"Abstract method to validate the experimental dataFrame
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with experiments
requiredstrict
bool
Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.
False
Returns:
Type Descriptionpd.Series
The passed dataFrame with experiments
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef validate_experimental(\n self, values: pd.Series, strict: bool = False\n) -> pd.Series:\n \"\"\"Abstract method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.\n\n Returns:\n pd.Series: The passed dataFrame with experiments\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Output","title":" Output (Feature)
","text":"Base class for all output features.
Attributes:
Name Type Descriptionkey(str)
Key of the Feature.
Source code inbofire/data_models/features/feature.py
class Output(Feature):\n \"\"\"Base class for all output features.\n\n Attributes:\n key(str): Key of the Feature.\n \"\"\"\n\n @abstractmethod\n def __call__(self, values: pd.Series) -> pd.Series:\n pass\n\n @abstractmethod\n def validate_experimental(self, values: pd.Series) -> pd.Series:\n \"\"\"Abstract method to validate the experimental Series\n\n Args:\n values (pd.Series): A dataFrame with values for the outcome\n\n Returns:\n pd.Series: The passed dataFrame with experiments\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.Output.validate_experimental","title":"validate_experimental(self, values)
","text":"Abstract method to validate the experimental Series
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with values for the outcome
requiredReturns:
Type Descriptionpd.Series
The passed dataFrame with experiments
Source code inbofire/data_models/features/feature.py
@abstractmethod\ndef validate_experimental(self, values: pd.Series) -> pd.Series:\n \"\"\"Abstract method to validate the experimental Series\n\n Args:\n values (pd.Series): A dataFrame with values for the outcome\n\n Returns:\n pd.Series: The passed dataFrame with experiments\n \"\"\"\n pass\n
"},{"location":"ref-features/#bofire.data_models.features.feature.get_encoded_name","title":"get_encoded_name(feature_key, option_name)
","text":"Get the name of the encoded column. Option could be the category or the descriptor name.
Source code inbofire/data_models/features/feature.py
def get_encoded_name(feature_key: str, option_name: str) -> str:\n \"\"\"Get the name of the encoded column. Option could be the category or the descriptor name.\"\"\"\n return f\"{feature_key}_{option_name}\"\n
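Since the helper is pure string formatting, its behavior is fully captured by a one-line example:

```python
def get_encoded_name(feature_key: str, option_name: str) -> str:
    return f"{feature_key}_{option_name}"

print(get_encoded_name("solvent", "water"))  # solvent_water
```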
"},{"location":"ref-features/#bofire.data_models.features.molecular","title":"molecular
","text":""},{"location":"ref-features/#bofire.data_models.features.molecular.CategoricalMolecularInput","title":" CategoricalMolecularInput (CategoricalInput, MolecularInput)
","text":"Source code in bofire/data_models/features/molecular.py
class CategoricalMolecularInput(CategoricalInput, MolecularInput):\n type: Literal[\"CategoricalMolecularInput\"] = \"CategoricalMolecularInput\"\n # order_id: ClassVar[int] = 7\n order_id: ClassVar[int] = 5\n\n @field_validator(\"categories\")\n @classmethod\n def validate_smiles(cls, categories: Sequence[str]):\n \"\"\"validates that categories are valid smiles. Note that this check can only\n be executed when rdkit is available.\n\n Args:\n categories (List[str]): List of smiles\n\n Raises:\n ValueError: when string is not a smiles\n\n Returns:\n List[str]: List of the smiles\n \"\"\"\n # check on rdkit availability:\n try:\n smiles2mol(categories[0])\n except NameError:\n warnings.warn(\"rdkit not installed, categories cannot be validated.\")\n return categories\n\n for cat in categories:\n smiles2mol(cat)\n return categories\n\n @staticmethod\n def valid_transform_types() -> List[Union[AnyMolFeatures, CategoricalEncodingEnum]]:\n return CategoricalInput.valid_transform_types() + [\n Fingerprints,\n FingerprintsFragments,\n Fragments,\n MordredDescriptors, # type: ignore\n ]\n\n def get_bounds(\n self,\n transform_type: Union[CategoricalEncodingEnum, AnyMolFeatures],\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n ) -> Tuple[List[float], List[float]]:\n if isinstance(transform_type, CategoricalEncodingEnum):\n # we are just using the standard categorical transformations\n return super().get_bounds(\n transform_type=transform_type,\n values=values,\n reference_value=reference_value,\n )\n else:\n # in case that values is None, we return the optimization bounds\n # else we return the complete bounds\n data = self.to_descriptor_encoding(\n transform_type=transform_type,\n values=(\n pd.Series(self.get_allowed_categories())\n if values is None\n else pd.Series(self.categories)\n ),\n )\n lower = data.min(axis=0).values.tolist()\n upper = data.max(axis=0).values.tolist()\n return lower, upper\n\n def from_descriptor_encoding(\n self, transform_type: AnyMolFeatures, values: pd.DataFrame\n ) -> pd.Series:\n \"\"\"Converts values back from descriptor encoding.\n\n Args:\n values (pd.DataFrame): Descriptor encoded dataframe.\n\n Raises:\n ValueError: If descriptor columns not found in the dataframe.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n\n # This method is modified based on the categorical descriptor feature\n # TODO: move it to more central place\n cat_cols = [\n get_encoded_name(self.key, d) for d in transform_type.get_descriptor_names()\n ]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = pd.DataFrame(\n data=np.sqrt(\n np.sum(\n (\n values[cat_cols].to_numpy()[:, np.newaxis, :]\n - self.to_descriptor_encoding(\n transform_type=transform_type,\n values=pd.Series(self.get_allowed_categories()),\n ).to_numpy()\n )\n ** 2,\n axis=2,\n )\n ),\n columns=self.get_allowed_categories(),\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.CategoricalMolecularInput.from_descriptor_encoding","title":"from_descriptor_encoding(self, transform_type, values)
","text":"Converts values back from descriptor encoding.
Parameters:
Name Type Description Defaultvalues
pd.DataFrame
Descriptor encoded dataframe.
requiredExceptions:
Type DescriptionValueError
If descriptor columns not found in the dataframe.
Returns:
Type Descriptionpd.Series
Series with categorical values.
Source code inbofire/data_models/features/molecular.py
def from_descriptor_encoding(\n self, transform_type: AnyMolFeatures, values: pd.DataFrame\n) -> pd.Series:\n \"\"\"Converts values back from descriptor encoding.\n\n Args:\n values (pd.DataFrame): Descriptor encoded dataframe.\n\n Raises:\n ValueError: If descriptor columns not found in the dataframe.\n\n Returns:\n pd.Series: Series with categorical values.\n \"\"\"\n\n # This method is modified based on the categorical descriptor feature\n # TODO: move it to more central place\n cat_cols = [\n get_encoded_name(self.key, d) for d in transform_type.get_descriptor_names()\n ]\n # we allow here explicitly that the dataframe can have more columns than needed to have it\n # easier in the backtransform.\n if np.any([c not in values.columns for c in cat_cols]):\n raise ValueError(\n f\"{self.key}: Column names don't match categorical levels: {values.columns}, {cat_cols}.\"\n )\n s = pd.DataFrame(\n data=np.sqrt(\n np.sum(\n (\n values[cat_cols].to_numpy()[:, np.newaxis, :]\n - self.to_descriptor_encoding(\n transform_type=transform_type,\n values=pd.Series(self.get_allowed_categories()),\n ).to_numpy()\n )\n ** 2,\n axis=2,\n )\n ),\n columns=self.get_allowed_categories(),\n index=values.index,\n ).idxmin(1)\n s.name = self.key\n return s\n
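The back-transform above is a nearest-neighbor lookup in descriptor space; the following standalone sketch (with made-up categories and descriptor vectors) shows the idea:

```python
import numpy as np
import pandas as pd

cats = ["a", "b"]                              # hypothetical categories
cat_desc = np.array([[0.0, 0.0], [1.0, 1.0]])  # their descriptor vectors
encoded = np.array([[0.1, -0.2], [0.9, 1.2]])  # encoded candidate rows

# Euclidean distance of each row to each category, then pick the closest
d = np.sqrt(((encoded[:, np.newaxis, :] - cat_desc) ** 2).sum(axis=2))
print(pd.DataFrame(d, columns=cats).idxmin(1).tolist())  # ['a', 'b']
```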
"},{"location":"ref-features/#bofire.data_models.features.molecular.CategoricalMolecularInput.get_bounds","title":"get_bounds(self, transform_type, values=None, reference_value=None)
","text":"Calculates the lower and upper bounds for the feature based on the given transform type and values.
Parameters:
Name Type Description Defaulttransform_type
AnyMolFeatures
The type of transformation to apply to the data.
requiredvalues
pd.Series
The actual data over which the lower and upper bounds are calculated.
None
reference_value
Optional[str]
The reference value for the transformation. Not used here. Defaults to None.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
A tuple containing the lower and upper bounds of the transformed data.
Exceptions:
Type DescriptionNotImplementedError
Raised when values
is None, as it is currently required for MolecularInput
.
bofire/data_models/features/molecular.py
def get_bounds(\n self,\n transform_type: Union[CategoricalEncodingEnum, AnyMolFeatures],\n values: Optional[pd.Series] = None,\n reference_value: Optional[str] = None,\n) -> Tuple[List[float], List[float]]:\n if isinstance(transform_type, CategoricalEncodingEnum):\n # we are just using the standard categorical transformations\n return super().get_bounds(\n transform_type=transform_type,\n values=values,\n reference_value=reference_value,\n )\n else:\n # in case that values is None, we return the optimization bounds\n # else we return the complete bounds\n data = self.to_descriptor_encoding(\n transform_type=transform_type,\n values=(\n pd.Series(self.get_allowed_categories())\n if values is None\n else pd.Series(self.categories)\n ),\n )\n lower = data.min(axis=0).values.tolist()\n upper = data.max(axis=0).values.tolist()\n return lower, upper\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.CategoricalMolecularInput.validate_smiles","title":"validate_smiles(categories)
classmethod
","text":"validates that categories are valid smiles. Note that this check can only be executed when rdkit is available.
Parameters:
Name Type Description Defaultcategories
List[str]
List of smiles
requiredExceptions:
Type DescriptionValueError
when string is not a smiles
Returns:
Type DescriptionList[str]
List of the smiles
Source code inbofire/data_models/features/molecular.py
@field_validator(\"categories\")\n@classmethod\ndef validate_smiles(cls, categories: Sequence[str]):\n \"\"\"validates that categories are valid smiles. Note that this check can only\n be executed when rdkit is available.\n\n Args:\n categories (List[str]): List of smiles\n\n Raises:\n ValueError: when string is not a smiles\n\n Returns:\n List[str]: List of the smiles\n \"\"\"\n # check on rdkit availability:\n try:\n smiles2mol(categories[0])\n except NameError:\n warnings.warn(\"rdkit not installed, categories cannot be validated.\")\n return categories\n\n for cat in categories:\n smiles2mol(cat)\n return categories\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput","title":" MolecularInput (Input)
","text":"Source code in bofire/data_models/features/molecular.py
class MolecularInput(Input):\n type: Literal[\"MolecularInput\"] = \"MolecularInput\"\n # order_id: ClassVar[int] = 6\n order_id: ClassVar[int] = 4\n\n @staticmethod\n def valid_transform_types() -> List[AnyMolFeatures]:\n return [Fingerprints, FingerprintsFragments, Fragments, MordredDescriptors] # type: ignore\n\n def validate_experimental(\n self, values: pd.Series, strict: bool = False\n ) -> pd.Series:\n values = values.map(str)\n for smi in values:\n smiles2mol(smi)\n\n return values\n\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n values = values.map(str)\n for smi in values:\n smiles2mol(smi)\n return values\n\n def is_fixed(self) -> bool:\n return False\n\n def fixed_value(self, transform_type: Optional[AnyMolFeatures] = None) -> None:\n return None\n\n def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n raise ValueError(\"Sampling not supported for `MolecularInput`\")\n\n def get_bounds(\n self,\n transform_type: AnyMolFeatures,\n values: pd.Series,\n reference_value: Optional[str] = None,\n ) -> Tuple[List[float], List[float]]:\n \"\"\"\n Calculates the lower and upper bounds for the feature based on the given transform type and values.\n\n Args:\n transform_type (AnyMolFeatures): The type of transformation to apply to the data.\n values (pd.Series): The actual data over which the lower and upper bounds are calculated.\n reference_value (Optional[str], optional): The reference value for the transformation. Not used here.\n Defaults to None.\n\n Returns:\n Tuple[List[float], List[float]]: A tuple containing the lower and upper bounds of the transformed data.\n\n Raises:\n NotImplementedError: Raised when `values` is None, as it is currently required for `MolecularInput`.\n \"\"\"\n if values is None:\n raise NotImplementedError(\n \"`values` is currently required for `MolecularInput`\"\n )\n else:\n data = self.to_descriptor_encoding(transform_type, values)\n\n lower = data.min(axis=0).values.tolist()\n upper = data.max(axis=0).values.tolist()\n\n return lower, upper\n\n def to_descriptor_encoding(\n self, transform_type: AnyMolFeatures, values: pd.Series\n ) -> pd.DataFrame:\n \"\"\"Converts values to descriptor encoding.\n\n Args:\n values (pd.Series): Values to transform.\n\n Returns:\n pd.DataFrame: Descriptor encoded dataframe.\n \"\"\"\n descriptor_values = transform_type.get_descriptor_values(values)\n\n descriptor_values.columns = [\n get_encoded_name(self.key, d) for d in transform_type.get_descriptor_names()\n ]\n descriptor_values.index = values.index\n\n return descriptor_values\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.fixed_value","title":"fixed_value(self, transform_type=None)
","text":"Method to return the fixed value in case of a fixed feature.
Returns:
Type DescriptionUnion[None,str,float]
None in case the feature is not fixed, else the fixed value.
Source code inbofire/data_models/features/molecular.py
def fixed_value(self, transform_type: Optional[AnyMolFeatures] = None) -> None:\n return None\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.get_bounds","title":"get_bounds(self, transform_type, values, reference_value=None)
","text":"Calculates the lower and upper bounds for the feature based on the given transform type and values.
Parameters:
Name Type Description Defaulttransform_type
AnyMolFeatures
The type of transformation to apply to the data.
requiredvalues
pd.Series
The actual data over which the lower and upper bounds are calculated.
requiredreference_value
Optional[str]
The reference value for the transformation. Not used here. Defaults to None.
None
Returns:
Type DescriptionTuple[List[float], List[float]]
A tuple containing the lower and upper bounds of the transformed data.
Exceptions:
Type DescriptionNotImplementedError
Raised when values
is None, as it is currently required for MolecularInput
.
bofire/data_models/features/molecular.py
def get_bounds(\n self,\n transform_type: AnyMolFeatures,\n values: pd.Series,\n reference_value: Optional[str] = None,\n) -> Tuple[List[float], List[float]]:\n \"\"\"\n Calculates the lower and upper bounds for the feature based on the given transform type and values.\n\n Args:\n transform_type (AnyMolFeatures): The type of transformation to apply to the data.\n values (pd.Series): The actual data over which the lower and upper bounds are calculated.\n reference_value (Optional[str], optional): The reference value for the transformation. Not used here.\n Defaults to None.\n\n Returns:\n Tuple[List[float], List[float]]: A tuple containing the lower and upper bounds of the transformed data.\n\n Raises:\n NotImplementedError: Raised when `values` is None, as it is currently required for `MolecularInput`.\n \"\"\"\n if values is None:\n raise NotImplementedError(\n \"`values` is currently required for `MolecularInput`\"\n )\n else:\n data = self.to_descriptor_encoding(transform_type, values)\n\n lower = data.min(axis=0).values.tolist()\n upper = data.max(axis=0).values.tolist()\n\n return lower, upper\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.is_fixed","title":"is_fixed(self)
","text":"Indicates if a variable is set to a fixed value.
Returns:
Type Descriptionbool
True if fixed, else False.
Source code inbofire/data_models/features/molecular.py
def is_fixed(self) -> bool:\n return False\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.sample","title":"sample(self, n, seed=None)
","text":"Sample a series of allowed values.
Parameters:
Name Type Description Defaultn
int
Number of samples
requiredReturns:
Type Descriptionpd.Series
Sampled values.
Source code inbofire/data_models/features/molecular.py
def sample(self, n: int, seed: Optional[int] = None) -> pd.Series:\n raise ValueError(\"Sampling not supported for `MolecularInput`\")\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.to_descriptor_encoding","title":"to_descriptor_encoding(self, transform_type, values)
","text":"Converts values to descriptor encoding.
Parameters:
Name Type Description Defaultvalues
pd.Series
Values to transform.
requiredReturns:
Type Descriptionpd.DataFrame
Descriptor encoded dataframe.
Source code inbofire/data_models/features/molecular.py
def to_descriptor_encoding(\n self, transform_type: AnyMolFeatures, values: pd.Series\n) -> pd.DataFrame:\n \"\"\"Converts values to descriptor encoding.\n\n Args:\n values (pd.Series): Values to transform.\n\n Returns:\n pd.DataFrame: Descriptor encoded dataframe.\n \"\"\"\n descriptor_values = transform_type.get_descriptor_values(values)\n\n descriptor_values.columns = [\n get_encoded_name(self.key, d) for d in transform_type.get_descriptor_names()\n ]\n descriptor_values.index = values.index\n\n return descriptor_values\n
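The resulting dataframe simply carries one column per descriptor, named via get_encoded_name; a sketch with placeholder descriptor names and values (all hypothetical):

```python
import pandas as pd

key = "molecule"
descriptor_names = ["fr_0", "fr_1"]          # hypothetical descriptor names
values = pd.Series(["CC(=O)O", "c1ccccc1"])  # SMILES inputs

encoded = pd.DataFrame(
    [[0.0, 1.0], [1.0, 0.0]],                # placeholder descriptor values
    columns=[f"{key}_{d}" for d in descriptor_names],
    index=values.index,
)
print(list(encoded.columns))  # ['molecule_fr_0', 'molecule_fr_1']
```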
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.validate_candidental","title":"validate_candidental(self, values)
","text":"Abstract method to validate the suggested candidates
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with candidates
requiredReturns:
Type Descriptionpd.Series
The passed dataFrame with candidates
Source code inbofire/data_models/features/molecular.py
def validate_candidental(self, values: pd.Series) -> pd.Series:\n values = values.map(str)\n for smi in values:\n smiles2mol(smi)\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.molecular.MolecularInput.validate_experimental","title":"validate_experimental(self, values, strict=False)
","text":"Abstract method to validate the experimental dataFrame
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with experiments
requiredstrict
bool
Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.
False
Returns:
Type Descriptionpd.Series
The passed dataFrame with experiments
Source code inbofire/data_models/features/molecular.py
def validate_experimental(\n self, values: pd.Series, strict: bool = False\n) -> pd.Series:\n values = values.map(str)\n for smi in values:\n smiles2mol(smi)\n\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.numerical","title":"numerical
","text":""},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput","title":" NumericalInput (Input)
","text":"Abstract base class for all numerical (ordinal) input features.
Source code inbofire/data_models/features/numerical.py
class NumericalInput(Input):\n \"\"\"Abstract base class for all numerical (ordinal) input features.\"\"\"\n\n unit: Optional[str] = None\n\n @staticmethod\n def valid_transform_types() -> List:\n return []\n\n def to_unit_range(\n self, values: Union[pd.Series, np.ndarray], use_real_bounds: bool = False\n ) -> Union[pd.Series, np.ndarray]:\n \"\"\"Convert to the unit range between 0 and 1.\n\n Args:\n values (pd.Series): values to be transformed\n use_real_bounds (bool, optional): if True, use the bounds from the actual values else the bounds from the feature.\n Defaults to False.\n\n Raises:\n ValueError: If lower_bound == upper_bound, an error is raised\n\n Returns:\n pd.Series: transformed values.\n \"\"\"\n if use_real_bounds:\n lower, upper = self.get_bounds(transform_type=None, values=values)\n lower = lower[0]\n upper = upper[0]\n else:\n lower, upper = self.lower_bound, self.upper_bound # type: ignore\n if lower == upper:\n raise ValueError(\"Fixed feature cannot be transformed to unit range.\")\n valrange = upper - lower\n return (values - lower) / valrange\n\n def from_unit_range(\n self, values: Union[pd.Series, np.ndarray]\n ) -> Union[pd.Series, np.ndarray]:\n \"\"\"Convert from unit range.\n\n Args:\n values (pd.Series): values to transform from.\n\n Raises:\n ValueError: if the feature is fixed raise a value error.\n\n Returns:\n pd.Series: values transformed back to the original scale.\n \"\"\"\n if self.is_fixed():\n raise ValueError(\"Fixed feature cannot be transformed from unit range.\")\n valrange = self.upper_bound - self.lower_bound # type: ignore\n return (values * valrange) + self.lower_bound # type: ignore\n\n def is_fixed(self):\n \"\"\"Method to check if the feature is fixed\n\n Returns:\n Boolean: True when the feature is fixed, false otherwise.\n \"\"\"\n return self.lower_bound == self.upper_bound # type: ignore\n\n def fixed_value(\n self, transform_type: Optional[TTransform] = None\n ) -> Union[None, List[float]]:\n \"\"\"Method to get the value to which the feature is fixed\n\n Returns:\n Float: Return the feature value or None if the feature is not fixed.\n \"\"\"\n assert transform_type is None\n if self.is_fixed():\n return [self.lower_bound] # type: ignore\n else:\n return None\n\n def validate_experimental(self, values: pd.Series, strict=False) -> pd.Series:\n \"\"\"Method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not.\n Defaults to False.\n\n Raises:\n ValueError: when a value is not numerical\n ValueError: when there is no variation in a feature provided by the experimental data\n\n Returns:\n pd.Series: A dataFrame with experiments\n \"\"\"\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n values = values.astype(\"float64\")\n if strict:\n lower, upper = self.get_bounds(transform_type=None, values=values)\n if lower == upper:\n raise ValueError(\n f\"No variation present or planned for feature {self.key}. Remove it.\"\n )\n return values\n\n def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Validate the suggested candidates for the feature.\n\n Args:\n values (pd.Series): suggested candidates for the feature\n\n Raises:\n ValueError: Error is raised when one of the values is not numerical.\n\n Returns:\n pd.Series: the original provided candidates\n \"\"\"\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.fixed_value","title":"fixed_value(self, transform_type=None)
","text":"Method to get the value to which the feature is fixed
Returns:
Type DescriptionFloat
Return the feature value or None if the feature is not fixed.
Source code inbofire/data_models/features/numerical.py
def fixed_value(\n self, transform_type: Optional[TTransform] = None\n) -> Union[None, List[float]]:\n \"\"\"Method to get the value to which the feature is fixed\n\n Returns:\n Float: Return the feature value or None if the feature is not fixed.\n \"\"\"\n assert transform_type is None\n if self.is_fixed():\n return [self.lower_bound] # type: ignore\n else:\n return None\n
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.from_unit_range","title":"from_unit_range(self, values)
","text":"Convert from unit range.
Parameters:
Name Type Description Defaultvalues
pd.Series
values to transform from.
requiredExceptions:
Type DescriptionValueError
if the feature is fixed raise a value error.
Returns:
Type Descriptionpd.Series
Values transformed back to the original scale.
Source code inbofire/data_models/features/numerical.py
def from_unit_range(\n self, values: Union[pd.Series, np.ndarray]\n) -> Union[pd.Series, np.ndarray]:\n \"\"\"Convert from unit range.\n\n Args:\n values (pd.Series): values to transform from.\n\n Raises:\n ValueError: if the feature is fixed raise a value error.\n\n Returns:\n pd.Series: values transformed back to the original scale.\n \"\"\"\n if self.is_fixed():\n raise ValueError(\"Fixed feature cannot be transformed from unit range.\")\n valrange = self.upper_bound - self.lower_bound # type: ignore\n return (values * valrange) + self.lower_bound # type: ignore\n
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.is_fixed","title":"is_fixed(self)
","text":"Method to check if the feature is fixed
Returns:
Type DescriptionBoolean
True when the feature is fixed, false otherwise.
Source code inbofire/data_models/features/numerical.py
def is_fixed(self):\n \"\"\"Method to check if the feature is fixed\n\n Returns:\n Boolean: True when the feature is fixed, false otherwise.\n \"\"\"\n return self.lower_bound == self.upper_bound # type: ignore\n
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.to_unit_range","title":"to_unit_range(self, values, use_real_bounds=False)
","text":"Convert to the unit range between 0 and 1.
Parameters:
Name Type Description Defaultvalues
pd.Series
values to be transformed
requireduse_real_bounds
bool
if True, use the bounds from the actual values else the bounds from the feature. Defaults to False.
False
Exceptions:
Type DescriptionValueError
If lower_bound == upper_bound, an error is raised
Returns:
Type Descriptionpd.Series
transformed values.
Source code inbofire/data_models/features/numerical.py
def to_unit_range(\n self, values: Union[pd.Series, np.ndarray], use_real_bounds: bool = False\n) -> Union[pd.Series, np.ndarray]:\n \"\"\"Convert to the unit range between 0 and 1.\n\n Args:\n values (pd.Series): values to be transformed\n use_real_bounds (bool, optional): if True, use the bounds from the actual values else the bounds from the feature.\n Defaults to False.\n\n Raises:\n ValueError: If lower_bound == upper_bound, an error is raised\n\n Returns:\n pd.Series: transformed values.\n \"\"\"\n if use_real_bounds:\n lower, upper = self.get_bounds(transform_type=None, values=values)\n lower = lower[0]\n upper = upper[0]\n else:\n lower, upper = self.lower_bound, self.upper_bound # type: ignore\n if lower == upper:\n raise ValueError(\"Fixed feature cannot be transformed to unit range.\")\n valrange = upper - lower\n return (values - lower) / valrange\n
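A quick round trip through the two transforms for hypothetical bounds [10, 20] (pure-numpy sketch of the formulas above):

```python
import numpy as np

lower, upper = 10.0, 20.0
x = np.array([10.0, 15.0, 20.0])

u = (x - lower) / (upper - lower)     # to_unit_range -> [0.0, 0.5, 1.0]
back = u * (upper - lower) + lower    # from_unit_range recovers the input
print(u, np.allclose(back, x))
```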
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.validate_candidental","title":"validate_candidental(self, values)
","text":"Validate the suggested candidates for the feature.
Parameters:
Name Type Description Defaultvalues
pd.Series
suggested candidates for the feature
requiredExceptions:
Type DescriptionValueError
Error is raised when one of the values is not numerical.
Returns:
Type Descriptionpd.Series
the original provided candidates
Source code inbofire/data_models/features/numerical.py
def validate_candidental(self, values: pd.Series) -> pd.Series:\n \"\"\"Validate the suggested candidates for the feature.\n\n Args:\n values (pd.Series): suggested candidates for the feature\n\n Raises:\n ValueError: Error is raised when one of the values is not numerical.\n\n Returns:\n pd.Series: the original provided candidates\n \"\"\"\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n return values\n
"},{"location":"ref-features/#bofire.data_models.features.numerical.NumericalInput.validate_experimental","title":"validate_experimental(self, values, strict=False)
","text":"Method to validate the experimental dataFrame
Parameters:
Name Type Description Defaultvalues
pd.Series
A dataFrame with experiments
requiredstrict
bool
Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not. Defaults to False.
False
Exceptions:
Type DescriptionValueError
when a value is not numerical
ValueError
when there is no variation in a feature provided by the experimental data
Returns:
Type Descriptionpd.Series
A dataFrame with experiments
Source code inbofire/data_models/features/numerical.py
def validate_experimental(self, values: pd.Series, strict=False) -> pd.Series:\n \"\"\"Method to validate the experimental dataFrame\n\n Args:\n values (pd.Series): A dataFrame with experiments\n strict (bool, optional): Boolean to distinguish if the occurrence of fixed features in the dataset should be considered or not.\n Defaults to False.\n\n Raises:\n ValueError: when a value is not numerical\n ValueError: when there is no variation in a feature provided by the experimental data\n\n Returns:\n pd.Series: A dataFrame with experiments\n \"\"\"\n try:\n values = pd.to_numeric(values, errors=\"raise\").astype(\"float64\")\n except ValueError:\n raise ValueError(\n f\"not all values of input feature `{self.key}` are numerical\"\n )\n values = values.astype(\"float64\")\n if strict:\n lower, upper = self.get_bounds(transform_type=None, values=values)\n if lower == upper:\n raise ValueError(\n f\"No variation present or planned for feature {self.key}. Remove it.\"\n )\n return values\n
"},{"location":"ref-objectives/","title":"Domain","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.categorical","title":"categorical
","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective","title":" ConstrainedCategoricalObjective (ConstrainedObjective, Objective)
","text":"Compute the categorical objective value as:
Po where P is an [n, c] matrix where each row is a probability vector\n(P[i, :].sum()=1 for all i) and o is a vector of size [c] of objective values\n
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
desirability
list
list of values of size c (c is number of categories) such that the i-th entry is in {True, False}
Source code inbofire/data_models/objectives/categorical.py
class ConstrainedCategoricalObjective(ConstrainedObjective, Objective):\n \"\"\"Compute the categorical objective value as:\n\n Po where P is an [n, c] matrix where each row is a probability vector\n (P[i, :].sum()=1 for all i) and o is a vector of size [c] of objective values\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n desirability (list): list of values of size c (c is number of categories) such that the i-th entry is in {True, False}\n \"\"\"\n\n w: TWeight = 1.0\n categories: CategoryVals\n desirability: List[bool]\n type: Literal[\"ConstrainedCategoricalObjective\"] = \"ConstrainedCategoricalObjective\"\n\n @model_validator(mode=\"after\")\n def validate_desireability(self):\n \"\"\"Validates that the number of desirabilities matches the number of categories.\n\n Raises:\n ValueError: when the number of desirabilities differs from the number of categories\n\n Returns:\n ConstrainedCategoricalObjective: The validated objective\n \"\"\"\n if len(self.desirability) != len(self.categories):\n raise ValueError(\n \"number of categories differs from number of desirabilities\"\n )\n return self\n\n def to_dict(self) -> Dict:\n \"\"\"Returns the categories and corresponding objective values as dictionary\"\"\"\n return dict(zip(self.categories, self.desirability))\n\n def to_dict_label(self) -> Dict:\n \"\"\"Returns the categories and label location of categories\"\"\"\n return {c: i for i, c in enumerate(self.categories)}\n\n def from_dict_label(self) -> Dict:\n \"\"\"Returns the label location and the categories\"\"\"\n d = self.to_dict_label()\n return dict(zip(d.values(), d.keys()))\n\n def __call__(\n self, x: Union[pd.Series, np.ndarray]\n ) -> Union[pd.Series, np.ndarray, float]:\n \"\"\"The call function returning a probabilistic reward for x.\n\n Args:\n x (np.ndarray): A matrix of x values\n\n Returns:\n np.ndarray: A reward calculated as inner product of probabilities and feasible objectives.\n \"\"\"\n return np.dot(x, np.array(self.desirability))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a probabilistic reward for x.
Parameters:
Name Type Description Defaultx
np.ndarray
A matrix of x values
requiredReturns:
Type Descriptionnp.ndarray
A reward calculated as inner product of probabilities and feasible objectives.
Source code inbofire/data_models/objectives/categorical.py
def __call__(\n self, x: Union[pd.Series, np.ndarray]\n) -> Union[pd.Series, np.ndarray, float]:\n \"\"\"The call function returning a probabilistic reward for x.\n\n Args:\n x (np.ndarray): A matrix of x values\n\n Returns:\n np.ndarray: A reward calculated as inner product of probabilities and feasible objectives.\n \"\"\"\n return np.dot(x, np.array(self.desirability))\n
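A worked example of the inner product above: P is row-stochastic, and each reward is the total probability mass on desirable categories (numbers are made up):

```python
import numpy as np

P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])             # each row sums to 1
desirability = np.array([True, False, True])

print(np.dot(P, desirability))  # [0.8 0.7]
```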
"},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective.from_dict_label","title":"from_dict_label(self)
","text":"Returns the label location and the categories
Source code inbofire/data_models/objectives/categorical.py
def from_dict_label(self) -> Dict:\n \"\"\"Returns the label location and the categories\"\"\"\n d = self.to_dict_label()\n return dict(zip(d.values(), d.keys()))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective.to_dict","title":"to_dict(self)
","text":"Returns the categories and corresponding objective values as dictionary
Source code inbofire/data_models/objectives/categorical.py
def to_dict(self) -> Dict:\n \"\"\"Returns the categories and corresponding objective values as dictionary\"\"\"\n return dict(zip(self.categories, self.desirability))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective.to_dict_label","title":"to_dict_label(self)
","text":"Returns the catergories and label location of categories
Source code inbofire/data_models/objectives/categorical.py
def to_dict_label(self) -> Dict:\n \"\"\"Returns the categories and label location of categories\"\"\"\n return {c: i for i, c in enumerate(self.categories)}\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.categorical.ConstrainedCategoricalObjective.validate_desireability","title":"validate_desireability(self)
","text":"validates that categories have unique names
Parameters:
Name Type Description Defaultcategories
List[str]
List or tuple of category names
requiredExceptions:
Type DescriptionValueError
when categories do not match objective categories
Returns:
Type DescriptionTuple[str]
Tuple of the categories
Source code inbofire/data_models/objectives/categorical.py
@model_validator(mode=\"after\")\ndef validate_desireability(self):\n \"\"\"Validates that the number of desirabilities matches the number of categories.\n\n Raises:\n ValueError: when the number of desirabilities differs from the number of categories\n\n Returns:\n ConstrainedCategoricalObjective: The validated objective\n \"\"\"\n if len(self.desirability) != len(self.categories):\n raise ValueError(\n \"number of categories differs from number of desirabilities\"\n )\n return self\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity","title":"identity
","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.identity.IdentityObjective","title":" IdentityObjective (Objective)
","text":"An objective returning the identity as reward. The return can be scaled, when a lower and upper bound are provided.
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective
bounds
Tuple[float]
Bound for normalizing the objective between zero and one. Defaults to (0,1).
Source code inbofire/data_models/objectives/identity.py
class IdentityObjective(Objective):\n \"\"\"An objective returning the identity as reward.\n The return can be scaled when lower and upper bounds are provided.\n\n Attributes:\n w (float): float between zero and one for weighting the objective\n bounds (Tuple[float], optional): Bound for normalizing the objective between zero and one. Defaults to (0,1).\n \"\"\"\n\n type: Literal[\"IdentityObjective\"] = \"IdentityObjective\"\n w: TWeight = 1\n bounds: Tuple[float, float] = (0, 1)\n\n @property\n def lower_bound(self) -> float:\n return self.bounds[0]\n\n @property\n def upper_bound(self) -> float:\n return self.bounds[1]\n\n @field_validator(\"bounds\")\n @classmethod\n def validate_lower_upper(cls, bounds):\n \"\"\"Validation function to ensure that the lower bound is always less than or equal to the upper bound\n\n Args:\n bounds (Tuple[float, float]): The bounds of the objective\n\n Raises:\n ValueError: when a lower bound higher than the upper bound is passed\n\n Returns:\n Tuple[float, float]: The validated bounds\n \"\"\"\n if bounds[0] > bounds[1]:\n raise ValueError(\n f\"lower bound must be <= upper bound, got {bounds[0]} > {bounds[1]}\"\n )\n return bounds\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The identity as reward, might be normalized to the passed lower and upper bounds\n \"\"\"\n return (x - self.lower_bound) / (self.upper_bound - self.lower_bound)\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity.IdentityObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a reward for passed x values
Parameters:
Name Type Description Defaultx
np.ndarray
An array of x values
requiredReturns:
Type Descriptionnp.ndarray
The identity as reward, might be normalized to the passed lower and upper bounds
Source code inbofire/data_models/objectives/identity.py
def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The identity as reward, might be normalized to the passed lower and upper bounds\n \"\"\"\n return (x - self.lower_bound) / (self.upper_bound - self.lower_bound)\n
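For example, with bounds (0, 10) the identity reward linearly rescales x to [0, 1] (a pure-numpy sketch of the formula above):

```python
import numpy as np

lower, upper = 0.0, 10.0
x = np.array([0.0, 2.5, 10.0])
print((x - lower) / (upper - lower))  # [0.   0.25 1.  ]
```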
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity.IdentityObjective.validate_lower_upper","title":"validate_lower_upper(bounds)
classmethod
","text":"Validation function to ensure that lower bound is always greater the upper bound
Parameters:
Name Type Description Defaultvalues
Dict
The attributes of the class
requiredExceptions:
Type DescriptionValueError
when a lower bound higher than the upper bound is passed
Returns:
Type DescriptionDict
The attributes of the class
Source code inbofire/data_models/objectives/identity.py
@field_validator(\"bounds\")\n@classmethod\ndef validate_lower_upper(cls, bounds):\n \"\"\"Validation function to ensure that lower bound is always greater the upper bound\n\n Args:\n values (Dict): The attributes of the class\n\n Raises:\n ValueError: when a lower bound higher than the upper bound is passed\n\n Returns:\n Dict: The attributes of the class\n \"\"\"\n if bounds[0] > bounds[1]:\n raise ValueError(\n f\"lower bound must be <= upper bound, got {bounds[0]} > {bounds[1]}\"\n )\n return bounds\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity.MaximizeObjective","title":" MaximizeObjective (IdentityObjective)
","text":"Child class from the identity function without modifications, since the parent class is already defined as maximization
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective
bounds
Tuple[float]
Bound for normalizing the objective between zero and one. Defaults to (0,1).
Source code inbofire/data_models/objectives/identity.py
class MaximizeObjective(IdentityObjective):\n \"\"\"Child class from the identity function without modifications, since the parent class is already defined as maximization\n\n Attributes:\n w (float): float between zero and one for weighting the objective\n bounds (Tuple[float], optional): Bound for normalizing the objective between zero and one. Defaults to (0,1).\n \"\"\"\n\n type: Literal[\"MaximizeObjective\"] = \"MaximizeObjective\"\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity.MinimizeObjective","title":" MinimizeObjective (IdentityObjective)
","text":"Class returning the negative identity as reward.
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective
bounds
Tuple[float]
Bound for normalizing the objective between zero and one. Defaults to (0,1).
Source code inbofire/data_models/objectives/identity.py
class MinimizeObjective(IdentityObjective):\n \"\"\"Class returning the negative identity as reward.\n\n Attributes:\n w (float): float between zero and one for weighting the objective\n bounds (Tuple[float], optional): Bound for normalizing the objective between zero and one. Defaults to (0,1).\n \"\"\"\n\n type: Literal[\"MinimizeObjective\"] = \"MinimizeObjective\"\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The negative identity as reward, might be normalized to the passed lower and upper bounds\n \"\"\"\n return -1.0 * (x - self.lower_bound) / (self.upper_bound - self.lower_bound)\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.identity.MinimizeObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a reward for passed x values
Parameters:
Name Type Description Defaultx
np.ndarray
An array of x values
requiredReturns:
Type Descriptionnp.ndarray
The negative identity as reward, might be normalized to the passed lower and upper bounds
Source code inbofire/data_models/objectives/identity.py
def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The negative identity as reward, might be normalized to the passed lower and upper bounds\n \"\"\"\n return -1.0 * (x - self.lower_bound) / (self.upper_bound - self.lower_bound)\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.objective","title":"objective
","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.objective.ConstrainedObjective","title":" ConstrainedObjective
","text":"This abstract class offers a convenience routine for transforming sigmoid based objectives to botorch output constraints.
Source code inbofire/data_models/objectives/objective.py
class ConstrainedObjective:\n \"\"\"This abstract class offers a convenience routine for transforming sigmoid based objectives to botorch output constraints.\"\"\"\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.objective.Objective","title":" Objective (BaseModel)
","text":"The base class for all objectives
Source code inbofire/data_models/objectives/objective.py
class Objective(BaseModel):\n \"\"\"The base class for all objectives\"\"\"\n\n type: str\n\n @abstractmethod\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"Abstract method to define the call function for the class Objective\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The desirability of the passed x values\n \"\"\"\n pass\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.objective.Objective.__call__","title":"__call__(self, x)
special
","text":"Abstract method to define the call function for the class Objective
Parameters:
Name Type Description Defaultx
np.ndarray
An array of x values
requiredReturns:
Type Descriptionnp.ndarray
The desirability of the passed x values
Source code inbofire/data_models/objectives/objective.py
@abstractmethod\ndef __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"Abstract method to define the call function for the class Objective\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: The desirability of the passed x values\n \"\"\"\n pass\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid","title":"sigmoid
","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid.MaximizeSigmoidObjective","title":" MaximizeSigmoidObjective (SigmoidObjective)
","text":"Class for a maximizing sigmoid objective
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
steepness
float
Steepness of the sigmoid function. Has to be greater than zero.
tp
float
Turning point of the sigmoid function.
Source code inbofire/data_models/objectives/sigmoid.py
class MaximizeSigmoidObjective(SigmoidObjective):\n \"\"\"Class for a maximizing sigmoid objective\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n steepness (float): Steepness of the sigmoid function. Has to be greater than zero.\n tp (float): Turning point of the sigmoid function.\n\n \"\"\"\n\n type: Literal[\"MaximizeSigmoidObjective\"] = \"MaximizeSigmoidObjective\"\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a sigmoid shaped reward for passed x values.\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: A reward calculated with a sigmoid function. The steepness and the turning point can be modified via passed arguments.\n \"\"\"\n return 1 / (1 + np.exp(-1 * self.steepness * (x - self.tp)))\n
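A minimal usage sketch (import path bofire.data_models.objectives.api assumed):
import numpy as np\nfrom bofire.data_models.objectives.api import MaximizeSigmoidObjective\n\n# reward is ~0 below tp and ~1 above tp; steepness controls the transition width\nobj = MaximizeSigmoidObjective(w=1.0, steepness=10, tp=0.5)\nprint(obj(np.array([0.0, 0.5, 1.0])))  # approx. [0.007 0.5 0.993]\n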
"},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid.MaximizeSigmoidObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a sigmoid shaped reward for passed x values.
Parameters:
Name Type Description Defaultx
np.ndarray
An array of x values
requiredReturns:
Type Descriptionnp.ndarray
A reward calculated with a sigmoid function. The steepness and the turning point can be modified via passed arguments.
Source code inbofire/data_models/objectives/sigmoid.py
def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a sigmoid shaped reward for passed x values.\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: A reward calculated with a sigmoid function. The steepness and the turning point can be modified via passed arguments.\n \"\"\"\n return 1 / (1 + np.exp(-1 * self.steepness * (x - self.tp)))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid.MinimizeSigmoidObjective","title":" MinimizeSigmoidObjective (SigmoidObjective)
","text":"Class for a minimizing a sigmoid objective
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
steepness
float
Steepness of the sigmoid function. Has to be greater than zero.
tp
float
Turning point of the sigmoid function.
Source code inbofire/data_models/objectives/sigmoid.py
class MinimizeSigmoidObjective(SigmoidObjective):\n \"\"\"Class for a minimizing sigmoid objective\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n steepness (float): Steepness of the sigmoid function. Has to be greater than zero.\n tp (float): Turning point of the sigmoid function.\n \"\"\"\n\n type: Literal[\"MinimizeSigmoidObjective\"] = \"MinimizeSigmoidObjective\"\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a sigmoid shaped reward for passed x values.\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: A reward calculated with a sigmoid function. The steepness and the turning point can be modified via passed arguments.\n \"\"\"\n return 1 - 1 / (1 + np.exp(-1 * self.steepness * (x - self.tp)))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid.MinimizeSigmoidObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a sigmoid shaped reward for passed x values.
Parameters:
Name Type Description Defaultx
np.ndarray
An array of x values
requiredReturns:
Type Descriptionnp.ndarray
A reward calculated with a sigmoid function. The steepness and the turning point can be modified via passed arguments.
Source code inbofire/data_models/objectives/sigmoid.py
def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a sigmoid shaped reward for passed x values.\n\n Args:\n x (np.ndarray): An array of x values\n\n Returns:\n np.ndarray: A reward calculated with a sigmoid function. The steepness and the turning point can be modified via passed arguments.\n \"\"\"\n return 1 - 1 / (1 + np.exp(-1 * self.steepness * (x - self.tp)))\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.sigmoid.SigmoidObjective","title":" SigmoidObjective (Objective, ConstrainedObjective)
","text":"Base class for all sigmoid shaped objectives
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
steepness
float
Steepness of the sigmoid function. Has to be greater than zero.
tp
float
Turning point of the sigmoid function.
Source code inbofire/data_models/objectives/sigmoid.py
class SigmoidObjective(Objective, ConstrainedObjective):\n \"\"\"Base class for all sigmoid shaped objectives\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n steepness (float): Steepness of the sigmoid function. Has to be greater than zero.\n tp (float): Turning point of the sigmoid function.\n \"\"\"\n\n steepness: TGt0\n tp: float\n w: TWeight = 1\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.target","title":"target
","text":""},{"location":"ref-objectives/#bofire.data_models.objectives.target.CloseToTargetObjective","title":" CloseToTargetObjective (Objective)
","text":"Optimize towards a target value. It can be used as objective in multiobjective scenarios.
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
target_value
float
target value that should be reached.
exponent
float
the exponent of the expression.
Source code inbofire/data_models/objectives/target.py
class CloseToTargetObjective(Objective):\n \"\"\"Optimize towards a target value. It can be used as objective\n in multiobjective scenarios.\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n target_value (float): target value that should be reached.\n exponent (float): the exponent of the expression.\n \"\"\"\n\n type: Literal[\"CloseToTargetObjective\"] = \"CloseToTargetObjective\"\n w: TWeight = 1\n target_value: float\n exponent: float\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n return -1 * (np.abs(x - self.target_value) ** self.exponent)\n
"},{"location":"ref-objectives/#bofire.data_models.objectives.target.TargetObjective","title":" TargetObjective (Objective, ConstrainedObjective)
","text":"Class for objectives for optimizing towards a target value
Attributes:
Name Type Descriptionw
float
float between zero and one for weighting the objective.
target_value
float
target value that should be reached.
tolerance
float
Tolerance for reaching the target. Has to be greater than zero.
steepness
float
Steepness of the sigmoid function. Has to be greater than zero.
Source code inbofire/data_models/objectives/target.py
class TargetObjective(Objective, ConstrainedObjective):\n \"\"\"Class for objectives for optimizing towards a target value\n\n Attributes:\n w (float): float between zero and one for weighting the objective.\n target_value (float): target value that should be reached.\n tolerance (float): Tolerance for reaching the target. Has to be greater than zero.\n steepness (float): Steepness of the sigmoid function. Has to be greater than zero.\n\n \"\"\"\n\n type: Literal[\"TargetObjective\"] = \"TargetObjective\"\n w: TWeight = 1\n target_value: float\n tolerance: TGe0\n steepness: TGt0\n\n def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values.\n\n Args:\n x (np.array): An array of x values\n\n Returns:\n np.array: An array of reward values calculated by the product of two sigmoidal shaped functions resulting in a maximum at the target value.\n \"\"\"\n return (\n 1\n / (\n 1\n + np.exp(\n -1 * self.steepness * (x - (self.target_value - self.tolerance))\n )\n )\n * (\n 1\n - 1\n / (\n 1.0\n + np.exp(\n -1 * self.steepness * (x - (self.target_value + self.tolerance))\n )\n )\n )\n )\n
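A minimal usage sketch (import path bofire.data_models.objectives.api assumed):
import numpy as np\nfrom bofire.data_models.objectives.api import TargetObjective\n\n# the product of the two sigmoids forms a reward plateau of ~1 inside\n# [target_value - tolerance, target_value + tolerance]\nobj = TargetObjective(target_value=0.5, tolerance=0.1, steepness=50)\nprint(obj(np.array([0.2, 0.5, 0.8])))  # approx. [0. 0.99 0.]\n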
"},{"location":"ref-objectives/#bofire.data_models.objectives.target.TargetObjective.__call__","title":"__call__(self, x)
special
","text":"The call function returning a reward for passed x values.
Parameters:
Name Type Description Defaultx
np.array
An array of x values
requiredReturns:
Type Descriptionnp.array
An array of reward values calculated by the product of two sigmoidal shaped functions resulting in a maximum at the target value.
Source code inbofire/data_models/objectives/target.py
def __call__(self, x: Union[pd.Series, np.ndarray]) -> Union[pd.Series, np.ndarray]:\n \"\"\"The call function returning a reward for passed x values.\n\n Args:\n x (np.array): An array of x values\n\n Returns:\n np.array: An array of reward values calculated by the product of two sigmoidal shaped functions resulting in a maximum at the target value.\n \"\"\"\n return (\n 1\n / (\n 1\n + np.exp(\n -1 * self.steepness * (x - (self.target_value - self.tolerance))\n )\n )\n * (\n 1\n - 1\n / (\n 1.0\n + np.exp(\n -1 * self.steepness * (x - (self.target_value + self.tolerance))\n )\n )\n )\n )\n
"},{"location":"ref-utils/","title":"Utils","text":""},{"location":"ref-utils/#bofire.utils.cheminformatics","title":"cheminformatics
","text":""},{"location":"ref-utils/#bofire.utils.cheminformatics.smiles2fingerprints","title":"smiles2fingerprints(smiles, bond_radius=5, n_bits=2048)
","text":"Transforms a list of smiles to an array of morgan fingerprints.
Parameters:
Name Type Description Defaultsmiles
List[str]
List of smiles
requiredbond_radius
int
Bond radius to use. Defaults to 5.
5
n_bits
int
Number of bits. Defaults to 2048.
2048
Returns:
Type Descriptionnp.ndarray
Numpy array holding the fingerprints
Source code inbofire/utils/cheminformatics.py
def smiles2fingerprints(\n smiles: List[str], bond_radius: int = 5, n_bits: int = 2048\n) -> np.ndarray:\n \"\"\"Transforms a list of smiles to an array of morgan fingerprints.\n\n Args:\n smiles (List[str]): List of smiles\n bond_radius (int, optional): Bond radius to use. Defaults to 5.\n n_bits (int, optional): Number of bits. Defaults to 2048.\n\n Returns:\n np.ndarray: Numpy array holding the fingerprints\n \"\"\"\n rdkit_mols = [smiles2mol(m) for m in smiles]\n fps = [\n AllChem.GetMorganFingerprintAsBitVect( # type: ignore\n mol, radius=bond_radius, nBits=n_bits\n )\n for mol in rdkit_mols\n ]\n\n return np.asarray(fps)\n
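A short usage sketch (assumes rdkit is installed; the example smiles are arbitrary):
from bofire.utils.cheminformatics import smiles2fingerprints\n\n# one 2048-bit Morgan fingerprint row per smiles string\nfps = smiles2fingerprints([\"CCO\", \"c1ccccc1\"], bond_radius=5, n_bits=2048)\nprint(fps.shape)  # (2, 2048)\n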
"},{"location":"ref-utils/#bofire.utils.cheminformatics.smiles2fragments","title":"smiles2fragments(smiles, fragments_list=None)
","text":"Transforms smiles to an array of fragments.
Parameters:
Name Type Description Defaultsmiles
List[str]
List of smiles
requiredfragments_list
Optional[List[str]]
List of fragment names to compute; if None, all rdkit fr_ fragments are used. Defaults to None.
None
Returns:
Type Descriptionnp.ndarray
Array holding the fragment information.
Source code inbofire/utils/cheminformatics.py
def smiles2fragments(\n smiles: List[str], fragments_list: Optional[List[str]] = None\n) -> np.ndarray:\n \"\"\"Transforms smiles to an array of fragments.\n\n Args:\n smiles (List[str]): List of smiles\n\n Returns:\n np.ndarray: Array holding the fragment information.\n \"\"\"\n rdkit_fragment_list = [\n item for item in Descriptors.descList if item[0].startswith(\"fr_\")\n ]\n if fragments_list is None:\n fragments = {d[0]: d[1] for d in rdkit_fragment_list}\n else:\n fragments = {d[0]: d[1] for d in rdkit_fragment_list if d[0] in fragments_list}\n\n frags = np.zeros((len(smiles), len(fragments)))\n for i, smi in enumerate(smiles):\n mol = smiles2mol(smi)\n features = [fragments[d](mol) for d in fragments]\n frags[i, :] = features\n\n return frags\n
"},{"location":"ref-utils/#bofire.utils.cheminformatics.smiles2mol","title":"smiles2mol(smiles)
","text":"Transforms a smiles string to an rdkit mol object.
Parameters:
Name Type Description Defaultsmiles
str
Smiles string.
requiredExceptions:
Type DescriptionValueError
If string is not a valid smiles.
Returns:
Type Descriptionrdkit.Mol
rdkit.mol object
Source code inbofire/utils/cheminformatics.py
def smiles2mol(smiles: str):\n \"\"\"Transforms a smiles string to an rdkit mol object.\n\n Args:\n smiles (str): Smiles string.\n\n Raises:\n ValueError: If string is not a valid smiles.\n\n Returns:\n rdkit.Mol: rdkit.mol object\n \"\"\"\n mol = MolFromSmiles(smiles)\n if mol is None:\n raise ValueError(f\"{smiles} is not a valid smiles string.\")\n return mol\n
"},{"location":"ref-utils/#bofire.utils.cheminformatics.smiles2mordred","title":"smiles2mordred(smiles, descriptors_list)
","text":"Transforms list of smiles to mordred moelcular descriptors.
Parameters:
Name Type Description Defaultsmiles
List[str]
List of smiles
requireddescriptors_list
List[str]
List of desired mordred descriptors
requiredReturns:
Type Descriptionnp.ndarray
Array holding the mordred molecular descriptors.
Source code inbofire/utils/cheminformatics.py
def smiles2mordred(smiles: List[str], descriptors_list: List[str]) -> np.ndarray:\n \"\"\"Transforms a list of smiles to mordred molecular descriptors.\n\n Args:\n smiles (List[str]): List of smiles\n descriptors_list (List[str]): List of desired mordred descriptors\n\n Returns:\n np.ndarray: Array holding the mordred molecular descriptors.\n \"\"\"\n mols = [smiles2mol(smi) for smi in smiles]\n\n calc = Calculator(descriptors, ignore_3D=True)\n calc.descriptors = [d for d in calc.descriptors if str(d) in descriptors_list]\n\n descriptors_df = calc.pandas(mols)\n nan_list = [\n pd.to_numeric(descriptors_df[col], errors=\"coerce\").isnull().values.any()\n for col in descriptors_df.columns\n ]\n if any(nan_list):\n raise ValueError(\n f\"Found NaN values in descriptors {list(descriptors_df.columns[nan_list])}\"\n )\n\n return descriptors_df.astype(float).values\n
"},{"location":"ref-utils/#bofire.utils.doe","title":"doe
","text":""},{"location":"ref-utils/#bofire.utils.doe.ff2n","title":"ff2n(n_factors)
","text":"Computes the full factorial design for a given number of factors.
Parameters:
Name Type Description Defaultn_factors
int
The number of factors.
requiredReturns:
Type Descriptionndarray
The full factorial design.
Source code inbofire/utils/doe.py
def ff2n(n_factors: int) -> np.ndarray:\n \"\"\"Computes the full factorial design for a given number of factors.\n\n Args:\n n_factors: The number of factors.\n\n Returns:\n The full factorial design.\n \"\"\"\n return np.array(list(itertools.product([-1, 1], repeat=n_factors)))\n
"},{"location":"ref-utils/#bofire.utils.doe.fracfact","title":"fracfact(gen)
","text":"Computes the fractional factorial design for a given generator.
Parameters:
Name Type Description Defaultgen
The generator.
requiredReturns:
Type Descriptionndarray
The fractional factorial design.
Source code inbofire/utils/doe.py
def fracfact(gen) -> np.ndarray:\n \"\"\"Computes the fractional factorial design for a given generator.\n\n Args:\n gen: The generator.\n\n Returns:\n The fractional factorial design.\n \"\"\"\n gen = validate_generator(n_factors=gen.count(\" \") + 1, generator=gen)\n\n generators = [item for item in re.split(r\"\\-|\\s|\\+\", gen) if item]\n lengthes = [len(i) for i in generators]\n\n # Indices of single letters (main factors)\n idx_main = [i for i, item in enumerate(lengthes) if item == 1]\n\n # Indices of letter combinations (generated factors).\n idx_combi = [i for i, item in enumerate(generators) if len(item) != 1]\n\n # Check if there are \"-\" operators in gen (empty strings are skipped)\n idx_negative = [\n i for i, item in enumerate(gen.split(\" \")) if item and item[0] == \"-\"\n ]\n\n # Fill in design with two level factorial design\n H1 = ff2n(len(idx_main))\n H = np.zeros((H1.shape[0], len(lengthes)))\n H[:, idx_main] = H1\n\n # Recognize combinations and fill in the rest of matrix H2 with the proper\n # products\n for k in idx_combi:\n # For lowercase letters\n xx = np.array([ord(c) for c in generators[k]]) - 97\n\n H[:, k] = np.prod(H1[:, xx], axis=1)\n\n # Update design if gen includes \"-\" operator\n if len(idx_negative) > 0:\n H[:, idx_negative] *= -1\n\n # Return the fractional factorial design\n return H\n
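A short usage sketch for the two design constructors above:
from bofire.utils.doe import ff2n, fracfact\n\nprint(ff2n(2))  # all four sign combinations of two factors\n# half fraction of a 2**3 design: the third factor is aliased with the ab interaction\nprint(fracfact(\"a b ab\"))\n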
"},{"location":"ref-utils/#bofire.utils.doe.get_alias_structure","title":"get_alias_structure(gen, order=4)
","text":"Computes the alias structure of the design matrix. Works only for generators with positive signs.
Parameters:
Name Type Description Defaultgen
str
The generator.
requiredorder
int
The order up to which the alias structure should be calculated. Defaults to 4.
4
Returns:
Type DescriptionList[str]
The alias structure of the design matrix.
Source code inbofire/utils/doe.py
def get_alias_structure(gen: str, order: int = 4) -> List[str]:\n \"\"\"Computes the alias structure of the design matrix. Works only for generators\n with positive signs.\n\n Args:\n gen: The generator.\n order: The order up to wich the alias structure should be calculated. Defaults to 4.\n\n Returns:\n The alias structure of the design matrix.\n \"\"\"\n design = fracfact(gen)\n\n n_experiments, n_factors = design.shape\n\n all_names = string.ascii_lowercase + \"I\"\n factors = range(n_factors)\n all_combinations = itertools.chain.from_iterable(\n (\n itertools.combinations(factors, n)\n for n in range(1, min(n_factors, order) + 1)\n )\n )\n aliases = {n_experiments * \"+\": [(26,)]} # 26 is mapped to I\n\n for combination in all_combinations:\n # positive sign\n contrast = np.prod(\n design[:, combination], axis=1\n ) # this is the product of the combination\n scontrast = \"\".join(np.where(contrast == 1, \"+\", \"-\").tolist())\n aliases[scontrast] = aliases.get(scontrast, [])\n aliases[scontrast].append(combination) # type: ignore\n\n aliases_list = []\n for alias in aliases.values():\n aliases_list.append(\n sorted(alias, key=lambda a: (len(a), a))\n ) # sort by length and then by the combination\n aliases_list = sorted(\n aliases_list, key=lambda list: ([len(a) for a in list], list)\n ) # sort by the length of the alias\n\n aliases_readable = []\n\n for alias in aliases_list:\n aliases_readable.append(\n \" = \".join([\"\".join([all_names[f] for f in a]) for a in alias])\n )\n\n return aliases_readable\n
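A short usage sketch:
from bofire.utils.doe import get_alias_structure\n\n# alias structure of the 2**(4-1) design with generator d = abc\nfor alias in get_alias_structure(\"a b c abc\"):\n    print(alias)  # e.g. 'a = bcd', 'ab = cd', ...\n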
"},{"location":"ref-utils/#bofire.utils.doe.get_confounding_matrix","title":"get_confounding_matrix(inputs, design, powers=None, interactions=None)
","text":"Analyzes the confounding of a design and returns the confounding matrix.
Only takes continuous features into account.
Parameters:
Name Type Description Defaultinputs
Inputs
Input features.
requireddesign
pd.DataFrame
Design matrix.
requiredpowers
List[int]
List of powers of the individual factors/features that should be considered. Integers have to be larger than 1. Defaults to None.
None
interactions
List[int]
List with interaction levels to be considered. Integers have to be larger than 1. Defaults to [2].
None
Returns:
Type Descriptionpd.DataFrame
The confounding matrix: pairwise correlations between the scaled main effects, powers, and interactions.
Source code inbofire/utils/doe.py
def get_confounding_matrix(\n inputs: Inputs,\n design: pd.DataFrame,\n powers: Optional[List[int]] = None,\n interactions: Optional[List[int]] = None,\n):\n \"\"\"Analyzes the confounding of a design and returns the confounding matrix.\n\n Only takes continuous features into account.\n\n Args:\n inputs (Inputs): Input features.\n design (pd.DataFrame): Design matrix.\n powers (List[int], optional): List of powers of the individual factors/features that should be considered.\n Integers has to be larger than 1. Defaults to [].\n interactions (List[int], optional): List with interaction levels to be considered.\n Integers has to be larger than 1. Defaults to [2].\n\n Returns:\n _type_: _description_\n \"\"\"\n from sklearn.preprocessing import MinMaxScaler\n\n if len(inputs.get(CategoricalInput)) > 0:\n warnings.warn(\"Categorical input features will be ignored.\")\n\n keys = inputs.get_keys(ContinuousInput)\n scaler = MinMaxScaler(feature_range=(-1, 1))\n scaled_design = pd.DataFrame(\n data=scaler.fit_transform(design[keys]),\n columns=keys,\n )\n\n # add powers\n if powers is not None:\n for p in powers:\n assert p > 1, \"Power has to be at least of degree two.\"\n for key in keys:\n scaled_design[f\"{key}**{p}\"] = scaled_design[key] ** p\n\n # add interactions\n if interactions is None:\n interactions = [2]\n\n for i in interactions:\n assert i > 1, \"Interaction has to be at least of degree two.\"\n assert i < len(keys) + 1, f\"Interaction has to be smaller than {len(keys)+1}.\"\n for combi in itertools.combinations(keys, i):\n scaled_design[\":\".join(combi)] = scaled_design[list(combi)].prod(axis=1)\n\n return scaled_design.corr()\n
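A usage sketch for a purely continuous design (the data_models import paths are assumed):
import pandas as pd\nfrom bofire.data_models.domain.api import Inputs\nfrom bofire.data_models.features.api import ContinuousInput\nfrom bofire.utils.doe import fracfact, get_confounding_matrix\n\ninputs = Inputs(features=[ContinuousInput(key=k, bounds=(-1, 1)) for k in [\"a\", \"b\", \"c\"]])\ndesign = pd.DataFrame(fracfact(\"a b ab\"), columns=[\"a\", \"b\", \"c\"])\n# correlations between main effects and two-factor interactions;\n# c and a:b correlate perfectly here, since c was generated as ab\nprint(get_confounding_matrix(inputs, design, interactions=[2]))\n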
"},{"location":"ref-utils/#bofire.utils.doe.get_generator","title":"get_generator(n_factors, n_generators)
","text":"Computes a generator for a given number of factors and generators.
Parameters:
Name Type Description Defaultn_factors
int
The number of factors.
requiredn_generators
int
The number of generators.
requiredReturns:
Type Descriptionstr
The generator.
Source code inbofire/utils/doe.py
def get_generator(n_factors: int, n_generators: int) -> str:\n \"\"\"Computes a generator for a given number of factors and generators.\n\n Args:\n n_factors: The number of factors.\n n_generators: The number of generators.\n\n Returns:\n The generator.\n \"\"\"\n if n_generators == 0:\n return \" \".join(list(string.ascii_lowercase[:n_factors]))\n n_base_factors = n_factors - n_generators\n if n_generators == 1:\n if n_base_factors == 1:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n return \" \".join(\n list(string.ascii_lowercase[:n_base_factors])\n + [string.ascii_lowercase[:n_base_factors]]\n )\n n_base_factors = n_factors - n_generators\n if n_base_factors - 1 < 2:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n generators = [\n \"\".join(i)\n for i in (\n itertools.combinations(\n string.ascii_lowercase[:n_base_factors], n_base_factors - 1\n )\n )\n ]\n if len(generators) > n_generators:\n generators = generators[:n_generators]\n elif (n_generators - len(generators) == 1) and (n_base_factors > 1):\n generators += [string.ascii_lowercase[:n_base_factors]]\n elif n_generators - len(generators) >= 1:\n raise ValueError(\n \"Design not possible, as main factors are confounded with each other.\"\n )\n return \" \".join(list(string.ascii_lowercase[:n_base_factors]) + generators)\n
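A short usage sketch:
from bofire.utils.doe import get_generator\n\n# 6 factors, 2 of which are generated from the 4 base factors\nprint(get_generator(n_factors=6, n_generators=2))  # 'a b c d abc abd'\n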
"},{"location":"ref-utils/#bofire.utils.doe.validate_generator","title":"validate_generator(n_factors, generator)
","text":"Validates the generator and thows an error if it is not valid.
Source code inbofire/utils/doe.py
def validate_generator(n_factors: int, generator: str) -> str:\n \"\"\"Validates the generator and throws an error if it is not valid.\"\"\"\n\n if len(generator.split(\" \")) != n_factors:\n raise ValueError(\"Generator does not match the number of factors.\")\n # clean it and transform it into a list\n generators = [item for item in re.split(r\"\\-|\\s|\\+\", generator) if item]\n lengthes = [len(i) for i in generators]\n\n # Indices of single letters (main factors)\n idx_main = [i for i, item in enumerate(lengthes) if item == 1]\n\n if len(idx_main) == 0:\n raise ValueError(\"At least one unconfounded main factor is needed.\")\n\n # Check that single letters (main factors) are unique\n if len(idx_main) != len({generators[i] for i in idx_main}):\n raise ValueError(\"Main factors are confounded with each other.\")\n\n # Check that single letters (main factors) follow the alphabet\n if (\n \"\".join(sorted([generators[i] for i in idx_main]))\n != string.ascii_lowercase[: len(idx_main)]\n ):\n raise ValueError(\n f'Use the letters `{\" \".join(string.ascii_lowercase[: len(idx_main)])}` for the main factors.'\n )\n\n # Indices of letter combinations (generated factors).\n idx_combi = [i for i, item in enumerate(generators) if len(item) != 1]\n\n # check that main factors come before combinations\n if idx_combi and min(idx_combi) < max(idx_main):\n raise ValueError(\"Main factors have to come before combinations.\")\n\n # Check that letter combinations are unique\n if len(idx_combi) != len({generators[i] for i in idx_combi}):\n raise ValueError(\"Generators are not unique.\")\n\n # Check that only letters are used in the combinations that are also single letters (main factors)\n if not all(\n set(item).issubset({generators[i] for i in idx_main})\n for item in [generators[i] for i in idx_combi]\n ):\n raise ValueError(\"Generators are not valid.\")\n\n return generator\n
"},{"location":"ref-utils/#bofire.utils.multiobjective","title":"multiobjective
","text":""},{"location":"ref-utils/#bofire.utils.multiobjective.get_ref_point_mask","title":"get_ref_point_mask(domain, output_feature_keys=None)
","text":"Method to get a mask for the reference points taking into account if we want to maximize or minimize an objective. In case it is maximize the value in the mask is 1, in case we want to minimize it is -1.
Parameters:
Name Type Description Defaultdomain
Domain
Domain for which the mask should be generated.
requiredoutput_feature_keys
Optional[list]
Name of output feature keys that should be considered in the mask. Defaults to None.
None
Returns:
Type Descriptionnp.ndarray
Mask with 1.0 for maximized and -1.0 for minimized (or close-to-target) output features.
Source code inbofire/utils/multiobjective.py
def get_ref_point_mask(\n domain: Domain, output_feature_keys: Optional[list] = None\n) -> np.ndarray:\n \"\"\"Method to get a mask for the reference points that takes into account\n whether an objective is maximized or minimized: the mask value is 1 for\n maximization and -1 for minimization.\n\n Args:\n domain (Domain): Domain for which the mask should be generated.\n output_feature_keys (Optional[list], optional): Name of output feature keys\n that should be considered in the mask. Defaults to None.\n\n Returns:\n np.ndarray: Mask with 1.0 for maximized and -1.0 for minimized (or close-to-target) output features.\n \"\"\"\n if output_feature_keys is None:\n output_feature_keys = domain.outputs.get_keys_by_objective(\n includes=[MaximizeObjective, MinimizeObjective, CloseToTargetObjective]\n )\n if len(output_feature_keys) < 2:\n raise ValueError(\"At least two output features have to be provided.\")\n mask = []\n for key in output_feature_keys:\n feat = domain.outputs.get_by_key(key)\n if isinstance(feat.objective, MaximizeObjective): # type: ignore\n mask.append(1.0)\n elif isinstance(feat.objective, MinimizeObjective): # type: ignore\n mask.append(-1.0)\n elif isinstance(feat.objective, CloseToTargetObjective): # type: ignore\n mask.append(-1.0)\n else:\n raise ValueError(\n \"Only `MaximizeObjective`, `MinimizeObjective` and `CloseToTargetObjective` are supported.\"\n )\n return np.array(mask)\n
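A usage sketch (assumes the Domain.from_lists convenience constructor and the data_models import paths):
from bofire.data_models.domain.api import Domain\nfrom bofire.data_models.features.api import ContinuousInput, ContinuousOutput\nfrom bofire.data_models.objectives.api import MaximizeObjective, MinimizeObjective\nfrom bofire.utils.multiobjective import get_ref_point_mask\n\ndomain = Domain.from_lists(\n    inputs=[ContinuousInput(key=\"x\", bounds=(0, 1))],\n    outputs=[\n        ContinuousOutput(key=\"y1\", objective=MaximizeObjective(w=1.0)),\n        ContinuousOutput(key=\"y2\", objective=MinimizeObjective(w=1.0)),\n    ],\n)\nprint(get_ref_point_mask(domain))  # [ 1. -1.]\n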
"},{"location":"ref-utils/#bofire.utils.naming_conventions","title":"naming_conventions
","text":""},{"location":"ref-utils/#bofire.utils.naming_conventions.get_column_names","title":"get_column_names(outputs)
","text":"Specifies column names for given Outputs type.
Parameters:
Name Type Description Defaultoutputs
Outputs
The Outputs object containing the individual outputs.
requiredReturns:
Type DescriptionTuple[List[str], List[str]]
A tuple containing the prediction column names and the standard deviation column names
Source code inbofire/utils/naming_conventions.py
def get_column_names(outputs: Outputs) -> Tuple[List[str], List[str]]:\n \"\"\"\n Specifies column names for given Outputs type.\n\n Args:\n outputs (Outputs): The Outputs object containing the individual outputs.\n\n Returns:\n Tuple[List[str], List[str]]: A tuple containing the prediction column names and the standard deviation column names\n \"\"\"\n pred_cols, sd_cols = [], []\n for featkey in outputs.get_keys(CategoricalOutput): # type: ignore\n pred_cols = pred_cols + [\n f\"{featkey}_{cat}_prob\"\n for cat in outputs.get_by_key(featkey).categories # type: ignore\n ]\n sd_cols = sd_cols + [\n f\"{featkey}_{cat}_sd\"\n for cat in outputs.get_by_key(featkey).categories # type: ignore\n ]\n for featkey in outputs.get_keys(ContinuousOutput): # type: ignore\n pred_cols = pred_cols + [f\"{featkey}_pred\"]\n sd_cols = sd_cols + [f\"{featkey}_sd\"]\n\n return pred_cols, sd_cols\n
"},{"location":"ref-utils/#bofire.utils.naming_conventions.postprocess_categorical_predictions","title":"postprocess_categorical_predictions(predictions, outputs)
","text":"Postprocess categorical predictions by finding the maximum probability location
Parameters:
Name Type Description Defaultpredictions
pd.DataFrame
The dataframe containing the predictions.
requiredoutputs
Outputs
The Outputs object containing the individual outputs.
requiredReturns:
Type Descriptionpredictions (pd.DataFrame)
The (potentially modified) original dataframe with categorical predictions added
Source code inbofire/utils/naming_conventions.py
def postprocess_categorical_predictions(predictions: pd.DataFrame, outputs: Outputs) -> pd.DataFrame: # type: ignore\n \"\"\"\n Postprocess categorical predictions by finding the maximum probability location\n\n Args:\n predictions (pd.DataFrame): The dataframe containing the predictions.\n outputs (Outputs): The Outputs object containing the individual outputs.\n\n Returns:\n predictions (pd.DataFrame): The (potentially modified) original dataframe with categorical predictions added\n \"\"\"\n for feat in outputs.get():\n if isinstance(feat, CategoricalOutput): # type: ignore\n predictions.insert(\n loc=0,\n column=f\"{feat.key}_pred\",\n value=predictions.filter(regex=f\"{feat.key}(.*)_prob\")\n .idxmax(1)\n .str.replace(f\"{feat.key}_\", \"\")\n .str.replace(\"_prob\", \"\")\n .values,\n )\n predictions.insert(\n loc=1,\n column=f\"{feat.key}_sd\",\n value=0.0,\n )\n return predictions\n
"},{"location":"ref-utils/#bofire.utils.reduce","title":"reduce
","text":""},{"location":"ref-utils/#bofire.utils.reduce.AffineTransform","title":" AffineTransform
","text":"Class to switch back and forth from the reduced to the original domain.
Source code inbofire/utils/reduce.py
class AffineTransform:\n \"\"\"Class to switch back and forth from the reduced to the original domain.\"\"\"\n\n def __init__(self, equalities: List[Tuple[str, List[str], List[float]]]):\n \"\"\"Initializes a `AffineTransformation` object.\n\n Args:\n equalities (List[Tuple[str,List[str],List[float]]]): List of equalities. Every equality\n is defined as a tuple, in which the first entry is the key of the reduced feature, the second\n one is a list of feature keys that can be used to compute the feature and the third list of floats\n are the corresponding coefficients.\n \"\"\"\n self.equalities = equalities\n\n def augment_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Restore the eliminated features in a dataframe\n\n Args:\n data (pd.DataFrame): Dataframe that should be restored.\n\n Returns:\n pd.DataFrame: Restored dataframe\n \"\"\"\n if len(self.equalities) == 0:\n return data\n data = data.copy()\n for name_lhs, names_rhs, coeffs in self.equalities:\n data[name_lhs] = coeffs[-1]\n for i, name in enumerate(names_rhs):\n data[name_lhs] += coeffs[i] * data[name]\n return data\n\n def drop_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Drop eliminated features from a dataframe.\n\n Args:\n data (pd.DataFrame): Dataframe with features to be dropped.\n\n Returns:\n pd.DataFrame: Reduced dataframe.\n \"\"\"\n if len(self.equalities) == 0:\n return data\n drop = []\n for name_lhs, _, _ in self.equalities:\n if name_lhs in data.columns:\n drop.append(name_lhs)\n return data.drop(columns=drop)\n
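A minimal usage sketch, mirroring a domain where x3 was eliminated via the equality x3 = 1 - x1 - x2:
import pandas as pd\nfrom bofire.utils.reduce import AffineTransform\n\ntrafo = AffineTransform(equalities=[(\"x3\", [\"x1\", \"x2\"], [-1.0, -1.0, 1.0])])\ndf = pd.DataFrame({\"x1\": [0.2], \"x2\": [0.3]})\naugmented = trafo.augment_data(df)  # adds the column x3 = 1.0 - 0.2 - 0.3 = 0.5\nprint(trafo.drop_data(augmented))  # removes x3 again\n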
"},{"location":"ref-utils/#bofire.utils.reduce.AffineTransform.__init__","title":"__init__(self, equalities)
special
","text":"Initializes a AffineTransformation
object.
Parameters:
Name Type Description Defaultequalities
List[Tuple[str,List[str],List[float]]]
List of equalities. Every equality is defined as a tuple, in which the first entry is the key of the reduced feature, the second entry is a list of feature keys that can be used to compute the feature, and the third entry is a list of floats with the corresponding coefficients.
required Source code inbofire/utils/reduce.py
def __init__(self, equalities: List[Tuple[str, List[str], List[float]]]):\n \"\"\"Initializes a `AffineTransformation` object.\n\n Args:\n equalities (List[Tuple[str,List[str],List[float]]]): List of equalities. Every equality\n is defined as a tuple, in which the first entry is the key of the reduced feature, the second\n one is a list of feature keys that can be used to compute the feature and the third list of floats\n are the corresponding coefficients.\n \"\"\"\n self.equalities = equalities\n
"},{"location":"ref-utils/#bofire.utils.reduce.AffineTransform.augment_data","title":"augment_data(self, data)
","text":"Restore the eliminated features in a dataframe
Parameters:
Name Type Description Defaultdata
pd.DataFrame
Dataframe that should be restored.
requiredReturns:
Type Descriptionpd.DataFrame
Restored dataframe
Source code inbofire/utils/reduce.py
def augment_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Restore the eliminated features in a dataframe\n\n Args:\n data (pd.DataFrame): Dataframe that should be restored.\n\n Returns:\n pd.DataFrame: Restored dataframe\n \"\"\"\n if len(self.equalities) == 0:\n return data\n data = data.copy()\n for name_lhs, names_rhs, coeffs in self.equalities:\n data[name_lhs] = coeffs[-1]\n for i, name in enumerate(names_rhs):\n data[name_lhs] += coeffs[i] * data[name]\n return data\n
"},{"location":"ref-utils/#bofire.utils.reduce.AffineTransform.drop_data","title":"drop_data(self, data)
","text":"Drop eliminated features from a dataframe.
Parameters:
Name Type Description Defaultdata
pd.DataFrame
Dataframe with features to be dropped.
requiredReturns:
Type Descriptionpd.DataFrame
Reduced dataframe.
Source code inbofire/utils/reduce.py
def drop_data(self, data: pd.DataFrame) -> pd.DataFrame:\n \"\"\"Drop eliminated features from a dataframe.\n\n Args:\n data (pd.DataFrame): Dataframe with features to be dropped.\n\n Returns:\n pd.DataFrame: Reduced dataframe.\n \"\"\"\n if len(self.equalities) == 0:\n return data\n drop = []\n for name_lhs, _, _ in self.equalities:\n if name_lhs in data.columns:\n drop.append(name_lhs)\n return data.drop(columns=drop)\n
"},{"location":"ref-utils/#bofire.utils.reduce.adjust_boundary","title":"adjust_boundary(feature, coef, rhs)
","text":"Adjusts the boundaries of a feature.
Parameters:
Name Type Description Defaultfeature
ContinuousInput
Feature to be adjusted.
requiredcoef
float
Coefficient.
requiredrhs
float
Right-hand-side of the constraint.
required Source code inbofire/utils/reduce.py
def adjust_boundary(feature: ContinuousInput, coef: float, rhs: float):\n \"\"\"Adjusts the boundaries of a feature.\n\n Args:\n feature (ContinuousInput): Feature to be adjusted.\n coef (float): Coefficient.\n rhs (float): Right-hand-side of the constraint.\n \"\"\"\n boundary = rhs / coef\n if coef > 0:\n if boundary > feature.lower_bound:\n feature.bounds = (boundary, feature.upper_bound)\n else:\n if boundary < feature.upper_bound:\n feature.bounds = (feature.lower_bound, boundary)\n
"},{"location":"ref-utils/#bofire.utils.reduce.check_domain_for_reduction","title":"check_domain_for_reduction(domain)
","text":"Check if the reduction can be applied or if a trivial case is present.
Parameters:
Name Type Description Defaultdomain
Domain
Domain to be checked.
requiredReturns:
Type Descriptionbool
True if reducible, else False.
Source code inbofire/utils/reduce.py
def check_domain_for_reduction(domain: Domain) -> bool:\n \"\"\"Check if the reduction can be applied or if a trivial case is present.\n\n Args:\n domain (Domain): Domain to be checked.\n\n Returns:\n bool: True if reducible, else False.\n \"\"\"\n # are there any constraints?\n if len(domain.constraints) == 0:\n return False\n\n # are there any linear equality constraints?\n linear_equalities = domain.constraints.get(LinearEqualityConstraint)\n if len(linear_equalities) == 0:\n return False\n\n # are there no NChooseKConstraint constraints?\n if len(domain.constraints.get([NChooseKConstraint])) > 0:\n return False\n\n # are there continuous inputs\n continuous_inputs = domain.inputs.get(ContinuousInput)\n if len(continuous_inputs) == 0:\n return False\n\n # check that equality constraints only contain continuous inputs\n for c in linear_equalities:\n assert isinstance(c, LinearConstraint)\n for feat in c.features:\n if feat not in domain.inputs.get_keys(ContinuousInput):\n return False\n return True\n
"},{"location":"ref-utils/#bofire.utils.reduce.check_existence_of_solution","title":"check_existence_of_solution(A_aug)
","text":"Given an augmented coefficient matrix this function determines the existence (and uniqueness) of solution using the rank theorem.
Source code inbofire/utils/reduce.py
def check_existence_of_solution(A_aug):\n \"\"\"Given an augmented coefficient matrix this function determines the existence (and uniqueness) of solution using the rank theorem.\"\"\"\n A = A_aug[:, :-1]\n b = A_aug[:, -1]\n len_inputs = np.shape(A)[1]\n\n # catch special cases\n rk_A_aug = np.linalg.matrix_rank(A_aug)\n rk_A = np.linalg.matrix_rank(A)\n\n if rk_A == rk_A_aug:\n if rk_A < len_inputs:\n return # all good\n else:\n x = np.linalg.solve(A, b)\n raise Exception(\n f\"There is a unique solution x for the linear equality constraints: x={x}\"\n )\n elif rk_A < rk_A_aug:\n raise Exception(\n \"There is no solution fulfilling the linear equality constraints.\"\n )\n
"},{"location":"ref-utils/#bofire.utils.reduce.reduce_domain","title":"reduce_domain(domain)
","text":"Reduce a domain with linear equality constraints to a subdomain where linear equality constraints are eliminated.
Parameters:
Name Type Description Defaultdomain
Domain
Domain to be reduced.
requiredReturns:
Type DescriptionTuple[Domain, AffineTransform]
The reduced domain and the corresponding transformation to switch between the reduced and original domain.
Source code inbofire/utils/reduce.py
def reduce_domain(domain: Domain) -> Tuple[Domain, AffineTransform]:\n \"\"\"Reduce a domain with linear equality constraints to a subdomain where linear equality constraints are eliminated.\n\n Args:\n domain (Domain): Domain to be reduced.\n\n Returns:\n Tuple[Domain, AffineTransform]: reduced domain and the according transformation to switch between the\n reduced and orginal domain.\n \"\"\"\n # check if the domain can be reduced\n if not check_domain_for_reduction(domain):\n return domain, AffineTransform([])\n\n # find linear equality constraints\n linear_equalities = domain.constraints.get(LinearEqualityConstraint)\n other_constraints = domain.constraints.get(\n Constraint, excludes=[LinearEqualityConstraint]\n )\n\n # only consider continuous inputs\n continuous_inputs = [\n cast(ContinuousInput, f) for f in domain.inputs.get(ContinuousInput)\n ]\n other_inputs = domain.inputs.get(Input, excludes=[ContinuousInput])\n\n # assemble Matrix A from equality constraints\n N = len(linear_equalities)\n M = len(continuous_inputs) + 1\n names = np.concatenate(([feat.key for feat in continuous_inputs], [\"rhs\"]))\n\n A_aug = pd.DataFrame(data=np.zeros(shape=(N, M)), columns=names)\n\n for i in range(len(linear_equalities)):\n c = linear_equalities[i]\n assert isinstance(c, LinearEqualityConstraint)\n A_aug.loc[i, c.features] = c.coefficients # type: ignore\n A_aug.loc[i, \"rhs\"] = c.rhs\n A_aug = A_aug.values\n\n # catch special cases\n check_existence_of_solution(A_aug)\n\n # bring A_aug to reduced row-echelon form\n A_aug_rref, pivots = rref(A_aug)\n pivots = np.array(pivots)\n A_aug_rref = np.array(A_aug_rref).astype(np.float64)\n\n # formulate box bounds as linear inequality constraints in matrix form\n B = np.zeros(shape=(2 * (M - 1), M))\n B[: M - 1, : M - 1] = np.eye(M - 1)\n B[M - 1 :, : M - 1] = -np.eye(M - 1)\n\n B[: M - 1, -1] = np.array([feat.upper_bound for feat in continuous_inputs])\n B[M - 1 :, -1] = -1.0 * np.array([feat.lower_bound for feat in continuous_inputs])\n\n # eliminate columns with pivot element\n for i in range(len(pivots)):\n p = pivots[i]\n B[p, :] -= A_aug_rref[i, :]\n B[p + M - 1, :] += A_aug_rref[i, :]\n\n # build up reduced domain\n _domain = Domain.model_construct(\n # _fields_set = {\"inputs\", \"outputs\", \"constraints\"}\n inputs=deepcopy(other_inputs),\n outputs=deepcopy(domain.outputs),\n constraints=deepcopy(other_constraints),\n )\n new_inputs = [\n deepcopy(feat) for i, feat in enumerate(continuous_inputs) if i not in pivots\n ]\n all_inputs = _domain.inputs + new_inputs\n assert isinstance(all_inputs, Inputs)\n _domain.inputs.features = all_inputs.features\n\n constraints: List[AnyConstraint] = []\n for i in pivots:\n # reduce equation system of upper bounds\n ind = np.where(B[i, :-1] != 0)[0]\n if len(ind) > 0 and B[i, -1] < np.inf:\n if len(list(names[ind])) > 1:\n c = LinearInequalityConstraint.from_greater_equal(\n features=list(names[ind]),\n coefficients=(-1.0 * B[i, ind]).tolist(),\n rhs=B[i, -1] * -1.0,\n )\n constraints.append(c)\n else:\n key = names[ind][0]\n feat = cast(ContinuousInput, _domain.inputs.get_by_key(key))\n adjust_boundary(feat, (-1.0 * B[i, ind])[0], B[i, -1] * -1.0)\n else:\n if B[i, -1] < -1e-16:\n raise Exception(\"There is no solution that fulfills the constraints.\")\n\n # reduce equation system of lower bounds\n ind = np.where(B[i + M - 1, :-1] != 0)[0]\n if len(ind) > 0 and B[i + M - 1, -1] < np.inf:\n if len(list(names[ind])) > 1:\n c = LinearInequalityConstraint.from_greater_equal(\n features=list(names[ind]),\n 
coefficients=(-1.0 * B[i + M - 1, ind]).tolist(),\n rhs=B[i + M - 1, -1] * -1.0,\n )\n constraints.append(c)\n else:\n key = names[ind][0]\n feat = cast(ContinuousInput, _domain.inputs.get_by_key(key))\n adjust_boundary(\n feat,\n (-1.0 * B[i + M - 1, ind])[0],\n B[i + M - 1, -1] * -1.0,\n )\n else:\n if B[i + M - 1, -1] < -1e-16:\n raise Exception(\"There is no solution that fulfills the constraints.\")\n\n if len(constraints) > 0:\n _domain.constraints.constraints = _domain.constraints.constraints + constraints # type: ignore\n\n # assemble equalities\n _equalities = []\n for i in range(len(pivots)):\n name_lhs = names[pivots[i]]\n names_rhs = []\n coeffs = []\n\n for j in range(len(names) - 1):\n if A_aug_rref[i, j] != 0 and j != pivots[i]:\n coeffs.append(-A_aug_rref[i, j])\n names_rhs.append(names[j])\n\n coeffs.append(A_aug_rref[i, -1])\n\n _equalities.append((name_lhs, names_rhs, coeffs))\n\n trafo = AffineTransform(_equalities)\n # remove remaining dependencies of eliminated inputs from the problem\n _domain = remove_eliminated_inputs(_domain, trafo)\n return _domain, trafo\n
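A usage sketch (assumes the Domain.from_lists convenience constructor and the data_models import paths):
from bofire.data_models.constraints.api import LinearEqualityConstraint\nfrom bofire.data_models.domain.api import Domain\nfrom bofire.data_models.features.api import ContinuousInput, ContinuousOutput\nfrom bofire.utils.reduce import reduce_domain\n\ndomain = Domain.from_lists(\n    inputs=[ContinuousInput(key=f\"x{i}\", bounds=(0, 1)) for i in (1, 2, 3)],\n    outputs=[ContinuousOutput(key=\"y\")],\n    constraints=[\n        LinearEqualityConstraint(\n            features=[\"x1\", \"x2\", \"x3\"], coefficients=[1.0, 1.0, 1.0], rhs=1.0\n        )\n    ],\n)\nreduced, trafo = reduce_domain(domain)\nprint(reduced.inputs.get_keys())  # x1 is eliminated, e.g. ['x2', 'x3']\n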
"},{"location":"ref-utils/#bofire.utils.reduce.remove_eliminated_inputs","title":"remove_eliminated_inputs(domain, transform)
","text":"Eliminates remaining occurences of eliminated inputs in linear constraints.
Parameters:
Name Type Description Defaultdomain
Domain
Domain in which the linear constraints should be purged.
requiredtransform
AffineTransform
Affine transformation object that defines the obsolete features.
requiredExceptions:
Type DescriptionValueError
If feature occurs in a constraint different from a linear one.
Returns:
Type DescriptionDomain
Purged domain.
Source code inbofire/utils/reduce.py
def remove_eliminated_inputs(domain: Domain, transform: AffineTransform) -> Domain:\n \"\"\"Eliminates remaining occurences of eliminated inputs in linear constraints.\n\n Args:\n domain (Domain): Domain in which the linear constraints should be purged.\n transform (AffineTransform): Affine transformation object that defines the obsolete features.\n\n Raises:\n ValueError: If feature occurs in a constraint different from a linear one.\n\n Returns:\n Domain: Purged domain.\n \"\"\"\n inputs_names = domain.inputs.get_keys()\n M = len(inputs_names)\n\n # write the equalities for the backtransformation into one matrix\n inputs_dict = {inputs_names[i]: i for i in range(M)}\n\n # build up dict from domain.equalities e.g. {\"xi1\": [coeff(xj1), ..., coeff(xjn)], ... \"xik\":...}\n coeffs_dict = {}\n for e in transform.equalities:\n coeffs = np.zeros(M + 1)\n for j, name in enumerate(e[1]):\n coeffs[inputs_dict[name]] = e[2][j]\n coeffs[-1] = e[2][-1]\n coeffs_dict[e[0]] = coeffs\n\n constraints = []\n for c in domain.constraints.get():\n # Nonlinear constraints not supported\n if not isinstance(c, LinearConstraint):\n raise ValueError(\n \"Elimination of variables is only supported for LinearEquality and LinearInequality constraints.\"\n )\n\n # no changes, if the constraint does not contain eliminated inputs\n elif all(name in inputs_names for name in c.features):\n constraints.append(c)\n\n # remove inputs from the constraint that were eliminated from the inputs before\n else:\n totally_removed = False\n _features = np.array(inputs_names)\n _rhs = c.rhs\n\n # create new lhs and rhs from the old one and knowledge from problem._equalities\n _coefficients = np.zeros(M)\n for j, name in enumerate(c.features):\n if name in inputs_names:\n _coefficients[inputs_dict[name]] += c.coefficients[j]\n else:\n _coefficients += c.coefficients[j] * coeffs_dict[name][:-1]\n _rhs -= c.coefficients[j] * coeffs_dict[name][-1]\n\n _features = _features[np.abs(_coefficients) > 1e-16]\n _coefficients = _coefficients[np.abs(_coefficients) > 1e-16]\n _c = None\n if isinstance(c, LinearEqualityConstraint):\n if len(_features) > 1:\n _c = LinearEqualityConstraint(\n features=_features.tolist(),\n coefficients=_coefficients.tolist(),\n rhs=_rhs,\n )\n elif len(_features) == 0:\n totally_removed = True\n else:\n feat: ContinuousInput = ContinuousInput(\n **domain.inputs.get_by_key(_features[0]).model_dump()\n )\n feat.bounds = (_coefficients[0], _coefficients[0])\n totally_removed = True\n else:\n if len(_features) > 1:\n _c = LinearInequalityConstraint(\n features=_features.tolist(),\n coefficients=_coefficients.tolist(),\n rhs=_rhs,\n )\n elif len(_features) == 0:\n totally_removed = True\n else:\n feat = cast(ContinuousInput, domain.inputs.get_by_key(_features[0]))\n adjust_boundary(feat, _coefficients[0], _rhs)\n totally_removed = True\n\n # check if constraint is always fulfilled/not fulfilled\n if not totally_removed:\n assert _c is not None\n if len(_c.features) == 0 and _c.rhs >= 0:\n pass\n elif len(_c.features) == 0 and _c.rhs < 0:\n raise Exception(\"Linear constraints cannot be fulfilled.\")\n elif np.isinf(_c.rhs):\n pass\n else:\n constraints.append(_c)\n domain.constraints = Constraints(constraints=constraints)\n return domain\n
"},{"location":"ref-utils/#bofire.utils.reduce.rref","title":"rref(A, tol=1e-08)
","text":"Computes the reduced row echelon form of a Matrix
Parameters:
Name Type Description DefaultA
ndarray
2d array representing a matrix.
requiredtol
float
tolerance for rounding to 0. Defaults to 1e-8.
1e-08
Returns:
Type DescriptionTuple[numpy.ndarray, List[int]]
(A_rref, pivots), where A_rref is the reduced row echelon form of A and pivots is a list containing the pivot columns of A_rref.
Source code inbofire/utils/reduce.py
def rref(A: np.ndarray, tol: float = 1e-8) -> Tuple[np.ndarray, List[int]]:\n \"\"\"Computes the reduced row echelon form of a Matrix\n\n Args:\n A (ndarray): 2d array representing a matrix.\n tol (float, optional): tolerance for rounding to 0. Defaults to 1e-8.\n\n Returns:\n (A_rref, pivots), where A_rref is the reduced row echelon form of A and pivots\n is a numpy array containing the pivot columns of A_rref\n \"\"\"\n A = np.array(A, dtype=np.float64)\n n, m = np.shape(A)\n\n col = 0\n row = 0\n pivots = []\n\n for col in range(m):\n # does a pivot element exist?\n if all(np.abs(A[row:, col]) < tol):\n pass\n # if yes: start elimination\n else:\n pivots.append(col)\n max_row = np.argmax(np.abs(A[row:, col])) + row\n # switch to most stable row\n A[[row, max_row], :] = A[[max_row, row], :] # type: ignore\n # normalize row\n A[row, :] /= A[row, col]\n # eliminate other elements from column\n for r in range(n):\n if r != row:\n A[r, :] -= A[r, col] / A[row, col] * A[row, :]\n row += 1\n\n prec = int(-np.log10(tol))\n return np.round(A, prec), pivots\n
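A short usage sketch:
import numpy as np\nfrom bofire.utils.reduce import rref\n\nA = np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0]])\nA_rref, pivots = rref(A)\nprint(A_rref)  # [[ 1.  0. -1.] [ 0.  1.  2.] [ 0.  0.  0.]]\nprint(pivots)  # [0, 1]\n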
"},{"location":"ref-utils/#bofire.utils.subdomain","title":"subdomain
","text":""},{"location":"ref-utils/#bofire.utils.subdomain.get_subdomain","title":"get_subdomain(domain, feature_keys)
","text":"removes all features not defined as argument creating a subdomain of the provided domain
Parameters:
Name Type Description Defaultdomain
Domain
the original domain from which a subdomain should be created
requiredfeature_keys
List
List of features that shall be included in the subdomain
requiredExceptions:
Type DescriptionAssert
when fewer than two features are provided in total
ValueError
when a provided feature key is not present in the provided domain
Assert
when no output feature is provided
Assert
when no input feature is provided
ValueError
when a removed input feature is still used in a constraint
Returns:
Type DescriptionDomain
A new domain containing only parts of the original domain
Source code inbofire/utils/subdomain.py
def get_subdomain(\n domain: Domain,\n feature_keys: List,\n) -> Domain:\n \"\"\"removes all features not defined as argument creating a subdomain of the provided domain\n\n Args:\n domain (Domain): the original domain wherefrom a subdomain should be created\n feature_keys (List): List of features that shall be included in the subdomain\n\n Raises:\n Assert: when in total less than 2 features are provided\n ValueError: when a provided feature key is not present in the provided domain\n Assert: when no output feature is provided\n Assert: when no input feature is provided\n ValueError: _description_\n\n Returns:\n Domain: A new domain containing only parts of the original domain\n \"\"\"\n assert len(feature_keys) >= 2, \"At least two features have to be provided.\"\n outputs = []\n inputs = []\n for key in feature_keys:\n try:\n feat = (domain.inputs + domain.outputs).get_by_key(key)\n except KeyError:\n raise ValueError(f\"Feature {key} not present in domain.\")\n if isinstance(feat, Input):\n inputs.append(feat)\n else:\n outputs.append(feat)\n assert len(outputs) > 0, \"At least one output feature has to be provided.\"\n assert len(inputs) > 0, \"At least one input feature has to be provided.\"\n inputs = Inputs(features=inputs)\n outputs = Outputs(features=outputs)\n # loop over constraints and make sure that all features used in constraints are in the input_feature_keys\n for c in domain.constraints:\n for key in c.features: # type: ignore\n if key not in inputs.get_keys():\n raise ValueError(\n f\"Removed input feature {key} is used in a constraint.\"\n )\n subdomain = deepcopy(domain)\n subdomain.inputs = inputs\n subdomain.outputs = outputs\n return subdomain\n
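A usage sketch (assumes the Domain.from_lists convenience constructor):
from bofire.data_models.domain.api import Domain\nfrom bofire.data_models.features.api import ContinuousInput, ContinuousOutput\nfrom bofire.utils.subdomain import get_subdomain\n\ndomain = Domain.from_lists(\n    inputs=[ContinuousInput(key=k, bounds=(0, 1)) for k in [\"x1\", \"x2\", \"x3\"]],\n    outputs=[ContinuousOutput(key=\"y\")],\n)\n# keep only x1, x2 and the output y\nsubdomain = get_subdomain(domain, feature_keys=[\"x1\", \"x2\", \"y\"])\nprint(subdomain.inputs.get_keys())  # ['x1', 'x2']\n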
"},{"location":"ref-utils/#bofire.utils.torch_tools","title":"torch_tools
","text":""},{"location":"ref-utils/#bofire.utils.torch_tools.constrained_objective2botorch","title":"constrained_objective2botorch(idx, objective, eps=1e-08)
","text":"Create a callable that can be used by botorch.utils.objective.apply_constraints
to set up output-constrained optimizations.
Parameters:
Name Type Description Defaultidx
int
Index of the constraint objective in the list of outputs.
requiredobjective
BotorchConstrainedObjective
The objective that should be transformed.
requiredReturns:
Type DescriptionTuple[List[Callable[[Tensor], Tensor]], List[float], int]
List of callables that can be used by botorch for setting up the constrained objective, list of the corresponding botorch eta values, final index used by the method (to track for categorical variables)
Source code inbofire/utils/torch_tools.py
def constrained_objective2botorch(\n idx: int, objective: ConstrainedObjective, eps: float = 1e-8\n) -> Tuple[List[Callable[[Tensor], Tensor]], List[float], int]:\n \"\"\"Create a callable that can be used by `botorch.utils.objective.apply_constraints`\n to set up output-constrained optimizations.\n\n Args:\n idx (int): Index of the constraint objective in the list of outputs.\n objective (ConstrainedObjective): The objective that should be transformed.\n\n Returns:\n Tuple[List[Callable[[Tensor], Tensor]], List[float], int]: List of callables that can be used by botorch for setting up the constrained objective,\n list of the corresponding botorch eta values, final index used by the method (to track for categorical variables)\n \"\"\"\n assert isinstance(\n objective, ConstrainedObjective\n ), \"Objective is not a `ConstrainedObjective`.\"\n if isinstance(objective, MaximizeSigmoidObjective):\n return (\n [lambda Z: (Z[..., idx] - objective.tp) * -1.0],\n [1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, MinimizeSigmoidObjective):\n return (\n [lambda Z: (Z[..., idx] - objective.tp)],\n [1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, TargetObjective):\n return (\n [\n lambda Z: (Z[..., idx] - (objective.target_value - objective.tolerance))\n * -1.0,\n lambda Z: (\n Z[..., idx] - (objective.target_value + objective.tolerance)\n ),\n ],\n [1.0 / objective.steepness, 1.0 / objective.steepness],\n idx + 1,\n )\n elif isinstance(objective, ConstrainedCategoricalObjective):\n # The output of a categorical objective has final dim `c` where `c` is number of classes\n # Pass in the expected acceptance probability and perform an inverse sigmoid to obtain the original probabilities\n return (\n [\n lambda Z: torch.log(\n 1\n / torch.clamp(\n (\n Z[..., idx : idx + len(objective.desirability)]\n * torch.tensor(objective.desirability).to(**tkwargs)\n ).sum(-1),\n min=eps,\n max=1 - eps,\n )\n - 1,\n )\n ],\n [1.0],\n idx + len(objective.desirability),\n )\n else:\n raise ValueError(f\"Objective {objective.__class__.__name__} not known.\")\n
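A usage sketch (import path bofire.data_models.objectives.api assumed):
import torch\nfrom bofire.data_models.objectives.api import MaximizeSigmoidObjective\nfrom bofire.utils.torch_tools import constrained_objective2botorch\n\nobj = MaximizeSigmoidObjective(w=1.0, steepness=10, tp=0.5)\ncallables, etas, next_idx = constrained_objective2botorch(idx=0, objective=obj)\nZ = torch.tensor([[0.2], [0.8]])\n# botorch convention: values <= 0 mean the output constraint is satisfied\nprint(callables[0](Z))  # tensor([ 0.3000, -0.3000])\nprint(etas, next_idx)  # [0.1] 1\n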
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_initial_conditions_generator","title":"get_initial_conditions_generator(strategy, transform_specs, ask_options=None, sequential=True)
","text":"Takes a strategy object and returns a callable which uses this strategy to return a generator callable which can be used in botorchs
gen_batch_initial_conditions` to generate samples.
Parameters:
Name Type Description Defaultstrategy
Strategy
Strategy that should be used to generate samples.
requiredtransform_specs
Dict
Dictionary indicating how the samples should be transformed.
requiredask_options
Dict
Dictionary of keyword arguments that are passed to the ask
method of the strategy. Defaults to {}.
None
sequential
bool
If True, samples for every q-batch are generated independently from each other. If False, the n x q
samples are generated at once.
True
Returns:
Type DescriptionCallable[[int, int, int], Tensor]
Callable that can be passed to batch_initial_conditions.
Source code inbofire/utils/torch_tools.py
def get_initial_conditions_generator(\n strategy: Strategy,\n transform_specs: Dict,\n ask_options: Optional[Dict] = None,\n sequential: bool = True,\n) -> Callable[[int, int, int], Tensor]:\n \"\"\"Takes a strategy object and returns a callable which uses this\n strategy to return a generator callable which can be used in botorch`s\n `gen_batch_initial_conditions` to generate samples.\n\n Args:\n strategy (Strategy): Strategy that should be used to generate samples.\n transform_specs (Dict): Dictionary indicating how the samples should be\n transformed.\n ask_options (Dict, optional): Dictionary of keyword arguments that are\n passed to the `ask` method of the strategy. Defaults to {}.\n sequential (bool, optional): If True, samples for every q-batch are\n generate indepenent from each other. If False, the `n x q` samples\n are generated at once.\n\n Returns:\n Callable[[int, int, int], Tensor]: Callable that can be passed to\n `batch_initial_conditions`.\n \"\"\"\n if ask_options is None:\n ask_options = {}\n\n def generator(n: int, q: int, seed: int) -> Tensor:\n if sequential:\n initial_conditions = []\n for _ in range(n):\n candidates = strategy.ask(q, **ask_options)\n # transform it\n transformed_candidates = strategy.domain.inputs.transform(\n candidates, transform_specs\n )\n # transform to tensor\n initial_conditions.append(\n torch.from_numpy(transformed_candidates.values).to(**tkwargs)\n )\n return torch.stack(initial_conditions, dim=0)\n else:\n candidates = strategy.ask(n * q, **ask_options)\n # transform it\n transformed_candidates = strategy.domain.inputs.transform(\n candidates, transform_specs\n )\n return (\n torch.from_numpy(transformed_candidates.values)\n .to(**tkwargs)\n .reshape(n, q, transformed_candidates.shape[1])\n )\n\n return generator\n
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_interpoint_constraints","title":"get_interpoint_constraints(domain, n_candidates)
","text":"Converts interpoint equality constraints to linear equality constraints, that can be processed by botorch. For more information, see the docstring of optimize_acqf
in botorch (https://github.com/pytorch/botorch/blob/main/botorch/optim/optimize.py).
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| domain | Domain | Optimization problem definition. | required |
| n_candidates | int | Number of candidates that should be requested. | required |

Returns:

| Type | Description |
| --- | --- |
| List[Tuple[Tensor, Tensor, float]] | List of tuples, each tuple consists of a tensor with the feature indices, coefficients and a float for the rhs. |

Source code in bofire/utils/torch_tools.py
def get_interpoint_constraints(\n    domain: Domain, n_candidates: int\n) -> List[Tuple[Tensor, Tensor, float]]:\n    \"\"\"Converts interpoint equality constraints to linear equality constraints\n    that can be processed by botorch. For more information, see the docstring\n    of `optimize_acqf` in botorch\n    (https://github.com/pytorch/botorch/blob/main/botorch/optim/optimize.py).\n\n    Args:\n        domain (Domain): Optimization problem definition.\n        n_candidates (int): Number of candidates that should be requested.\n\n    Returns:\n        List[Tuple[Tensor, Tensor, float]]: List of tuples, each tuple consists\n        of a tensor with the feature indices, coefficients and a float for the rhs.\n    \"\"\"\n    constraints = []\n    for constraint in domain.constraints.get(InterpointEqualityConstraint):\n        assert isinstance(constraint, InterpointEqualityConstraint)\n        coefficients = torch.tensor([1.0, -1.0]).to(**tkwargs)\n        feat_idx = domain.inputs.get_keys(Input).index(constraint.feature)\n        feat = domain.inputs.get_by_key(constraint.feature)\n        assert isinstance(feat, ContinuousInput)\n        if feat.is_fixed():\n            continue\n        multiplicity = constraint.multiplicity or n_candidates\n        for i in range(math.ceil(n_candidates / multiplicity)):\n            all_indices = torch.arange(\n                i * multiplicity, min((i + 1) * multiplicity, n_candidates)\n            )\n            for k in range(len(all_indices) - 1):\n                indices = torch.tensor(\n                    [[all_indices[0], feat_idx], [all_indices[k + 1], feat_idx]],\n                    dtype=torch.int64,\n                )\n                constraints.append((indices, coefficients, 0.0))\n    return constraints\n
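Sketch of how the returned tuples are typically consumed; they follow the `(indices, coefficients, rhs)` convention of botorch's `optimize_acqf` equality constraints (assuming a `domain` that defines an InterpointEqualityConstraint):
from bofire.utils.torch_tools import get_interpoint_constraints\n\nequalities = get_interpoint_constraints(domain, n_candidates=4)\n# each tuple encodes x[i0, feat] - x[ik, feat] == 0 within a q-batch,\n# e.g. for passing as optimize_acqf(..., equality_constraints=equalities)\nfor indices, coefficients, rhs in equalities:\n    print(indices.shape, coefficients, rhs)\n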
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_linear_constraints","title":"get_linear_constraints(domain, constraint, unit_scaled=False)
","text":"Converts linear constraints to the form required by BoTorch.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| domain | Domain | Optimization problem definition. | required |
| constraint | Union[Type[bofire.data_models.constraints.linear.LinearEqualityConstraint], Type[bofire.data_models.constraints.linear.LinearInequalityConstraint]] | Type of constraint that should be converted. | required |
| unit_scaled | bool | If True, transforms constraints by assuming that the bounds for the continuous features are [0,1]. Defaults to False. | False |

Returns:

| Type | Description |
| --- | --- |
| List[Tuple[Tensor, Tensor, float]] | List of tuples, each tuple consists of a tensor with the feature indices, coefficients and a float for the rhs. |

Source code in bofire/utils/torch_tools.py
def get_linear_constraints(\n    domain: Domain,\n    constraint: Union[Type[LinearEqualityConstraint], Type[LinearInequalityConstraint]],\n    unit_scaled: bool = False,\n) -> List[Tuple[Tensor, Tensor, float]]:\n    \"\"\"Converts linear constraints to the form required by BoTorch.\n\n    Args:\n        domain: Optimization problem definition.\n        constraint: Type of constraint that should be converted.\n        unit_scaled: If True, transforms constraints by assuming that the bounds for the continuous features are [0,1]. Defaults to False.\n\n    Returns:\n        List[Tuple[Tensor, Tensor, float]]: List of tuples, each tuple consists of a tensor with the feature indices, coefficients and a float for the rhs.\n    \"\"\"\n    constraints = []\n    for c in domain.constraints.get(constraint):\n        indices = []\n        coefficients = []\n        lower = []\n        upper = []\n        rhs = 0.0\n        for i, featkey in enumerate(c.features):  # type: ignore\n            idx = domain.inputs.get_keys(Input).index(featkey)\n            feat = domain.inputs.get_by_key(featkey)\n            if feat.is_fixed():  # type: ignore\n                rhs -= feat.fixed_value()[0] * c.coefficients[i]  # type: ignore\n            else:\n                lower.append(feat.lower_bound)  # type: ignore\n                upper.append(feat.upper_bound)  # type: ignore\n                indices.append(idx)\n                coefficients.append(c.coefficients[i])  # type: ignore\n        if unit_scaled:\n            lower = np.array(lower)\n            upper = np.array(upper)\n            s = upper - lower\n            scaled_coefficients = s * np.array(coefficients)\n            constraints.append(\n                (\n                    torch.tensor(indices),\n                    -torch.tensor(scaled_coefficients).to(**tkwargs),\n                    -(rhs + c.rhs - np.sum(np.array(coefficients) * lower)),  # type: ignore\n                )\n            )\n        else:\n            constraints.append(\n                (\n                    torch.tensor(indices),\n                    -torch.tensor(coefficients).to(**tkwargs),\n                    -(rhs + c.rhs),  # type: ignore\n                )\n            )\n    return constraints\n
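A short usage sketch (assuming a `domain` that contains linear constraints; note that the sign flips applied above mean the returned tuples already match botorch's conventions):
from bofire.data_models.constraints.api import (\n    LinearEqualityConstraint,\n    LinearInequalityConstraint,\n)\nfrom bofire.utils.torch_tools import get_linear_constraints\n\nequalities = get_linear_constraints(domain, LinearEqualityConstraint)\ninequalities = get_linear_constraints(domain, LinearInequalityConstraint)\n# both lists can be handed to botorch, e.g.\n# optimize_acqf(..., equality_constraints=equalities, inequality_constraints=inequalities)\n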
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_multiobjective_objective","title":"get_multiobjective_objective(outputs)
","text":"Returns
Parameters:
Name Type Description Defaultoutputs
Outputs
description
requiredReturns:
Type DescriptionCallable[[Tensor], Tensor]
description
Source code inbofire/utils/torch_tools.py
def get_multiobjective_objective(\n    outputs: Outputs,\n) -> Callable[[Tensor, Optional[Tensor]], Tensor]:\n    \"\"\"Returns a callable that evaluates all maximize, minimize and close-to-target\n    objectives of the provided outputs on posterior samples and stacks the results\n    along the last dimension.\n\n    Args:\n        outputs (Outputs): Output features whose objectives should be translated.\n\n    Returns:\n        Callable[[Tensor], Tensor]: Callable that maps samples to stacked objective values.\n    \"\"\"\n    callables = [\n        get_objective_callable(idx=i, objective=feat.objective)  # type: ignore\n        for i, feat in enumerate(outputs.get())\n        if feat.objective is not None  # type: ignore\n        and isinstance(\n            feat.objective,  # type: ignore\n            (MaximizeObjective, MinimizeObjective, CloseToTargetObjective),\n        )\n    ]\n\n    def objective(samples: Tensor, X: Optional[Tensor] = None) -> Tensor:\n        return torch.stack([c(samples, None) for c in callables], dim=-1)\n\n    return objective\n
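The returned callable already has the `(samples, X)` signature that botorch's MC objectives expect, so it can, for instance, be wrapped as follows (sketch, assuming `domain.outputs` carries the objectives):
from botorch.acquisition.multi_objective.objective import (\n    GenericMCMultiOutputObjective,\n)\nfrom bofire.utils.torch_tools import get_multiobjective_objective\n\n# wrap the stacked-objective callable for use in multi-objective acquisition functions\nobjective = GenericMCMultiOutputObjective(\n    get_multiobjective_objective(outputs=domain.outputs)\n)\n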
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_nchoosek_constraints","title":"get_nchoosek_constraints(domain)
","text":"Transforms NChooseK constraints into a list of non-linear inequality constraint callables that can be parsed by pydantic. For this purpose the NChooseK constraint is continuously relaxed by countig the number of zeros in a candidate by a sum of narrow gaussians centered at zero.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| domain | Domain | Optimization problem definition. | required |

Returns:

| Type | Description |
| --- | --- |
| List[Callable[[Tensor], float]] | List of callables that can be used as nonlinear inequality constraints in botorch. |

Source code in bofire/utils/torch_tools.py
def get_nchoosek_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n    \"\"\"Transforms NChooseK constraints into a list of non-linear inequality constraint callables\n    that can be processed by botorch. For this purpose the NChooseK constraint is continuously\n    relaxed by counting the number of zeros in a candidate via a sum of narrow gaussians centered\n    at zero.\n\n    Args:\n        domain (Domain): Optimization problem definition.\n\n    Returns:\n        List[Callable[[Tensor], float]]: List of callables that can be used\n        as nonlinear inequality constraints in botorch.\n    \"\"\"\n\n    def narrow_gaussian(x, ell=1e-3):\n        return torch.exp(-0.5 * (x / ell) ** 2)\n\n    def max_constraint(indices: Tensor, num_features: int, max_count: int):\n        return lambda x: narrow_gaussian(x=x[..., indices]).sum(dim=-1) - (\n            num_features - max_count\n        )\n\n    def min_constraint(indices: Tensor, num_features: int, min_count: int):\n        return lambda x: -narrow_gaussian(x=x[..., indices]).sum(dim=-1) + (\n            num_features - min_count\n        )\n\n    constraints = []\n    # iterate over all NChooseK constraints defined in the domain\n    for c in domain.constraints.get(NChooseKConstraint):\n        assert isinstance(c, NChooseKConstraint)\n        indices = torch.tensor(\n            [domain.inputs.get_keys(ContinuousInput).index(key) for key in c.features],\n            dtype=torch.int64,\n        )\n        if c.max_count != len(c.features):\n            constraints.append(\n                max_constraint(\n                    indices=indices, num_features=len(c.features), max_count=c.max_count\n                )\n            )\n        if c.min_count > 0:\n            constraints.append(\n                min_constraint(\n                    indices=indices, num_features=len(c.features), min_count=c.min_count\n                )\n            )\n    return constraints\n
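To see the relaxation in action, the returned callables can be evaluated directly; following botorch's convention, values >= 0 indicate feasibility (illustrative sketch with a hypothetical four-feature domain):
import torch\nfrom bofire.utils.torch_tools import get_nchoosek_constraints\n\nconstraints = get_nchoosek_constraints(domain)\n# candidate with two non-zero entries out of four features\nx = torch.tensor([[0.0, 0.3, 0.0, 0.7]])\nfor c in constraints:\n    print(c(x))  # >= 0 means the relaxed NChooseK constraint is satisfied\n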
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_nonlinear_constraints","title":"get_nonlinear_constraints(domain)
","text":"Returns a list of callable functions that represent the nonlinear constraints for the given domain that can be processed by botorch.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| domain | Domain | The domain for which to generate the nonlinear constraints. | required |

Returns:

| Type | Description |
| --- | --- |
| List[Callable[[Tensor], float]] | A list of callable functions that take a tensor as input and return a float value representing the constraint evaluation. |

Source code in bofire/utils/torch_tools.py
def get_nonlinear_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"\n Returns a list of callable functions that represent the nonlinear constraints\n for the given domain that can be processed by botorch.\n\n Parameters:\n domain (Domain): The domain for which to generate the nonlinear constraints.\n\n Returns:\n List[Callable[[Tensor], float]]: A list of callable functions that take a tensor\n as input and return a float value representing the constraint evaluation.\n \"\"\"\n return get_nchoosek_constraints(domain) + get_product_constraints(domain)\n
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_output_constraints","title":"get_output_constraints(outputs)
","text":"Method to translate output constraint objectives into a list of callables and list of etas for use in botorch.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| outputs | Outputs | Output feature object that should be processed. | required |

Returns:

| Type | Description |
| --- | --- |
| Tuple[List[Callable[[Tensor], Tensor]], List[float]] | List of constraint callables, list of associated etas. |

Source code in bofire/utils/torch_tools.py
def get_output_constraints(\n outputs: Outputs,\n) -> Tuple[List[Callable[[Tensor], Tensor]], List[float]]:\n \"\"\"Method to translate output constraint objectives into a list of\n callables and list of etas for use in botorch.\n\n Args:\n outputs (Outputs): Output feature object that should\n be processed.\n\n Returns:\n Tuple[List[Callable[[Tensor], Tensor]], List[float]]: List of constraint callables,\n list of associated etas.\n \"\"\"\n constraints = []\n etas = []\n idx = 0\n for feat in outputs.get():\n if isinstance(feat.objective, ConstrainedObjective): # type: ignore\n iconstraints, ietas, idx = constrained_objective2botorch(\n idx,\n objective=feat.objective, # type: ignore\n )\n constraints += iconstraints\n etas += ietas\n else:\n idx += 1\n return constraints, etas\n
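Sketch of a typical consumer: the callables and etas can parametrize botorch's ConstrainedMCObjective (the scalar objective shown is a hypothetical placeholder):
import torch\nfrom botorch.acquisition.objective import ConstrainedMCObjective\nfrom bofire.utils.torch_tools import get_output_constraints\n\nconstraints, etas = get_output_constraints(domain.outputs)\nconstrained_objective = ConstrainedMCObjective(\n    objective=lambda Z, X=None: Z[..., 0],  # placeholder: maximize the first output\n    constraints=constraints,\n    eta=torch.tensor(etas),\n)\n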
"},{"location":"ref-utils/#bofire.utils.torch_tools.get_product_constraints","title":"get_product_constraints(domain)
","text":"Returns a list of nonlinear constraint functions that can be processed by botorch based on the given domain.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| domain | Domain | The domain object containing the constraints. | required |

Returns:

| Type | Description |
| --- | --- |
| List[Callable[[Tensor], float]] | A list of product constraint functions. |

Source code in bofire/utils/torch_tools.py
def get_product_constraints(domain: Domain) -> List[Callable[[Tensor], float]]:\n \"\"\"\n Returns a list of nonlinear constraint functions that can be processed by botorch\n based on the given domain.\n\n Args:\n domain (Domain): The domain object containing the constraints.\n\n Returns:\n List[Callable[[Tensor], float]]: A list of product constraint functions.\n\n \"\"\"\n\n def product_constraint(indices: Tensor, exponents: Tensor, rhs: float, sign: int):\n return lambda x: -1.0 * sign * (x[..., indices] ** exponents).prod(dim=-1) + rhs\n\n constraints = []\n for c in domain.constraints.get(ProductInequalityConstraint):\n assert isinstance(c, ProductInequalityConstraint)\n indices = torch.tensor(\n [domain.inputs.get_keys(ContinuousInput).index(key) for key in c.features],\n dtype=torch.int64,\n )\n constraints.append(\n product_constraint(indices, torch.tensor(c.exponents), c.rhs, c.sign)\n )\n return constraints\n
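As with the NChooseK relaxation, the resulting callables follow botorch's `>= 0` feasibility convention; `get_nonlinear_constraints` simply concatenates both lists (illustrative sketch, candidate values are hypothetical):
import torch\nfrom bofire.utils.torch_tools import get_nonlinear_constraints\n\nnonlinears = get_nonlinear_constraints(domain)\nx = torch.tensor([[0.5, 2.0, 1.0]])  # hypothetical candidate in transformed space\nprint([c(x) for c in nonlinears])  # values >= 0 satisfy the constraints\n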
"},{"location":"userguide_surrogates/","title":"Surrogate models","text":"In Bayesian Optimization, information from previous experiments is taken into account to generate proposals for future experiments. This information is leveraged by creating a surrogate model for the black-box function that is to be optimized based on the available data. Naturally, experimental candidates for which the surrogate model makes a promising prediction (e.g., high predicted values of a quantity we want to maximize) should be chosen over ones for which this is not the case. However, since the available data might cover only a small part of the input space, the model is likely to only be able to make very uncertain predictions far away from the data. Therefore, the surrogate model should be able to express the degree to which the predictions are uncertain so that we can use this information - combining the prediction and the associated uncertainty - to select the settings for the next experimental iteration.
The acquisition function is the object that turns the predicted distribution (you can think of this as the prediction and the prediction uncertainty) into a single quantity representing how promising a candidate experimental point seems. This function determines whether one focuses more on exploitation, i.e., quickly approaching a nearby local optimum of the black-box function, or on exploration, i.e., probing different regions of the input space first.
Therefore, three criteria typically determine whether a candidate is selected as an experimental proposal: the prediction of the surrogate model, the uncertainty of that prediction, and the acquisition function that combines the two.
"},{"location":"userguide_surrogates/#surrogate-model-options","title":"Surrogate model options","text":"BoFire offers the following classes of surrogate models.
| Surrogate | Optimization of | When to use | Type |
| --- | --- | --- | --- |
| SingleTaskGPSurrogate | a single objective with real valued inputs | Limited data and black-box function is smooth | Gaussian process |
| RandomForestSurrogate | a single objective | Rich data; black-box function does not have to be smooth | sklearn random forest implementation |
| MLP | a single objective with real-valued inputs | Rich data and black-box function is smooth | Multi layer perceptron |
| MixedSingleTaskGPSurrogate | a single objective with categorical and real valued inputs | Limited data and black-box function is smooth | Gaussian process |
| XGBoostSurrogate | a single objective | Rich data; black-box function does not have to be smooth | xgboost implementation of gradient boosting trees |
| TanimotoGP | a single objective | At least one input feature is a molecule represented as fingerprint | Gaussian process on a molecule space for which Tanimoto similarity determines the similarity between points |

All of these are single-objective surrogate models. For optimization of multiple objectives at the same time, a suitable Strategy has to be chosen. Then for each objective a different surrogate model can be specified. By default the SingleTaskGPSurrogate is used.
Example:
surrogate_data_0 = SingleTaskGPSurrogate(\n inputs=domain.inputs,\n outputs=Outputs(features=[domain.outputs[0]]),\n)\nsurrogate_data_1 = XGBoostSurrogate(\n inputs=domain.inputs,\n outputs=Outputs(features=[domain.outputs[1]]),\n)\nqparego_data_model = QparegoStrategy(\n domain=domain,\n surrogate_specs=BotorchSurrogates(\n surrogates=[surrogate_data_0, surrogate_data_1]\n ),\n)\n
Note:
BoFire also offers the option to customize surrogate models. In particular, it is possible to customize the SingleTaskGPSurrogate in the following ways.
"},{"location":"userguide_surrogates/#kernel-customization","title":"Kernel customization","text":"Specify the Kernel:
| Kernel | Description | Translation invariant | Input variable type |
| --- | --- | --- | --- |
| RBFKernel | Based on Gaussian distribution | Yes | Continuous |
| MaternKernel | Based on Gamma function; allows setting a smoothness parameter | Yes | Continuous |
| PolynomialKernel | Based on dot-product of two vectors of input points | No | Continuous |
| LinearKernel | Equal to dot-product of two vectors of input points | No | Continuous |
| TanimotoKernel | Measures similarities between binary vectors using Tanimoto Similarity | Not applicable | MolecularInput |
| HammingDistanceKernel | Similarity is defined by the Hamming distance which considers the number of equal entries between two vectors (e.g., in One-Hot-encoding) | Not applicable | Categorical |
Note: - SingleTaskGPSurrogate with PolynomialKernel is equivalent to PolynomialSurrogate. - SingleTaskGPSurrogate with LinearKernel is equivalent to LinearSurrogate. - SingleTaskGPSurrogate with TanimotoKernel is equivalent to TanimotoGP. - One can combine two Kernels by using AdditiveKernel or MultiplicativeKernel.
Example:
surrogate_data_0 = SingleTaskGPSurrogate(\n inputs=domain.inputs,\n outputs=Outputs(features=[domain.outputs[0]]),\n kernel=PolynomialKernel(power=2)\n)\n
"},{"location":"userguide_surrogates/#noise-model-customization","title":"Noise model customization","text":"For experimental data subject to noise, one can specify the distribution of this noise. The options are:
Noise Model When to use NormalPrior Noise is Gaussian GammaPrior Noise has a Gamma distributionExample:
surrogate_data_0 = SingleTaskGPSurrogate(\n inputs=domain.inputs,\n outputs=Outputs(features=[domain.outputs[0]]),\n kernel=PolynomialKernel(power=2),\n noise_prior=NormalPrior(loc=0, scale=1)\n)\n
"}]}
\ No newline at end of file
diff --git a/userguide_surrogates/index.html b/userguide_surrogates/index.html
index 1367b2d48..ea1e16330 100644
--- a/userguide_surrogates/index.html
+++ b/userguide_surrogates/index.html
@@ -275,7 +275,7 @@
- Introduction
+ Home