Trust region Bayesian optimization#

This notebook guides you through practical examples of trust region Bayesian optimization, illustrating algorithms such as TREGO [DPRP22] and TuRBO [EPG+19] that can be beneficial when optimizing high-dimensional problems. Trieste provides a flexible framework for implementing custom algorithms by encapsulating the behavior of rules and regions in two separate abstract classes, BatchTrustRegion and UpdatableTrustRegion respectively.

Trust region optimization is a general strategy for optimization problems in which the solution space is navigated by fitting a localized model, such as a quadratic approximation, around the current point estimate and then using this model to take an optimization step. The region where this approximation is considered trustworthy is the “trust region”. Within this region, the algorithm deems the model’s predictions reliable enough to base decisions on. After each step, the objective is evaluated at the proposed point and the trust region is adjusted accordingly: typically expanded if the model was predictive, or contracted if it wasn’t.
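To make the expand-and-contract mechanics concrete, below is a minimal, self-contained 1-D sketch of this classical loop. It is illustrative only and not Trieste code: the quadratic surrogate, the 0.25/0.75 acceptance thresholds and the doubling/halving factors are conventional textbook choices, and all names are our own.

import numpy as np


def trust_region_minimize(f, x, radius, num_steps=20):
    for _ in range(num_steps):
        # Fit a quadratic surrogate to three samples spanning the trust region.
        xs = np.array([x - radius, x, x + radius])
        model = np.poly1d(np.polyfit(xs, [f(v) for v in xs], deg=2))
        # Step to the surrogate's minimizer, restricted to the trust region.
        grid = np.linspace(x - radius, x + radius, 101)
        x_new = grid[np.argmin(model(grid))]
        # Compare the actual improvement with the improvement the model predicted.
        predicted = model(x) - model(x_new)
        actual = f(x) - f(x_new)
        rho = actual / predicted if predicted > 1e-12 else 0.0
        if rho > 0.75:  # model was predictive: accept the step and expand
            x, radius = x_new, 2.0 * radius
        elif rho > 0.25:  # acceptable fit: accept the step, keep the radius
            x = x_new
        else:  # poor prediction: reject the step and contract
            radius *= 0.5
    return x


print(trust_region_minimize(lambda v: (v - 1.0) ** 2 + 0.1 * np.sin(10 * v), x=4.0, radius=1.0))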

In trust region Bayesian optimization, a probabilistic model, often a Gaussian Process, replaces simpler models like quadratics to inform the optimization. This model can be applied globally across the entire search space or adjusted to accommodate local regions.

[1]:
import numpy as np
import tensorflow as tf

np.random.seed(1793)
tf.random.set_seed(1793)

Define the problem and model#

We can use trust regions for Bayesian optimization in much the same way as we used EGO and EI in the introduction notebook. Since the setup is very similar to that tutorial, we’ll skip over most of the detail.

[2]:
import trieste
from trieste.objectives import Branin

branin = Branin.objective
search_space = Branin.search_space

num_initial_data_points = 10
initial_query_points = search_space.sample(num_initial_data_points)
observer = trieste.objectives.utils.mk_observer(branin)
initial_data = observer(initial_query_points)

As usual, we’ll use Gaussian process regression to model the function. Note that we set the likelihood variance to a small number because we are dealing with a noise-free problem.

[3]:
from trieste.models.gpflow import GaussianProcessRegression, build_gpr


def build_model():
    gpflow_model = build_gpr(
        initial_data, search_space, likelihood_variance=1e-7
    )
    return GaussianProcessRegression(gpflow_model)

Trust region TREGO acquisition rule#

First we demonstrate how to run Bayesian optimization with the TREGO algorithm, which alternates between regular EGO steps and local steps within one trust region (see [DPRP22]).

Create TREGO rule and run optimization loop#

We can run the Bayesian optimization loop by defining a BayesianOptimizer and calling its optimize method with the trust region rule. Once the optimization loop is complete, the optimizer will return one new query point for every step in the loop; that’s 5 points in total.

The trust region rule is created by instantiating the concrete BatchTrustRegionBox class. This is a “meta” rule that manages the acquisition from multiple possible regions by applying a base-rule to each region. The default base-rule is EfficientGlobalOptimization, but a different base-rule can be provided as an argument to BatchTrustRegionBox. Here we explicitly set it to make usage clear.

The regions themselves are implemented as separate classes. The TREGO algorithm in this example requires a single region that alternates between the full search space and the trust region. This is achieved by creating one instance of the TREGOBox class, which is responsible for managing the state, initialization and update of the region.

[4]:
trego_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
    trieste.acquisition.rule.TREGOBox(search_space),
    rule=trieste.acquisition.rule.EfficientGlobalOptimization(),
)
bo = trieste.bayesian_optimizer.BayesianOptimizer(observer, search_space)

num_steps = 5
result = bo.optimize(
    num_steps, initial_data, build_model(), trego_acq_rule, track_state=True
)
dataset = result.try_get_final_dataset()
Optimization completed without errors
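
As a quick sanity check (a hedged example using only quantities defined above), we can confirm that the final dataset contains the 10 initial points plus one new point per step:

# TREGO adds a single query point per optimization step.
expected = num_initial_data_points + num_steps  # 10 + 5 = 15
assert dataset.query_points.shape[0] == expected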

Visualizing TREGO results#

Let’s take a look at where we queried the observer: the initial query points (crosses), the new query points (dots) and the best point found (purple dot), shown against the contours of the Branin function.

[5]:
from trieste.experimental.plotting import plot_bo_points, plot_function_2d


def plot_final_result(_dataset: trieste.data.Dataset) -> None:
    arg_min_idx = tf.squeeze(tf.argmin(_dataset.observations, axis=0))
    query_points = _dataset.query_points.numpy()
    _, ax = plot_function_2d(
        branin,
        search_space.lower,
        search_space.upper,
        grid_density=40,
        contour=True,
    )

    plot_bo_points(query_points, ax[0, 0], num_initial_data_points, arg_min_idx)


plot_final_result(dataset)
[Figure: Branin contours with the initial points (crosses), new query points (dots) and the best point found (purple dot)]

We can also visualize the progress of the optimization by plotting the acquisition space at each step. This space is either the full search space or the trust region, depending on the step, and is shown as a translucent box. The new query points per region are plotted in matching color.

Note there is only one trust region in this plot, however the rules in the following sections will show multiple trust regions.

[6]:
import base64
from typing import Optional

import IPython
import matplotlib.pyplot as plt

from trieste.experimental.plotting import (
    convert_figure_to_frame,
    convert_frames_to_gif,
    plot_trust_region_history_2d,
)


def plot_history(
    result: trieste.bayesian_optimizer.OptimizationResult,
    num_query_points: Optional[int] = None,
) -> None:
    frames = []
    for step, hist in enumerate(
        result.history + [result.final_result.unwrap()]
    ):
        fig, _ = plot_trust_region_history_2d(
            branin,
            search_space.lower,
            search_space.upper,
            hist,
            num_query_points=num_query_points,
            num_init=num_initial_data_points,
        )

        if fig is not None:
            fig.suptitle(f"step number {step}")
            frames.append(convert_figure_to_frame(fig))
            plt.close(fig)

    gif_file = convert_frames_to_gif(frames)
    gif = IPython.display.HTML(
        '<img src="data:image/gif;base64,{0}"/>'.format(
            base64.b64encode(gif_file.getvalue()).decode()
        )
    )
    IPython.display.display(gif)


plot_history(result)

Batch trust region rule#

Next we demonstrate how to run Bayesian optimization in a parallel batch with 5 regions.

Create the batch trust region acquisition rule#

Multiple trust regions are created by providing a list of regions to the BatchTrustRegionBox rule constructor. In this example, we create 5 SingleObjectiveTrustRegionBox regions. Each instance encapsulates the behavior of a single trust region: it is responsible for maintaining its own state, initializing it, and updating it after each step. Each region contributes one query point, so each acquisition step collects 5 points overall. As the optimization process continues, the bounds of these regions are dynamically updated.

In order to perform parallel acquisition across multiple regions with the BatchTrustRegionBox “meta” rule, we need to specify a batch base-rule that supports parallel optimization. For our example we use EfficientGlobalOptimization coupled with the ParallelContinuousThompsonSampling acquisition function. The batch size is controlled by the num_query_points argument.

Note: in this example the number of regions is equal to the number of batch query points in the base-rule. This results in each region contributing one query point to the overall batch. However, it is possible to generate multiple query points from each region by setting num_query_points to be a multiple Q of the number of regions. In that case, each region will contribute Q query points to the overall batch.
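
For instance, to have each of the 5 regions contribute Q = 2 points per step, one could instead build the base-rule as follows (an illustrative variation that we do not run here; it reuses the same classes as the cell below):

base_rule_q2 = trieste.acquisition.rule.EfficientGlobalOptimization(
    builder=trieste.acquisition.ParallelContinuousThompsonSampling(),
    num_query_points=2 * 5,  # Q=2 points from each of the 5 regions
)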

[7]:
num_query_points = 5

init_regions = [
    trieste.acquisition.rule.SingleObjectiveTrustRegionBox(search_space)
    for _ in range(num_query_points)
]
base_rule = trieste.acquisition.rule.EfficientGlobalOptimization(  # type: ignore[var-annotated]
    builder=trieste.acquisition.ParallelContinuousThompsonSampling(),
    num_query_points=num_query_points,
)
batch_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
    init_regions, base_rule
)

Run the optimization loop#

We run the Bayesian optimization loop as before by defining a BayesianOptimizer and calling its optimize method with the trust region rule. Once the optimization loop is complete, the optimizer will return num_query_points new query points for every step in the loop. With 5 steps, that’s 25 points in total.

[8]:
bo = trieste.bayesian_optimizer.BayesianOptimizer(observer, search_space)

num_steps = 5
result = bo.optimize(
    num_steps, initial_data, build_model(), batch_acq_rule, track_state=True
)
dataset = result.try_get_final_dataset()
Optimization completed without errors
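
Again as a hedged sanity check, using only quantities defined above: each of the 5 steps adds num_query_points new points, so the final dataset should hold 10 + 5 * 5 = 35 points.

expected = num_initial_data_points + num_steps * num_query_points  # 35
assert dataset.query_points.shape[0] == expected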

Visualizing batch trust region results#

We visualize the results as before.

[9]:
plot_final_result(dataset)
[Figure: Branin contours with the initial points (crosses), new query points (dots) and the best point found (purple dot)]
[10]:
plot_history(result)

Trust region TuRBO acquisition rule#

Finally, we show how to run Bayesian optimization with the TuRBO algorithm. This is a trust region algorithm that uses local models and datasets to approximate the objective function within their respective trust regions (see [EPG+19]).

Create TuRBO rule and run optimization loop#

As before, this “meta” rule requires an acquisition base-rule for performing optimization within the trust regions; for our example we use the DiscreteThompsonSampling rule.

We create 2 TuRBO trust regions and associated local models by initially copying the global model (using copy_to_local_models). The optimizer will return num_query_points new query points for each region in every step of the loop. With 5 steps and 2 regions, that’s 30 points in total.

Note: the behavior of the base-rule num_query_points argument here differs from the batch-trust-region example above. In the batch-trust-region example, the total number of query points returned per step was num_query_points. In this example, the total number of query points returned per step is num_query_points * num_regions. The difference is whether the base-rule is run in parallel across all regions or only within each region: this example runs the acquisition sequentially, one region at a time, whereas the batch-trust-region case performed the acquisition in parallel across all regions. Fully parallel acquisition is only supported when using the EfficientGlobalOptimization base-rule without local models.

[11]:
num_regions = 2
num_query_points = 3

turbo_regions = [
    trieste.acquisition.rule.TURBOBox(search_space) for _ in range(num_regions)
]
dts_rule = trieste.acquisition.rule.DiscreteThompsonSampling(
    num_search_space_samples=500, num_query_points=num_query_points
)
turbo_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
    turbo_regions, dts_rule
)

bo = trieste.bayesian_optimizer.BayesianOptimizer(observer, search_space)

num_steps = 5
result = bo.optimize(
    num_steps,
    {trieste.observer.OBJECTIVE: initial_data},
    trieste.acquisition.utils.copy_to_local_models(build_model(), num_regions),
    turbo_acq_rule,
    track_state=True,
)
dataset = result.try_get_final_dataset()
Optimization completed without errors
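
One last hedged sanity check of the point-count arithmetic above: every step adds num_query_points points from each of the num_regions regions, so the final dataset should hold 10 + 5 * 2 * 3 = 40 points.

expected = num_initial_data_points + num_steps * num_regions * num_query_points
assert dataset.query_points.shape[0] == expected  # 40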

Visualizing TuRBO results#

We display the results as earlier.

[12]:
plot_final_result(dataset)
[Figure: Branin contours with the initial points (crosses), new query points (dots) and the best point found (purple dot)]
[13]:
plot_history(result, num_regions * num_query_points)

LICENSE#

Apache License 2.0