trieste.acquisition.optimizer
This module contains functionality for optimizing AcquisitionFunctions over SearchSpaces.
Module Contents
- NUM_SAMPLES_MIN: int = 5000
The default minimum number of initial samples for the generate_continuous_optimizer() and generate_random_search_optimizer() functions, used for determining the number of initial samples in the multi-start acquisition function optimization.
- NUM_SAMPLES_DIM: int = 1000
The default minimum number of initial samples per dimension of the search space for the generate_continuous_optimizer() function in automatic_optimizer_selector(), used for determining the number of initial samples in the multi-start acquisition function optimization.
- NUM_RUNS_DIM: int = 10
The default minimum number of optimization runs per dimension of the search space for the generate_continuous_optimizer() function in automatic_optimizer_selector(), used for determining the number of acquisition function optimizations to be performed in parallel.
- exception FailedOptimizationError
Bases: Exception
Raised when an acquisition optimizer fails to optimize.
- AcquisitionOptimizer
Type alias for a function that returns the single point that maximizes an acquisition function over a search space, or the V points that maximize a vectorized acquisition function (as represented by an acquisition-int tuple).
If this function receives a search space with points of shape [D] and an acquisition function with input shape […, 1, D] and output shape […, 1], the AcquisitionOptimizer return shape should be [1, D]. If instead it receives a search space and a tuple containing the acquisition function and its vectorization V, then the AcquisitionOptimizer return shape should be [V, D].
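As an illustration, the following is a minimal sketch of a function satisfying this alias in the non-vectorized case. It is not the module's implementation; the function name and the toy sample size are assumptions:

    import tensorflow as tf
    from trieste.space import SearchSpace
    from trieste.types import TensorType

    def random_candidate_optimizer(space: SearchSpace, target_func) -> TensorType:
        """A toy AcquisitionOptimizer: evaluate random samples and keep the best."""
        points = space.sample(1000)               # [1000, D]
        values = target_func(points[:, None, :])  # [1000, 1], input shape [..., 1, D]
        best = tf.argmax(values[:, 0])            # index of the maximizer
        return tf.gather(points, best)[None, :]   # [1, D]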
- automatic_optimizer_selector(space: trieste.space.SearchSpace, target_func: trieste.acquisition.interface.AcquisitionFunction | Tuple[trieste.acquisition.interface.AcquisitionFunction, int]) -> trieste.types.TensorType
A wrapper around our AcquisitionOptimizers. This function applies an AcquisitionOptimizer appropriate for the problem's SearchSpace.
- Parameters:
space – The space of points over which to search, for points with shape [D].
target_func – The function to maximise, with input shape […, 1, D] and output shape […, 1].
- Returns:
The batch of points in space that maximises target_func, with shape [1, D].
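For instance (an illustrative sketch; the quadratic target_func is a stand-in for a real acquisition function):

    import tensorflow as tf
    from trieste.acquisition.optimizer import automatic_optimizer_selector
    from trieste.space import Box

    def target_func(x: tf.Tensor) -> tf.Tensor:
        # Toy acquisition function: input shape [..., 1, D], output shape [..., 1].
        return -tf.reduce_sum((x - 0.3) ** 2, axis=-1)

    space = Box([0.0, 0.0], [1.0, 1.0])
    point = automatic_optimizer_selector(space, target_func)  # shape [1, 2], near [[0.3, 0.3]]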
- optimize_discrete(space: trieste.space.GeneralDiscreteSearchSpace, target_func: trieste.acquisition.interface.AcquisitionFunction | Tuple[trieste.acquisition.interface.AcquisitionFunction, int]) trieste.types.TensorType [source]#
An
AcquisitionOptimizer
for :class:’GeneralDiscreteSearchSpace’ spaces.When this functions receives an acquisition-integer tuple as its target_func,it evaluates all the points in the search space for each of the individual V functions making up target_func.
- Parameters:
space – The space of points over which to search, for points with shape [D].
target_func – The function to maximise, with input shape […, V, D] and output shape […, V].
- Returns:
The V points in
space
that maximisestarget_func
, with shape [V, D].
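A sketch of direct usage, reusing the toy target_func from the sketch above:

    import tensorflow as tf
    from trieste.acquisition.optimizer import optimize_discrete
    from trieste.space import DiscreteSearchSpace

    space = DiscreteSearchSpace(tf.constant([[-1.0], [0.0], [0.5], [2.0]]))
    best = optimize_discrete(space, target_func)  # shape [1, 1]; the candidate nearest 0.3, i.e. [[0.5]]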
- InitialPointSampler
Type alias for a function that returns initial point candidates for an optimization. Candidates are returned in one or more batches, and each batch should have the shape [N, D], even when N=1.
For simplicity and memory usage, it is recommended to define these as generators. For example, the following initial point sampler returns both a set of pre-optimized points and 50,000 random samples:

    def sampler(space: SearchSpace) -> Iterable[TensorType]:
        yield pre_optimized_points
        yield space.sample(50_000)

While the following does the same but groups the random samples into batches of size 1,000 to conserve memory:

    def sampler(space: SearchSpace) -> Iterable[TensorType]:
        yield pre_optimized_points
        yield from sample_from_space(50_000, batch_size=1_000)(space)
- sample_from_space(num_samples: int, batch_size: int | None = None, vectorization: int = 1) -> InitialPointSampler
An initial point sampler that just samples from the search space.
- Parameters:
num_samples – Number of samples to return.
batch_size – If specified, points are returned in batches of this size, to reduce memory usage.
vectorization – Vectorization of the target function.
- generate_initial_points(num_initial_points: int, initial_sampler: InitialPointSampler, space: trieste.space.SearchSpace, target_func: trieste.acquisition.interface.AcquisitionFunction, vectorization: int = 1) -> trieste.types.TensorType
Return the best starting points for an optimization from those generated by a given sampler.
- Parameters:
num_initial_points – Number of best starting points to return.
initial_sampler – Initial point sampler.
space – Search space.
target_func – Target function being optimized.
vectorization – Vectorization of the target function.
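A sketch combining this with sample_from_space(), assuming the toy space and target_func from the earlier sketches:

    from trieste.acquisition.optimizer import generate_initial_points, sample_from_space

    initial_points = generate_initial_points(
        num_initial_points=5,
        initial_sampler=sample_from_space(10_000, batch_size=1_000),
        space=space,
        target_func=target_func,
    )
    # The 5 best of 10,000 random candidates, sampled in batches of 1,000.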
- generate_continuous_optimizer(num_initial_samples: int | InitialPointSampler = NUM_SAMPLES_MIN, num_optimization_runs: int = 10, num_recovery_runs: int = 10, optimizer_args: dict[str, Any] | None = None) -> AcquisitionOptimizer[trieste.space.Box | trieste.space.CollectionSearchSpace]
Generate a gradient-based optimizer for Box and CollectionSearchSpace spaces and batches of size 1. In the case of a CollectionSearchSpace, we perform gradient-based optimization across all Box subspaces, starting from the best location found across a sample of num_initial_samples random points.
We advise the user to either use the default NUM_SAMPLES_MIN for num_initial_samples, or NUM_SAMPLES_DIM times the dimensionality of the search space, whichever is greater. Similarly, for num_optimization_runs, we recommend using NUM_RUNS_DIM times the dimensionality of the search space.
This optimizer uses Scipy's L-BFGS-B optimizer. We run num_optimization_runs separate optimizations in parallel, each starting from one of the best num_optimization_runs initial query points.
If all num_optimization_runs optimizations fail to converge, then we run up to num_recovery_runs additional runs starting from random locations (also run in parallel).
Note: using a large num_initial_samples and num_optimization_runs with a high-dimensional search space can consume a large amount of CPU memory (RAM).
- Parameters:
num_initial_samples – The starting point(s) of the optimization. This can be either the number of random samples to use, or a function that, given the search space, returns the points to use. The latter can be used, for example, to add pre-optimized starting points to the random points, as well as to batch point generation to reduce memory usage for high-dimensional problems.
num_optimization_runs – The number of separate optimizations to run.
num_recovery_runs – The maximum number of recovery optimization runs in case of failure.
optimizer_args – The keyword arguments to pass to the Scipy L-BFGS-B optimizer. Check the minimize method of scipy.optimize for details of which arguments can be passed. Note that the method, jac and bounds arguments cannot/should not be changed.
- Returns:
The acquisition optimizer.
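A sizing sketch following the advice above (space and target_func as in the earlier sketches):

    from trieste.acquisition.optimizer import (
        NUM_RUNS_DIM,
        NUM_SAMPLES_DIM,
        NUM_SAMPLES_MIN,
        generate_continuous_optimizer,
    )

    D = space.lower.shape[-1]  # search space dimensionality (2 for the toy Box)
    optimizer = generate_continuous_optimizer(
        num_initial_samples=max(NUM_SAMPLES_MIN, NUM_SAMPLES_DIM * D),
        num_optimization_runs=NUM_RUNS_DIM * D,
    )
    point = optimizer(space, target_func)  # shape [1, D]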
- _perform_parallel_continuous_optimization(target_func: trieste.acquisition.interface.AcquisitionFunction, space: trieste.space.SearchSpace, starting_points: trieste.types.TensorType, optimizer_args: dict[str, Any]) -> Tuple[trieste.types.TensorType, trieste.types.TensorType, trieste.types.TensorType, trieste.types.TensorType]
A function to perform parallel optimization of our acquisition functions using Scipy. We perform L-BFGS-B starting from each of the locations contained in starting_points, i.e. the number of individual optimization runs is given by the leading dimension of starting_points.
To provide a parallel implementation of Scipy's L-BFGS-B that can leverage batch calculations with TensorFlow, this function uses the Greenlet package to run each individual optimization on micro-threads.
L-BFGS-B updates for each individual optimization are performed by independent greenlets working with NumPy arrays; however, the evaluation of our acquisition function (and its gradients) is calculated in parallel (for each optimization step) using TensorFlow.
For a CollectionSearchSpace, we only apply gradient updates to its Box subspaces, fixing the discrete elements to the best values found across the initial random search. To fix these discrete elements, we optimize over a continuous Box relaxation of the discrete subspaces which has equal upper and lower bounds, i.e. we specify an equality constraint for this dimension in the Scipy optimizer.
This function also supports the maximization of vectorized target functions (with vectorization V).
- Parameters:
target_func – The function(s) to maximise, with input shape […, V, D] and output shape […, V].
space – The original search space.
starting_points – The points at which to begin our optimizations, of shape [num_optimization_runs, V, D]. The leading dimension of starting_points controls the number of individual optimization runs for each of the V target functions.
optimizer_args – Keyword arguments to pass to the Scipy optimizer.
- Returns:
A tuple containing the failure statuses, maximum values, maximisers and number of evaluations for each of our optimizations.
- class ScipyOptimizerGreenlet
Bases: greenlet.greenlet
Worker greenlet that runs a single Scipy optimization (L-BFGS-B by default). Each greenlet performs all the optimizer update steps required for an individual optimization. However, the evaluation of our acquisition function (and its gradients) is delegated back to the main TensorFlow process (the parent greenlet), where evaluations can be made efficiently in parallel.
- get_bounds_of_box_relaxation_around_point(space: trieste.space.TaggedProductSearchSpace, current_point: trieste.types.TensorType) -> scipy.optimize.Bounds
A function to return the bounds of a continuous relaxation of a TaggedProductSearchSpace, i.e. replacing discrete spaces with continuous spaces. In particular, all DiscreteSearchSpace subspaces are replaced with a new DiscreteSearchSpace fixed at their respective component of the specified current_point. Note that all Box subspaces remain the same.
- Parameters:
space – The original search space.
current_point – The point at which to make the continuous relaxation.
- Returns:
Bounds for the Scipy optimizer.
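For example (an illustrative sketch; the expected bounds follow from fixing the discrete dimension at its component of current_point):

    import tensorflow as tf
    from trieste.acquisition.optimizer import get_bounds_of_box_relaxation_around_point
    from trieste.space import Box, DiscreteSearchSpace, TaggedProductSearchSpace

    space = TaggedProductSearchSpace(
        [Box([0.0], [1.0]), DiscreteSearchSpace(tf.constant([[2.0], [5.0]]))]
    )
    current_point = tf.constant([[0.5, 5.0]])  # shape [1, D]
    bounds = get_bounds_of_box_relaxation_around_point(space, current_point)
    # Expected: bounds.lb == [0.0, 5.0] and bounds.ub == [1.0, 5.0],
    # i.e. the discrete dimension is pinned to 5.0.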
- get_bounds_of_optimization(space: trieste.space.SearchSpace, starting_points: trieste.types.TensorType) -> List[scipy.optimize.Bounds]
Returns a list of bounds for all the optimization runs, to be used in the Scipy optimizer. The bounds are based on the provided search space and the starting points. The length of the list is equal to the total number of individual optimization runs, i.e. the number of starting points: num_optimization_runs x V.
For each starting point, the bounds are obtained as follows, depending on the type of search space:
- For a TaggedProductSearchSpace, a "continuous relaxation" of the discrete subspaces is built by creating bounds around the point. The discrete components of the created search space are fixed at the respective component of the point, and the remaining continuous components are set to the original bounds of the search space.
- For a TaggedMultiSearchSpace, the bounds for each point are obtained in a similar manner as above, but based potentially on a different subspace for each point, instead of sharing a single set of bounds from the common search space. The subspaces are assigned to the optimization runs in a round-robin fashion along dimension 1 (of size V) of the starting points. An error is raised if V is not a multiple of the number of subspaces.
- For other types of search spaces, the original bounds are used for each optimization run.
- Parameters:
space – The original search space.
starting_points – The points at which to begin the optimizations, with shape [num_optimization_runs, V, D]. The leading dimension of starting_points controls the number of individual optimization runs for each of the V starting points or target functions.
- Returns:
A list of bounds for the Scipy optimizer. The length of the list is equal to the number of individual optimization runs, i.e. num_optimization_runs x V.
For example, for a 2D TaggedMultiSearchSpace with two subspaces, each with a continuous and a discrete component:

    space = TaggedMultiSearchSpace(
        [
            TaggedProductSearchSpace(
                [Box([0], [1]), DiscreteSearchSpace([[11], [15], [21], [25], [31], [35]])]
            ),
            TaggedProductSearchSpace(
                [Box([2], [3]), DiscreteSearchSpace([[13], [17], [23], [27], [33], [37]])]
            ),
        ]
    )

Consider 2 optimization runs per point and a vectorization V of 4, for a total of 8 optimization runs. Given the following starting points:

    starting_points = tf.constant(
        [
            [[10, 11], [12, 13], [14, 15], [16, 17]],
            [[20, 21], [22, 23], [24, 25], [26, 27]],
        ],
        dtype=tf.float64,
    )

The returned list of bounds for the optimization would be:

    [
        spo.Bounds([0, 11], [1, 11]),  # for point at index [0, 0], using subspace 0
        spo.Bounds([2, 13], [3, 13]),  # for point at index [0, 1], using subspace 1
        spo.Bounds([0, 15], [1, 15]),  # for point at index [0, 2], using subspace 0
        spo.Bounds([2, 17], [3, 17]),  # for point at index [0, 3], using subspace 1
        spo.Bounds([0, 21], [1, 21]),  # for point at index [1, 0], using subspace 0
        spo.Bounds([2, 23], [3, 23]),  # for point at index [1, 1], using subspace 1
        spo.Bounds([0, 25], [1, 25]),  # for point at index [1, 2], using subspace 0
        spo.Bounds([2, 27], [3, 27]),  # for point at index [1, 3], using subspace 1
    ]
- batchify_joint(batch_size_one_optimizer: AcquisitionOptimizer[trieste.space.SearchSpaceType], batch_size: int) -> AcquisitionOptimizer[trieste.space.SearchSpaceType]
A wrapper around our AcquisitionOptimizers. This function wraps an AcquisitionOptimizer to allow it to jointly optimize the batch elements considered by a batch acquisition function.
- Parameters:
batch_size_one_optimizer – An optimizer that returns only batch size one, i.e. produces a single point with shape [1, D].
batch_size – The number of points in the batch.
- Returns:
An AcquisitionOptimizer that will provide a batch of points with shape [B, D].
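A usage sketch, where batch_acquisition is an assumed batch acquisition function with input shape […, 3, D]:

    from trieste.acquisition.optimizer import batchify_joint, generate_continuous_optimizer

    joint_optimizer = batchify_joint(generate_continuous_optimizer(), batch_size=3)
    batch = joint_optimizer(space, batch_acquisition)  # shape [3, D]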
- batchify_vectorize(batch_size_one_optimizer: AcquisitionOptimizer[trieste.space.SearchSpaceType], batch_size: int) -> AcquisitionOptimizer[trieste.space.SearchSpaceType]
A wrapper around our AcquisitionOptimizers. This function wraps an AcquisitionOptimizer to allow it to optimize batch acquisition functions. Unlike batchify_joint(), batchify_vectorize() is suitable for an AcquisitionFunction whose individual batch elements can be optimized independently (i.e. they can be vectorized).
- Parameters:
batch_size_one_optimizer – An optimizer that returns only batch size one, i.e. produces a single point with shape [1, D].
batch_size – The number of points in the batch.
- Returns:
An AcquisitionOptimizer that will provide a batch of points with shape [V, D].
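A usage sketch, where vectorized_acquisition is an assumed acquisition function whose batch elements (input shape […, 4, D]) can be optimized independently:

    from trieste.acquisition.optimizer import batchify_vectorize, generate_continuous_optimizer

    vectorized_optimizer = batchify_vectorize(generate_continuous_optimizer(), batch_size=4)
    batch = vectorized_optimizer(space, vectorized_acquisition)  # shape [4, D]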
- generate_random_search_optimizer(num_samples: int = NUM_SAMPLES_MIN) -> AcquisitionOptimizer[trieste.space.SearchSpace]
Generate an acquisition optimizer that samples num_samples random points across the space. The default is to sample at NUM_SAMPLES_MIN locations.
We advise the user to either use the default NUM_SAMPLES_MIN for num_samples, or NUM_SAMPLES_DIM times the dimensionality of the search space, whichever is greater.
- Parameters:
num_samples – The number of random points to sample.
- Returns:
The acquisition optimizer.
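A sizing sketch following the advice above (space and target_func as in the earlier sketches):

    from trieste.acquisition.optimizer import (
        NUM_SAMPLES_DIM,
        NUM_SAMPLES_MIN,
        generate_random_search_optimizer,
    )

    D = space.lower.shape[-1]
    rs_optimizer = generate_random_search_optimizer(
        num_samples=max(NUM_SAMPLES_MIN, NUM_SAMPLES_DIM * D)
    )
    point = rs_optimizer(space, target_func)  # shape [1, D]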