trieste.models.gpflux#

This package contains the primary interface for deep Gaussian process models. It also contains a number of TrainableProbabilisticModel wrappers for GPflux-based models.

Package Contents#

build_vanilla_deep_gp(data: trieste.data.Dataset, search_space: trieste.space.SearchSpace, num_layers: int = NUM_LAYERS, num_inducing_points: int | None = None, inner_layer_sqrt_factor: float = INNER_LAYER_SQRT_FACTOR, likelihood_variance: float = LIKELIHOOD_VARIANCE, trainable_likelihood: bool = True) gpflux.models.DeepGP[source]#

Build a DeepGP model with sensible initial parameters. We found the default configuration used here to work well in most situations, but it should not be taken as a universally good solution.

Note that although we set all the relevant parameters to sensible values, we rely on build_constant_input_dim_deep_gp from architectures to build the model.

Parameters:
  • data – Dataset from the initial design, used to estimate the variance of observations and to provide query points which are used to determine inducing point locations with k-means.

  • search_space – Search space for performing Bayesian optimization. Used for initialization of inducing locations if num_inducing_points is larger than the number of data points.

  • num_layers – Number of layers in deep GP. By default set to NUM_LAYERS.

  • num_inducing_points – Number of inducing points to use in each layer. If left unspecified (default), this number is set to either NUM_INDUCING_POINTS_PER_DIM * dimensionality of the search space or the value given by MAX_NUM_INDUCING_POINTS, whichever is smaller.

  • inner_layer_sqrt_factor – A multiplicative factor used to rescale hidden layers, see Config for details. By default set to INNER_LAYER_SQRT_FACTOR.

  • likelihood_variance – Initial noise variance in the likelihood function, see Config for details. By default set to LIKELIHOOD_VARIANCE.

  • trainable_likelihood – Whether the likelihood variance is trainable.

Returns:

A DeepGP model with sensible default settings.

Raises:

If non-positive num_layers, inner_layer_sqrt_factor, likelihood_variance or num_inducing_points is provided.
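The default rule for num_inducing_points described in the parameter list above can be sketched in plain Python. This is only an illustration: default_num_inducing_points is a hypothetical helper, and the constant values below are placeholders rather than the values defined by the library.

```python
def default_num_inducing_points(search_space_dim: int, per_dim: int, maximum: int) -> int:
    """Return the smaller of per_dim * dimensionality and maximum."""
    if search_space_dim <= 0 or per_dim <= 0 or maximum <= 0:
        raise ValueError("all arguments must be positive")
    return min(per_dim * search_space_dim, maximum)


# Placeholder constants (assumed values, for illustration only).
PER_DIM = 25
MAXIMUM = 500

low_dim = default_num_inducing_points(2, PER_DIM, MAXIMUM)    # grows with dimension
high_dim = default_num_inducing_points(40, PER_DIM, MAXIMUM)  # capped at the maximum
```

With these placeholder values the count grows linearly with the dimensionality until the cap is reached, after which it stays constant.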

class GPfluxPredictor(optimizer: trieste.models.optimizer.KerasOptimizer | None = None, encoder: trieste.space.EncoderFunction | None = None)[source]#

Bases: trieste.models.interfaces.SupportsGetObservationNoise, trieste.models.interfaces.EncodedSupportsPredictY, abc.ABC

A trainable wrapper for a GPflux deep Gaussian process model. The code assumes subclasses will use the Keras fit method for training, and so they should provide access to both a model_keras and model_gpflux.

Parameters:
  • optimizer – The optimizer wrapper containing the optimizer with which to train the model and arguments for the wrapper and the optimizer. The optimizer must be an instance of an Optimizer. Defaults to Adam optimizer with 0.01 learning rate.

  • encoder – Optional encoder with which to transform query points before generating predictions.

property encoder: trieste.space.EncoderFunction | None#

Query point encoder.

property model_gpflux: gpflow.base.Module#
Abstractmethod:

The underlying GPflux model.

property model_keras: gpflow.keras.tf_keras.Model#
Abstractmethod:

Returns the compiled Keras model for training.

property optimizer: trieste.models.optimizer.KerasOptimizer#

The optimizer wrapper for training the model.

predict_encoded(query_points: trieste.types.TensorType) tuple[trieste.types.TensorType, trieste.types.TensorType][source]#

Note: unless otherwise noted, this returns the mean and variance of the last layer conditioned on one sample from the previous layers.

abstract sample_encoded(query_points: trieste.types.TensorType, num_samples: int) trieste.types.TensorType[source]#

Implementation of sample on encoded query points.

predict_y_encoded(query_points: trieste.types.TensorType) tuple[trieste.types.TensorType, trieste.types.TensorType][source]#

Note: unless otherwise noted, this will return the prediction conditioned on one sample from the lower layers.

get_observation_noise() trieste.types.TensorType[source]#

Return the variance of observation noise for homoscedastic likelihoods.

Returns:

The observation noise.

Raises:

NotImplementedError – If the model does not have a homoscedastic likelihood.

class DeepGaussianProcess(model: gpflux.models.DeepGP | Callable[[], gpflux.models.DeepGP], optimizer: trieste.models.optimizer.KerasOptimizer | None = None, num_rff_features: int = 1000, continuous_optimisation: bool = True, compile_args: Mapping[str, Any] | None = None, encoder: trieste.space.EncoderFunction | None = None)[source]#

Bases: trieste.models.gpflux.interface.GPfluxPredictor, trieste.models.interfaces.EncodedTrainableProbabilisticModel, trieste.models.interfaces.HasReparamSampler, trieste.models.interfaces.HasTrajectorySampler

A TrainableProbabilisticModel wrapper for a GPflux DeepGP built from GPLayer or LatentVariableLayer layers. Other layer types, such as Keras layers, are not supported. We provide simple architectures that can be used with this class in the architectures.py file.

Parameters:
  • model – The underlying GPflux deep Gaussian process model. Passing in a named closure rather than a model can help when copying or serialising.

  • optimizer – The optimizer wrapper with necessary specifications for compiling and training the model. Defaults to KerasOptimizer with Adam optimizer, mean squared error metric and a dictionary of default arguments for the Keras fit method: 400 epochs, batch size of 1000, and verbose 0. A custom callback that reduces the optimizer learning rate is used as well. See https://keras.io/api/models/model_training_apis/#fit-method for a list of possible arguments.

  • num_rff_features – The number of random Fourier features used to approximate the kernel when calling trajectory_sampler(). We use a default of 1000 as it typically performs well for a wide range of kernels. Note that very smooth kernels (e.g. RBF) can be well-approximated with fewer features.

  • continuous_optimisation – If True (default), the optimizer will keep track of the number of epochs across BO iterations and use this number as initial_epoch. This is essential to allow monitoring of model training across BO iterations.

  • compile_args – Keyword arguments to pass to the compile method of the Keras model (Model). See https://keras.io/api/models/model_training_apis/#compile-method for a list of possible arguments. The optimizer and metrics arguments must not be included.

  • encoder – Optional encoder with which to transform query points before generating predictions.

Raises:

ValueError – If model has unsupported layers, num_rff_features is less than 0, if the optimizer is not of a supported type, or compile_args contains disallowed arguments.
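The effect of continuous_optimisation can be illustrated with a small stand-alone sketch. It assumes the Keras convention that fit trains from initial_epoch up to (but not including) epochs. EpochTracker and fake_fit are hypothetical stand-ins written for this example, not part of the library.

```python
from typing import Dict, List


def fake_fit(epochs: int, initial_epoch: int) -> Dict[str, List[float]]:
    """Stand-in for Keras fit: runs epochs initial_epoch .. epochs - 1."""
    return {"loss": [1.0 / (e + 1) for e in range(initial_epoch, epochs)]}


class EpochTracker:
    """Hypothetical helper that carries the epoch count across BO steps."""

    def __init__(self, epochs_per_call: int, continuous: bool = True) -> None:
        self.epochs_per_call = epochs_per_call
        self.continuous = continuous
        self.absolute_epochs = 0

    def optimize(self) -> range:
        start = self.absolute_epochs
        # Shift the final epoch index so each call still runs epochs_per_call epochs.
        history = fake_fit(epochs=start + self.epochs_per_call, initial_epoch=start)
        n_run = len(history["loss"])
        if self.continuous:
            self.absolute_epochs += n_run
        return range(start, start + n_run)


tracker = EpochTracker(epochs_per_call=400)
first = tracker.optimize()   # covers epochs 0..399
second = tracker.optimize()  # covers epochs 400..799
```

Because the epoch index keeps increasing, training curves from successive Bayesian optimization steps line up on one axis instead of restarting at zero.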

__repr__() str[source]#

Return repr(self).

property model_gpflux: gpflux.models.DeepGP#

The underlying GPflux model.

property model_keras: gpflow.keras.tf_keras.Model#

Returns the compiled Keras model for training.

sample_encoded(query_points: trieste.types.TensorType, num_samples: int) trieste.types.TensorType[source]#

Implementation of sample on encoded query points.

reparam_sampler(num_samples: int) trieste.models.interfaces.ReparametrizationSampler[trieste.models.gpflux.interface.GPfluxPredictor][source]#

Return a reparametrization sampler for a DeepGaussianProcess model.

Parameters:

num_samples – The number of samples to obtain.

Returns:

The reparametrization sampler.

trajectory_sampler() trieste.models.interfaces.TrajectorySampler[trieste.models.gpflux.interface.GPfluxPredictor][source]#

Return a trajectory sampler. For DeepGaussianProcess, we build trajectories using the GPflux default sampler.

Returns:

The trajectory sampler.

update_encoded(dataset: trieste.data.Dataset) None[source]#

Implementation of update on the encoded dataset.

optimize_encoded(dataset: trieste.data.Dataset) gpflow.keras.tf_keras.callbacks.History[source]#

Optimize the model with the specified dataset.

Parameters:

dataset – The data with which to optimize the model.

log(dataset: trieste.data.Dataset | None = None) None[source]#

Log model training information at a given optimization step to TensorBoard. We log a few summary statistics of losses, layer KL divergences and metrics (as provided in optimizer): the final value at the end of training, and the diff value, which is the difference between the initial and final epoch. We also log epoch statistics, but as histograms rather than time series. In addition, we log several metrics based on the training data, such as the root mean square error between predictions and observations.

For custom logs, users will need to subclass the model and override this method.

Parameters:

dataset – Optional data that can be used to log additional data-based model summaries.
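The "final" and "diff" summaries mentioned above can be sketched for a Keras-style history dictionary. The tag names and the sign convention (initial minus final) are assumptions made for this illustration, and summarise_history is a hypothetical helper.

```python
from typing import Dict, List


def summarise_history(history: Dict[str, List[float]]) -> Dict[str, float]:
    """Reduce each per-epoch series to a final value and an initial-to-final difference."""
    summary: Dict[str, float] = {}
    for name, values in history.items():
        summary[f"{name}/final"] = values[-1]
        # Assumed sign convention: a decreasing series gives a positive diff.
        summary[f"{name}/diff"] = values[0] - values[-1]
    return summary


history = {"loss": [4.0, 2.5, 1.0], "mse": [0.5, 0.25, 0.125]}
summary = summarise_history(history)
```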

class DeepGaussianProcessDecoupledLayer(model: trieste.models.gpflux.interface.GPfluxPredictor, layer_number: int, num_features: int = 1000)[source]#

Bases: abc.ABC

Layer that samples an approximate decoupled trajectory for a GPflux GPLayer using Matheron’s rule ([WBT+20]). Note that the only multi-output kernel that is supported is a SharedIndependent kernel.

Parameters:
  • model – The model to sample from.

  • layer_number – The index of the layer that we wish to sample from.

  • num_features – The number of features to use in the random feature approximation.

Raises:

ValueError (or InvalidArgumentError) – If the layer is not a GPLayer, the layer’s kernel is not supported, or if num_features is not positive.

__call__(x: trieste.types.TensorType) trieste.types.TensorType[source]#

Evaluate trajectory function for layer at input.

Parameters:

x – Input location with shape [N, B, D], where N is the number of points, B is the batch dimension, and D is the input dimensionality.

Returns:

Trajectory for the layer evaluated at the input, with shape [N, B, P], where P is the number of latent GPs in the layer.

Raises:

InvalidArgumentError – If the provided batch size does not match the layer’s batch size.

resample() None[source]#

Efficiently resample in-place without retracing.

update() None[source]#

Efficiently update the trajectory with a new weight distribution and resample its weights.

_prepare_weight_sampler() Callable[[int], trieste.types.TensorType][source]#

Prepare the sampler function that provides samples of the feature weights for both the RFF and canonical feature functions, i.e. we return a function that takes in a batch size B and returns B samples for the weights of each of the L RFF features and M canonical features for P outputs.

class DeepGaussianProcessDecoupledTrajectorySampler(model: trieste.models.gpflux.interface.GPfluxPredictor, num_features: int = 1000)[source]#

Bases: trieste.models.interfaces.TrajectorySampler[trieste.models.gpflux.interface.GPfluxPredictor]

This sampler employs decoupled sampling (see [WBT+20]) to build functions that approximate a trajectory sampled from an underlying deep Gaussian process model. In particular, this sampler provides trajectory functions for GPfluxPredictors with underlying DeepGP models by using a feature decomposition using both random Fourier features and canonical features centered at inducing point locations. This allows for cheap approximate trajectory samples, as opposed to exact trajectory sampling, which scales cubically in the number of query points.

Parameters:
  • model – The model to sample from.

  • num_features – The number of random Fourier features to use.

Raises:

ValueError (or InvalidArgumentError) – If the model is not a GPfluxPredictor, or its underlying model_gpflux is not a DeepGP, or num_features is not positive.
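For a single-output GP layer, the decoupled sampling idea of [WBT+20] can be summarised roughly as below. The notation is introduced here for illustration only: \phi_i are the L random Fourier features, w_i are standard normal weights, z_j are the M inducing point locations, \mathbf{u} is a sample of the inducing variables, and \boldsymbol{\Phi}_Z is the M-by-L matrix of Fourier features evaluated at the inducing points. Details such as whitening or multiple outputs are omitted.

```latex
f(\cdot) \mid \mathbf{u}
  \;\approx\;
  \underbrace{\sum_{i=1}^{L} w_i \, \phi_i(\cdot)}_{\text{prior sample}}
  \;+\;
  \underbrace{\sum_{j=1}^{M} v_j \, k(\cdot, z_j)}_{\text{update}},
\qquad
\mathbf{v} = \mathbf{K}_{ZZ}^{-1} \left( \mathbf{u} - \boldsymbol{\Phi}_Z \mathbf{w} \right),
\qquad
w_i \sim \mathcal{N}(0, 1).
```

Evaluating such a sample at N query points costs time linear in N once the weights are fixed, which is what makes the trajectories cheap compared with exact joint sampling.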

get_trajectory() trieste.models.interfaces.TrajectoryFunction[source]#

Generate an approximate function draw (trajectory) from the deep GP model.

Returns:

A trajectory function representing an approximate trajectory from the deep GP, taking an input of shape [N, B, D] and returning shape [N, B, L].

update_trajectory(trajectory: trieste.models.interfaces.TrajectoryFunction) trieste.models.interfaces.TrajectoryFunction[source]#

Efficiently update a TrajectoryFunction to reflect an update in its underlying ProbabilisticModel and resample accordingly.

Parameters:

trajectory – The trajectory function to be updated and resampled.

Returns:

The updated and resampled trajectory function.

Raises:

InvalidArgumentError – If trajectory is not a dgp_feature_decomposition_trajectory.

resample_trajectory(trajectory: trieste.models.interfaces.TrajectoryFunction) trieste.models.interfaces.TrajectoryFunction[source]#

Efficiently resample a TrajectoryFunction in-place to avoid function retracing with every new sample.

Parameters:

trajectory – The trajectory function to be resampled.

Returns:

The new resampled trajectory function.

Raises:

InvalidArgumentError – If trajectory is not a dgp_feature_decomposition_trajectory.

class DeepGaussianProcessReparamSampler(sample_size: int, model: trieste.models.gpflux.interface.GPfluxPredictor)[source]#

Bases: trieste.models.interfaces.ReparametrizationSampler[trieste.models.gpflux.interface.GPfluxPredictor]

This sampler employs the reparameterization trick to approximate samples from a GPfluxPredictor’s predictive distribution, when the GPfluxPredictor has an underlying DeepGP.

Parameters:
  • sample_size – The number of samples for each batch of points. Must be positive.

  • model – The model to sample from.

Raises:

ValueError (or InvalidArgumentError) – If sample_size is not positive, if the model is not a GPfluxPredictor, or if its underlying model_gpflux is not a DeepGP.

sample(at: trieste.types.TensorType, *, jitter: float = DEFAULTS.JITTER) trieste.types.TensorType[source]#

Return approximate samples from the model specified at __init__(). Multiple calls to sample(), for any given DeepGaussianProcessReparamSampler and at, will produce the exact same samples. Calls to sample() on different DeepGaussianProcessReparamSampler instances will produce different samples.

Parameters:
  • at – Where to sample the predictive distribution, with shape […, 1, D], for points of dimension D.

  • jitter – The size of the jitter to use when stabilizing the Cholesky decomposition of the covariance matrix.

Returns:

The samples, of shape […, S, 1, L], where S is the sample_size and L is the number of latent model dimensions.

Raises:

ValueError (or InvalidArgumentError) – If at has an invalid shape or jitter is negative.
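The repeatability property of sample() follows from the reparameterization trick: the base noise is drawn once and reused, so only the mean and variance change between calls. A minimal scalar sketch, assuming a single Gaussian prediction rather than a stack of layers, is shown here. FixedNoiseSampler is a hypothetical class written for this example.

```python
import math
import random
from typing import List, Optional


class FixedNoiseSampler:
    """Toy reparameterization sampler for a scalar Gaussian prediction."""

    def __init__(self, sample_size: int, rng: random.Random) -> None:
        if sample_size <= 0:
            raise ValueError("sample_size must be positive")
        self._sample_size = sample_size
        self._rng = rng
        self._eps: Optional[List[float]] = None  # drawn on first use, then reused

    def sample(self, mean: float, variance: float) -> List[float]:
        if self._eps is None:
            self._eps = [self._rng.gauss(0.0, 1.0) for _ in range(self._sample_size)]
        std = math.sqrt(variance)
        # Samples are a deterministic function of (mean, variance) given the fixed noise.
        return [mean + std * e for e in self._eps]


rng = random.Random(0)
sampler_a = FixedNoiseSampler(3, rng)
sampler_b = FixedNoiseSampler(3, rng)
a1 = sampler_a.sample(1.0, 4.0)
a2 = sampler_a.sample(1.0, 4.0)  # same sampler, same input: identical samples
b1 = sampler_b.sample(1.0, 4.0)  # different sampler: different noise
```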

class ResampleableDecoupledDeepGaussianProcessFeatureFunctions(layer: gpflux.layers.GPLayer, n_components: int)[source]#

Bases: gpflux.layers.basis_functions.fourier_features.RandomFourierFeaturesCosine

A wrapper around GPflux’s random Fourier feature function that allows for efficient in-place updating when generating new decompositions. In addition to providing Fourier features, this class concatenates a layer’s Fourier feature expansion with evaluations of the canonical basis functions.

Parameters:
  • layer – The layer that will be approximated by the feature functions.

  • n_components – The number of features.

Raises:

ValueError – If the layer is not a GPLayer.

resample() None[source]#

Resample weights and biases.

__call__(x: trieste.types.TensorType) trieste.types.TensorType[source]#

Evaluate and combine prior basis functions and canonical basis functions at the input.
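The concatenation of Fourier features with canonical basis functions can be sketched for a one-dimensional squared exponential kernel. ToyDecoupledFeatures and rbf_kernel are hypothetical, and the sketch assumes the standard cosine feature map with frequencies drawn from the kernel's spectral density. It is not the GPflux implementation.

```python
import math
import random
from typing import List, Sequence


def rbf_kernel(x: float, z: float, lengthscale: float, variance: float) -> float:
    """Squared exponential kernel k(x, z) for scalar inputs."""
    return variance * math.exp(-0.5 * ((x - z) / lengthscale) ** 2)


class ToyDecoupledFeatures:
    """Hypothetical 1D stand-in: L cosine features followed by M canonical features."""

    def __init__(
        self,
        inducing_points: Sequence[float],
        n_components: int,
        lengthscale: float = 1.0,
        variance: float = 1.0,
        seed: int = 0,
    ) -> None:
        self._z = list(inducing_points)
        self._n = n_components
        self._lengthscale = lengthscale
        self._variance = variance
        self._rng = random.Random(seed)
        self.resample()

    def resample(self) -> None:
        """Redraw frequencies and phases in place; inducing points are untouched."""
        self._omega = [self._rng.gauss(0.0, 1.0 / self._lengthscale) for _ in range(self._n)]
        self._bias = [self._rng.uniform(0.0, 2.0 * math.pi) for _ in range(self._n)]

    def __call__(self, x: float) -> List[float]:
        scale = math.sqrt(2.0 * self._variance / self._n)
        fourier = [scale * math.cos(w * x + b) for w, b in zip(self._omega, self._bias)]
        canonical = [rbf_kernel(x, z, self._lengthscale, self._variance) for z in self._z]
        return fourier + canonical


features = ToyDecoupledFeatures(inducing_points=[0.0, 1.0], n_components=5)
before = features(0.3)
features.resample()
after = features(0.3)
```

Resampling changes only the Fourier part of the output. The canonical part depends on the kernel and inducing points alone, so it is identical before and after.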

class dgp_feature_decomposition_trajectory(model: trieste.models.gpflux.interface.GPfluxPredictor, num_features: int)[source]#

Bases: trieste.models.interfaces.TrajectoryFunctionClass

An approximate sample from a deep Gaussian process’s posterior, where the samples are represented as a finite weighted sum of features. This class essentially takes a list of DeepGaussianProcessDecoupledLayers and iterates through them to sample, update and resample.

Parameters:
  • model – The model to sample from.

  • num_features – The number of random Fourier features to use.

__call__(x: trieste.types.TensorType) trieste.types.TensorType[source]#

Call trajectory function by looping through layers.

Parameters:

x – Input location with shape [N, B, D], where N is the number of points, B is the batch dimension, and D is the input dimensionality.

Returns:

Trajectory samples with shape [N, B, L], where L is the number of outputs.

update() None[source]#

Update the layers with new features and weights.

resample() None[source]#

Resample the layer weights.
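The way a trajectory iterates through its layers for evaluation and resampling can be sketched with scalar stand-ins. ToyLayer and ToyTrajectory are hypothetical. Each toy layer is just a random linear map, whereas the real layers operate on tensors of shape [N, B, D].

```python
import random
from typing import List


class ToyLayer:
    """Hypothetical stand-in for a decoupled layer: a random linear map."""

    def __init__(self, rng: random.Random) -> None:
        self._rng = rng
        self.weight = 0.0
        self.resample()

    def resample(self) -> None:
        self.weight = self._rng.gauss(0.0, 1.0)

    def __call__(self, x: float) -> float:
        return self.weight * x


class ToyTrajectory:
    """Feeds the output of each layer into the next, as a deep GP sample would."""

    def __init__(self, num_layers: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.layers: List[ToyLayer] = [ToyLayer(rng) for _ in range(num_layers)]

    def __call__(self, x: float) -> float:
        for layer in self.layers:
            x = layer(x)
        return x

    def resample(self) -> None:
        for layer in self.layers:
            layer.resample()


trajectory = ToyTrajectory(num_layers=2)
y_first = trajectory(2.0)
y_again = trajectory(2.0)   # a trajectory is a fixed function until resampled
trajectory.resample()
y_resampled = trajectory(2.0)
```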