trieste.models.gpflux
This package contains the primary interface for deep Gaussian process models. It also contains a
number of TrainableProbabilisticModel wrappers for GPflux-based models.
Package Contents
- build_vanilla_deep_gp(data: trieste.data.Dataset, search_space: trieste.space.SearchSpace, num_layers: int = NUM_LAYERS, num_inducing_points: int | None = None, inner_layer_sqrt_factor: float = INNER_LAYER_SQRT_FACTOR, likelihood_variance: float = LIKELIHOOD_VARIANCE, trainable_likelihood: bool = True) -> gpflux.models.DeepGP
Build a DeepGP model with sensible initial parameters. We found the default configuration used here to work well in most situations, but it should not be taken as a universally good solution.
Note that although we set all the relevant parameters to sensible values, we rely on build_constant_input_dim_deep_gp from architectures to build the model.
- Parameters:
data – Dataset from the initial design, used to estimate the variance of observations and to provide query points which are used to determine inducing point locations with k-means.
search_space – Search space for performing Bayesian optimization. Used for initialization of inducing locations if num_inducing_points is larger than the amount of data.
num_layers – Number of layers in the deep GP. By default set to NUM_LAYERS.
num_inducing_points – Number of inducing points to use in each layer. If left unspecified (default), this number is set to either NUM_INDUCING_POINTS_PER_DIM * dimensionality of the search space or the value given by MAX_NUM_INDUCING_POINTS, whichever is smaller.
inner_layer_sqrt_factor – A multiplicative factor used to rescale hidden layers, see Config for details. By default set to INNER_LAYER_SQRT_FACTOR.
likelihood_variance – Initial noise variance in the likelihood function, see Config for details. By default set to LIKELIHOOD_VARIANCE.
trainable_likelihood – Whether the likelihood variance is trainable.
- Returns:
A DeepGP model with sensible default settings.
- Raises:
If a non-positive num_layers, inner_layer_sqrt_factor, likelihood_variance or num_inducing_points is provided.
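The default rule for num_inducing_points described above can be sketched in plain Python. This is an illustration only, not the library's code: the helper name resolve_num_inducing_points is hypothetical, and the two constant values are placeholders assumed for the example (the real values are defined alongside the architectures and may differ).

```python
from typing import Optional

# Placeholder values, assumed for illustration only; the library defines its own.
NUM_INDUCING_POINTS_PER_DIM = 50
MAX_NUM_INDUCING_POINTS = 500


def resolve_num_inducing_points(
    search_space_dim: int, num_inducing_points: Optional[int] = None
) -> int:
    """Hypothetical helper mirroring the documented default rule."""
    if search_space_dim <= 0:
        raise ValueError("search_space_dim must be positive")
    if num_inducing_points is None:
        # Default: scale with dimensionality, but never exceed the cap.
        return min(
            NUM_INDUCING_POINTS_PER_DIM * search_space_dim,
            MAX_NUM_INDUCING_POINTS,
        )
    if num_inducing_points <= 0:
        # Non-positive values are rejected, as the Raises entry states.
        raise ValueError("num_inducing_points must be positive")
    return num_inducing_points
```

With these placeholder constants, a 2-dimensional search space resolves to 100 inducing points per layer, while a 20-dimensional one is capped at 500.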
- class GPfluxPredictor(optimizer: trieste.models.optimizer.KerasOptimizer | None = None, encoder: trieste.space.EncoderFunction | None = None)
Bases:
trieste.models.interfaces.SupportsGetObservationNoise, trieste.models.interfaces.EncodedSupportsPredictY, abc.ABC
A trainable wrapper for a GPflux deep Gaussian process model. The code assumes subclasses will use the Keras fit method for training, and so they should provide access to both a model_keras and model_gpflux.
- Parameters:
optimizer – The optimizer wrapper containing the optimizer with which to train the model and arguments for the wrapper and the optimizer. The optimizer must be an instance of an Optimizer. Defaults to the Adam optimizer with a learning rate of 0.01.
encoder – Optional encoder with which to transform query points before generating predictions.
- abstract property model_gpflux: gpflow.base.Module
The underlying GPflux model.
- abstract property model_keras: gpflow.keras.tf_keras.Model
Returns the compiled Keras model for training.
- property optimizer: trieste.models.optimizer.KerasOptimizer
The optimizer wrapper for training the model.
- predict_encoded(query_points: trieste.types.TensorType) -> tuple[trieste.types.TensorType, trieste.types.TensorType]
Note: unless otherwise noted, this returns the mean and variance of the last layer conditioned on one sample from the previous layers.
- abstract sample_encoded(query_points: trieste.types.TensorType, num_samples: int) -> trieste.types.TensorType
Implementation of sample on encoded query points.
- predict_y_encoded(query_points: trieste.types.TensorType) -> tuple[trieste.types.TensorType, trieste.types.TensorType]
Note: unless otherwise noted, this will return the prediction conditioned on one sample from the lower layers.
- get_observation_noise() -> trieste.types.TensorType
Return the variance of observation noise for homoscedastic likelihoods.
- Returns:
The observation noise.
- Raises:
NotImplementedError – If the model does not have a homoscedastic likelihood.
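The notes on predict_encoded and predict_y_encoded say that the returned moments are conditioned on a single sample from the earlier layers. A consequence is that repeated calls at the same input can return different values. The toy two-layer sketch below illustrates this with the standard library only; the functions and the toy layer distributions are invented for the example and are not part of the package.

```python
import math
import random
from typing import Tuple


def hidden_layer_sample(x: float, rng: random.Random) -> float:
    """Draw one sample from a toy hidden-layer distribution N(sin(x), 0.1^2)."""
    return rng.gauss(math.sin(x), 0.1)


def last_layer_moments(h: float) -> Tuple[float, float]:
    """Closed-form mean and variance of a toy last layer at hidden value h."""
    return 2.0 * h, 0.05 + 0.01 * h * h


def predict_one_sample(x: float, rng: random.Random) -> Tuple[float, float]:
    """Mean and variance of the last layer, conditioned on ONE hidden sample."""
    h = hidden_layer_sample(x, rng)
    return last_layer_moments(h)


rng = random.Random(0)
# Two calls at the same input use different hidden samples.
print(predict_one_sample(0.5, rng))
print(predict_one_sample(0.5, rng))
```

Averaging many such calls is one way to approximate the marginal prediction when a single conditioned prediction is too noisy.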
- class DeepGaussianProcess(model: gpflux.models.DeepGP | Callable[[], gpflux.models.DeepGP], optimizer: trieste.models.optimizer.KerasOptimizer | None = None, num_rff_features: int = 1000, continuous_optimisation: bool = True, compile_args: Mapping[str, Any] | None = None, encoder: trieste.space.EncoderFunction | None = None)
Bases:
trieste.models.gpflux.interface.GPfluxPredictor, trieste.models.interfaces.EncodedTrainableProbabilisticModel, trieste.models.interfaces.HasReparamSampler, trieste.models.interfaces.HasTrajectorySampler
A TrainableProbabilisticModel wrapper for a GPflux DeepGP with GPLayer or LatentVariableLayer: this class does not support, for example, Keras layers. We provide simple architectures that can be used with this class in the architectures.py file.
- Parameters:
model – The underlying GPflux deep Gaussian process model. Passing in a named closure rather than a model can help when copying or serialising.
optimizer – The optimizer wrapper with necessary specifications for compiling and training the model. Defaults to KerasOptimizer with Adam optimizer, mean squared error metric and a dictionary of default arguments for the Keras fit method: 400 epochs, batch size of 1000, and verbose 0. A custom callback that reduces the optimizer learning rate is used as well. See https://keras.io/api/models/model_training_apis/#fit-method for a list of possible arguments.
num_rff_features – The number of random Fourier features used to approximate the kernel when calling trajectory_sampler(). We use a default of 1000 as it typically performs well for a wide range of kernels. Note that very smooth kernels (e.g. RBF) can be well-approximated with fewer features.
continuous_optimisation – If True (default), the optimizer will keep track of the number of epochs across BO iterations and use this number as initial_epoch. This is essential to allow monitoring of model training across BO iterations.
compile_args – Keyword arguments to pass to the compile method of the Keras model (Model). See https://keras.io/api/models/model_training_apis/#compile-method for a list of possible arguments. The optimizer and metrics arguments must not be included.
encoder – Optional encoder with which to transform query points before generating predictions.
- Raises:
ValueError – If model has unsupported layers, num_rff_features is less than 0, if the optimizer is not of a supported type, or compile_args contains disallowed arguments.
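The restriction on compile_args (the optimizer and metrics keys must not be included) can be illustrated with a small validation helper. This is a sketch of the documented rule only: check_compile_args and DISALLOWED_COMPILE_ARGS are hypothetical names, and the wrapper's actual check may be written differently.

```python
from typing import Any, Dict, Mapping, Optional

# Keys that the documentation says must not appear in compile_args.
DISALLOWED_COMPILE_ARGS = ("optimizer", "metrics")


def check_compile_args(compile_args: Optional[Mapping[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: reject keys the wrapper is documented to manage itself."""
    args = dict(compile_args or {})
    clashes = [key for key in DISALLOWED_COMPILE_ARGS if key in args]
    if clashes:
        raise ValueError(f"compile_args must not contain: {', '.join(clashes)}")
    return args
```

For example, check_compile_args({"jit_compile": True}) passes through unchanged, whereas check_compile_args({"metrics": ["mse"]}) raises ValueError.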
- property model_gpflux: gpflux.models.DeepGP
The underlying GPflux model.
- property model_keras: gpflow.keras.tf_keras.Model
Returns the compiled Keras model for training.
- sample_encoded(query_points: trieste.types.TensorType, num_samples: int) -> trieste.types.TensorType
Implementation of sample on encoded query points.
- reparam_sampler(num_samples: int) -> trieste.models.interfaces.ReparametrizationSampler[trieste.models.gpflux.interface.GPfluxPredictor]
Return a reparametrization sampler for a DeepGaussianProcess model.
- Parameters:
num_samples – The number of samples to obtain.
- Returns:
The reparametrization sampler.
- trajectory_sampler() -> trieste.models.interfaces.TrajectorySampler[trieste.models.gpflux.interface.GPfluxPredictor]
Return a trajectory sampler. For DeepGaussianProcess, we build trajectories using the GPflux default sampler.
- Returns:
The trajectory sampler.
- update_encoded(dataset: trieste.data.Dataset) -> None
Implementation of update on the encoded dataset.
- optimize_encoded(dataset: trieste.data.Dataset) -> gpflow.keras.tf_keras.callbacks.History
Optimize the model with the specified dataset.
- Parameters:
dataset – The data with which to optimize the model.
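The continuous_optimisation behaviour described for the constructor (carrying the epoch count across BO iterations and using it as initial_epoch) can be sketched as follows. EpochTracker is a hypothetical class written for this illustration and is not the wrapper's implementation. It relies on the Keras convention that, when initial_epoch is given, the epochs argument of fit is interpreted as the index of the final epoch.

```python
from typing import Dict


class EpochTracker:
    """Hypothetical sketch of epoch bookkeeping across repeated optimize calls."""

    def __init__(self, continuous_optimisation: bool = True) -> None:
        self.continuous_optimisation = continuous_optimisation
        self.absolute_epochs = 0  # epochs completed over all previous calls

    def fit_arguments(self, epochs_per_call: int) -> Dict[str, int]:
        """Return the initial_epoch and epochs values for the next fit call."""
        if self.continuous_optimisation:
            # Continue the epoch counter where the previous call stopped.
            args = {
                "initial_epoch": self.absolute_epochs,
                "epochs": self.absolute_epochs + epochs_per_call,
            }
        else:
            # Every call restarts the counter at zero.
            args = {"initial_epoch": 0, "epochs": epochs_per_call}
        self.absolute_epochs += epochs_per_call
        return args
```

With the default of 400 epochs per call, a second BO iteration would then train epochs 400 to 800, so that logged training curves line up across iterations rather than overlapping.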
- log(dataset: trieste.data.Dataset | None = None) -> None
Log model training information at a given optimization step to TensorBoard. We log a few summary statistics of losses, layer KL divergences and metrics (as provided in optimizer): the final value at the end of the training, and the diff value as the difference between the initial and final epoch. We also log epoch statistics, but as histograms rather than time series. We also log several metrics based on the training data, such as the root mean square error between predictions and observations.
For custom logs, the user will need to subclass the model and override this method.
- Parameters:
dataset – Optional data that can be used to log additional data-based model summaries.
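The final and diff summaries mentioned above can be computed from a Keras-style history dictionary (metric name mapped to a list of per-epoch values). The helper below is a hypothetical sketch; in particular, the sign convention for diff (first epoch minus last epoch) and the "name/final" key format are assumptions made for the example.

```python
from typing import Dict, List


def summarise_history(history: Dict[str, List[float]]) -> Dict[str, float]:
    """Hypothetical helper: final value and first-to-last change per metric."""
    summary: Dict[str, float] = {}
    for name, values in history.items():
        summary[f"{name}/final"] = values[-1]
        # Assumed sign convention: positive when the metric decreased.
        summary[f"{name}/diff"] = values[0] - values[-1]
    return summary


print(summarise_history({"loss": [10.0, 4.0, 2.5]}))
```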
- class DeepGaussianProcessDecoupledLayer(model: trieste.models.gpflux.interface.GPfluxPredictor, layer_number: int, num_features: int = 1000)
Bases:
abc.ABC
Layer that samples an approximate decoupled trajectory for a GPflux GPLayer using Matheron’s rule ([WBT+20]). Note that the only multi-output kernel that is supported is a SharedIndependent kernel.
- Parameters:
model – The model to sample from.
layer_number – The index of the layer that we wish to sample from.
num_features – The number of features to use in the random feature approximation.
- Raises:
ValueError (or InvalidArgumentError) – If the layer is not a GPLayer, the layer’s kernel is not supported, or if num_features is not positive.
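To make the role of num_features concrete, the sketch below shows the general random Fourier feature idea for an RBF kernel: with frequencies drawn from the kernel's spectral density and uniform random phases, the inner product of cosine features approximates the kernel value. This is a standard-library illustration of the technique in general, not the GPflux feature class or this layer's code; all names are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], lengthscale: float = 1.0) -> float:
    """Exact RBF kernel value with unit variance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / lengthscale**2)


def make_random_fourier_features(
    input_dim: int, num_features: int, lengthscale: float, rng: random.Random
) -> Callable[[Sequence[float]], List[float]]:
    """Build a cosine feature map whose inner product approximates the RBF kernel."""
    # Frequencies follow the RBF spectral density N(0, 1 / lengthscale^2).
    freqs = [
        [rng.gauss(0.0, 1.0 / lengthscale) for _ in range(input_dim)]
        for _ in range(num_features)
    ]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x: Sequence[float]) -> List[float]:
        return [
            scale * math.cos(sum(w * xi for w, xi in zip(freq, x)) + phase)
            for freq, phase in zip(freqs, phases)
        ]

    return features


rng = random.Random(0)
phi = make_random_fourier_features(input_dim=2, num_features=1000, lengthscale=1.0, rng=rng)
x, y = [0.1, 0.4], [0.5, -0.2]
approx = sum(a * b for a, b in zip(phi(x), phi(y)))
print(approx, rbf_kernel(x, y))  # the two values should be close
```

The approximation error shrinks roughly with the inverse square root of the number of features, which is why more features give more faithful trajectories at higher cost.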
- __call__(x: trieste.types.TensorType) -> trieste.types.TensorType
Evaluate trajectory function for layer at input.
- Parameters:
x – Input location with shape [N, B, D], where N is the number of points, B is the batch dimension, and D is the input dimensionality.
- Returns:
Trajectory for the layer evaluated at the input, with shape [N, B, P], where P is the number of latent GPs in the layer.
- Raises:
InvalidArgumentError – If the provided batch size does not match with the layer’s batch size.
- update() -> None
Efficiently update the trajectory with a new weight distribution and resample its weights.
- _prepare_weight_sampler() -> Callable[[int], trieste.types.TensorType]
Prepare the sampler function that provides samples of the feature weights for both the RFF and canonical feature functions, i.e. we return a function that takes in a batch size B and returns B samples for the weights of each of the L RFF features and M canonical features for P outputs.
- class DeepGaussianProcessDecoupledTrajectorySampler(model: trieste.models.gpflux.interface.GPfluxPredictor, num_features: int = 1000)
Bases:
trieste.models.interfaces.TrajectorySampler[trieste.models.gpflux.interface.GPfluxPredictor]
This sampler employs decoupled sampling (see [WBT+20]) to build functions that approximate a trajectory sampled from an underlying deep Gaussian process model. In particular, this sampler provides trajectory functions for GPfluxPredictors with underlying DeepGP models by using a feature decomposition using both random Fourier features and canonical features centered at inducing point locations. This allows for cheap approximate trajectory samples, as opposed to exact trajectory sampling, which scales cubically in the number of query points.
- Parameters:
model – The model to sample from.
num_features – The number of random Fourier features to use.
- Raises:
ValueError (or InvalidArgumentError) – If the model is not a GPfluxPredictor, or its underlying model_gpflux is not a DeepGP, or num_features is not positive.
- get_trajectory() -> trieste.models.interfaces.TrajectoryFunction
Generate an approximate function draw (trajectory) from the deep GP model.
- Returns:
A trajectory function representing an approximate trajectory from the deep GP, taking an input of shape [N, B, D] and returning shape [N, B, L].
- update_trajectory(trajectory: trieste.models.interfaces.TrajectoryFunction) -> trieste.models.interfaces.TrajectoryFunction
Efficiently update a TrajectoryFunction to reflect an update in its underlying ProbabilisticModel and resample accordingly.
- Parameters:
trajectory – The trajectory function to be updated and resampled.
- Returns:
The updated and resampled trajectory function.
- Raises:
InvalidArgumentError – If trajectory is not a dgp_feature_decomposition_trajectory.
- resample_trajectory(trajectory: trieste.models.interfaces.TrajectoryFunction) -> trieste.models.interfaces.TrajectoryFunction
Efficiently resample a TrajectoryFunction in-place to avoid function retracing with every new sample.
- Parameters:
trajectory – The trajectory function to be resampled.
- Returns:
The new resampled trajectory function.
- Raises:
InvalidArgumentError – If trajectory is not a dgp_feature_decomposition_trajectory.
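The in-place resampling described above can be pictured as a callable object that owns mutable weights: resampling overwrites the weights and returns the same object, so any code holding a reference to the trajectory sees the new sample without rebuilding the function. The plain-Python sketch below is an analogy only. ToyTrajectory and this resample_trajectory are hypothetical, and the type check raises TypeError here rather than the InvalidArgumentError documented for the real sampler.

```python
import random
from typing import List


class ToyTrajectory:
    """Hypothetical trajectory: a weighted sum of precomputed feature values."""

    def __init__(self, num_features: int, rng: random.Random) -> None:
        self.weights = [rng.gauss(0.0, 1.0) for _ in range(num_features)]

    def __call__(self, features: List[float]) -> float:
        return sum(w * f for w, f in zip(self.weights, features))

    def resample(self, rng: random.Random) -> None:
        # Overwrite the existing list in place instead of creating a new object.
        self.weights[:] = [rng.gauss(0.0, 1.0) for _ in self.weights]


def resample_trajectory(trajectory: ToyTrajectory, rng: random.Random) -> ToyTrajectory:
    """Resample in place and return the very same trajectory object."""
    if not isinstance(trajectory, ToyTrajectory):
        raise TypeError("trajectory must be a ToyTrajectory")
    trajectory.resample(rng)
    return trajectory
```

In the TensorFlow setting the same effect is typically obtained by assigning new values to existing variables, which leaves a compiled function's graph untouched.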
- class DeepGaussianProcessReparamSampler(sample_size: int, model: trieste.models.gpflux.interface.GPfluxPredictor)
Bases:
trieste.models.interfaces.ReparametrizationSampler[trieste.models.gpflux.interface.GPfluxPredictor]
This sampler employs the reparameterization trick to approximate samples from a GPfluxPredictor's predictive distribution, when the GPfluxPredictor has an underlying DeepGP.
- Parameters:
sample_size – The number of samples for each batch of points. Must be positive.
model – The model to sample from.
- Raises:
ValueError (or InvalidArgumentError) – If sample_size is not positive, if the model is not a GPfluxPredictor, or if its underlying model_gpflux is not a DeepGP.
- sample(at: trieste.types.TensorType, *, jitter: float = DEFAULTS.JITTER) -> trieste.types.TensorType
Return approximate samples from the model specified at __init__(). Multiple calls to sample(), for any given DeepGaussianProcessReparamSampler and at, will produce the exact same samples. Calls to sample() on different DeepGaussianProcessReparamSampler instances will produce different samples.
- Parameters:
at – Where to sample the predictive distribution, with shape […, 1, D], for points of dimension D.
jitter – The size of the jitter to use when stabilizing the Cholesky decomposition of the covariance matrix.
- Returns:
The samples, of shape […, S, 1, L], where S is the sample_size and L is the number of latent model dimensions.
- Raises:
ValueError (or InvalidArgumentError) – If at has an invalid shape or jitter is negative.
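The repeatability guarantee of sample() follows from the reparameterization trick: the sampler draws standard normal base noise once, stores it, and maps it through the predictive mean and standard deviation on every call. The sketch below shows this for a single scalar Gaussian using only the standard library. ToyReparamSampler is hypothetical and greatly simplified; the real sampler works with a deep GP, where noise is needed at each layer.

```python
import math
import random
from typing import List, Optional


class ToyReparamSampler:
    """Hypothetical sampler: fixed base noise mapped through mean and variance."""

    def __init__(self, sample_size: int, seed: int) -> None:
        if sample_size <= 0:
            raise ValueError("sample_size must be positive")
        self._sample_size = sample_size
        self._rng = random.Random(seed)
        self._eps: Optional[List[float]] = None  # base noise, drawn lazily once

    def sample(self, mean: float, variance: float) -> List[float]:
        if variance < 0:
            raise ValueError("variance must be non-negative")
        if self._eps is None:
            # Draw the standard normal base noise on first use, then reuse it.
            self._eps = [self._rng.gauss(0.0, 1.0) for _ in range(self._sample_size)]
        std = math.sqrt(variance)
        return [mean + std * e for e in self._eps]
```

Because the noise is fixed per instance, the samples vary smoothly with the mean and variance, which is what makes sample-based acquisition functions deterministic and differentiable for a given sampler.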
- class ResampleableDecoupledDeepGaussianProcessFeatureFunctions(layer: gpflux.layers.GPLayer, n_components: int)
Bases:
gpflux.layers.basis_functions.fourier_features.RandomFourierFeaturesCosine
A wrapper around GPflux’s random Fourier feature function that allows for efficient in-place updating when generating new decompositions. In addition to providing Fourier features, this class concatenates a layer’s Fourier feature expansion with evaluations of the canonical basis functions.
- Parameters:
layer – The layer that will be approximated by the feature functions.
n_components – The number of features.
- Raises:
ValueError – If the layer is not a GPLayer.
- class dgp_feature_decomposition_trajectory(model: trieste.models.gpflux.interface.GPfluxPredictor, num_features: int)
Bases:
trieste.models.interfaces.TrajectoryFunctionClass
An approximate sample from a deep Gaussian process’s posterior, where the samples are represented as a finite weighted sum of features. This class essentially takes a list of DeepGaussianProcessDecoupledLayers and iterates through them to sample, update and resample.
- Parameters:
model – The model to sample from.
num_features – The number of random Fourier features to use.
- __call__(x: trieste.types.TensorType) -> trieste.types.TensorType
Call trajectory function by looping through layers.
- Parameters:
x – Input location with shape [N, B, D], where N is the number of points, B is the batch dimension, and D is the input dimensionality.
- Returns:
Trajectory samples with shape [N, B, L], where L is the number of outputs.
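The layer-by-layer evaluation can be sketched as a simple composition: each layer is a finite weighted sum of features, and the output of one layer is the input of the next. The scalar toy below uses only the standard library; ToyLayer and ToyDeepTrajectory are hypothetical stand-ins and ignore the [N, B, D] batching used by the real class.

```python
import math
from typing import List, Sequence


class ToyLayer:
    """Hypothetical layer: a weighted sum of cosine features of a scalar input."""

    def __init__(self, weights: Sequence[float], frequencies: Sequence[float]) -> None:
        self.weights = list(weights)
        self.frequencies = list(frequencies)

    def __call__(self, x: float) -> float:
        return sum(w * math.cos(f * x) for w, f in zip(self.weights, self.frequencies))


class ToyDeepTrajectory:
    """Hypothetical deep trajectory: feed the input through each layer in turn."""

    def __init__(self, layers: List[ToyLayer]) -> None:
        self.layers = list(layers)

    def __call__(self, x: float) -> float:
        for layer in self.layers:
            x = layer(x)  # the output of one layer is the next layer's input
        return x


trajectory = ToyDeepTrajectory([ToyLayer([0.5], [1.0]), ToyLayer([2.0], [1.0])])
print(trajectory(0.0))  # equals 2 * cos(0.5 * cos(0))
```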