trieste.models.gpflow.sampler

This module is the home of the sampling functionality required by Trieste’s GPflow wrappers.

Module Contents

qmc_normal_samples(num_samples: _IntTensorType, n_sample_dim: _IntTensorType, skip: _IntTensorType = 0, dtype: tensorflow.DType = tf.float64) tensorflow.Tensor[source]#

Generates num_samples Sobol samples, skipping the first skip, where each sample has dimension n_sample_dim.
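The idea of skipping into a low-discrepancy sequence and turning the points into normal draws can be sketched with the standard library alone. This is an illustration under stated assumptions, not Trieste's implementation: it is one-dimensional, it substitutes a base-2 van der Corput sequence for a Sobol sequence, it maps points through the inverse normal CDF (one common way to obtain normal samples), and the names van_der_corput and qmc_normal_sketch are hypothetical.

```python
from statistics import NormalDist
from typing import List


def van_der_corput(index: int) -> float:
    """Return the index-th point of the base-2 van der Corput sequence."""
    value, denominator = 0.0, 1.0
    while index > 0:
        denominator *= 2.0
        index, bit = divmod(index, 2)
        value += bit / denominator
    return value


def qmc_normal_sketch(num_samples: int, skip: int = 0) -> List[float]:
    """Map low-discrepancy points in (0, 1) to approximate N(0, 1) draws."""
    standard_normal = NormalDist()
    # Index 0 is the point 0.0, which has no finite normal quantile, so start at 1.
    start = skip + 1
    return [
        standard_normal.inv_cdf(van_der_corput(i))
        for i in range(start, start + num_samples)
    ]


# Skipping the first two points yields the tail of the unskipped sequence.
tail = qmc_normal_sketch(3, skip=2)
```

Because the points fill (0, 1) evenly, the resulting draws are spread more regularly over the normal distribution than independent random draws of the same size would be.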

class IndependentReparametrizationSampler(sample_size: int, model: trieste.models.interfaces.ProbabilisticModel, qmc: bool = False, qmc_skip: bool = True)[source]#

Bases: trieste.models.interfaces.ReparametrizationSampler[trieste.models.interfaces.ProbabilisticModel]

This sampler employs the reparameterization trick to approximate samples from a ProbabilisticModel’s predictive distribution as

\[x \mapsto \mu(x) + \epsilon \sigma(x)\]

where \(\epsilon \sim \mathcal N (0, 1)\) is constant for a given sampler, thus ensuring samples form a continuous curve.
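A minimal NumPy sketch of this trick follows. It is not the Trieste class: toy_predict is a hypothetical stand-in for a model's marginal prediction, IndependentSamplerSketch is an illustrative name, and the seed argument exists only to make the example reproducible. The point it shows is that \(\epsilon\) is drawn once and reused, so repeated calls at the same points return identical samples.

```python
import numpy as np


def toy_predict(at):
    """Hypothetical model: mean and variance, each of shape [..., 1, L] with L = 1."""
    mean = np.sin(at).sum(axis=-1, keepdims=True)
    variance = 0.1 + 0.05 * (at**2).sum(axis=-1, keepdims=True)
    return mean, variance


class IndependentSamplerSketch:
    def __init__(self, sample_size, predict, seed=0):
        if sample_size <= 0:
            raise ValueError("sample_size must be positive")
        self._sample_size = sample_size
        self._predict = predict
        self._rng = np.random.default_rng(seed)
        self._eps = None  # drawn on the first call, then held fixed

    def sample(self, at):
        mean, variance = self._predict(at)  # each [..., 1, L]
        if self._eps is None:
            shape = (self._sample_size, 1, mean.shape[-1])  # [S, 1, L]
            self._eps = self._rng.standard_normal(shape)
        # Broadcast [..., 1, 1, L] against [S, 1, L] to obtain [..., S, 1, L].
        return mean[..., None, :, :] + np.sqrt(variance)[..., None, :, :] * self._eps


at = np.linspace(0.0, 1.0, 10).reshape(5, 1, 2)  # five query points, D = 2
sampler = IndependentSamplerSketch(sample_size=3, predict=toy_predict)
samples = sampler.sample(at)  # shape [5, 3, 1, 1]
```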

Parameters:
  • sample_size – The number of samples to take at each point. Must be positive.

  • model – The model to sample from.

  • qmc – Whether to use QMC Sobol sampling instead of random normal sampling. QMC sampling more accurately approximates a normal distribution than truly random samples.

  • qmc_skip – Whether to use the skip parameter to ensure the QMC sampler gives different samples whenever it is reset. This is not supported with XLA.

Raises:

ValueError (or InvalidArgumentError) – If sample_size is not positive.

skip: trieste.types.TensorType[source]#

Number of Sobol sequence points to skip. This is incremented for each sampler.

sample(at: trieste.types.TensorType, *, jitter: float = DEFAULTS.JITTER) trieste.types.TensorType[source]#

Return approximate samples from the model specified at __init__(). Multiple calls to sample(), for any given IndependentReparametrizationSampler and at, will produce the exact same samples. Calls to sample() on different IndependentReparametrizationSampler instances will produce different samples.

Parameters:
  • at – Where to sample the predictive distribution, with shape […, 1, D], for points of dimension D.

  • jitter – The size of the jitter to use when stabilising the Cholesky decomposition of the covariance matrix.

Returns:

The samples, of shape […, S, 1, L], where S is the sample_size and L is the number of latent model dimensions.

Raises:

ValueError (or InvalidArgumentError) – If at has an invalid shape or jitter is negative.

class BatchReparametrizationSampler(sample_size: int, model: trieste.models.interfaces.SupportsPredictJoint, qmc: bool = False, qmc_skip: bool = True)[source]#

Bases: trieste.models.interfaces.ReparametrizationSampler[trieste.models.interfaces.SupportsPredictJoint]

This sampler employs the reparameterization trick to approximate batches of samples from a ProbabilisticModel’s predictive joint distribution as

\[x \mapsto \mu(x) + \epsilon L(x)\]

where \(L\) is the Cholesky factor s.t. \(LL^T\) is the covariance, and \(\epsilon \sim \mathcal N (0, 1)\) is constant for a given sampler, thus ensuring samples form a continuous curve.
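The joint version can be sketched in NumPy as below. This is an illustration for a single output dimension, not Trieste's code: rbf_kernel and batch_reparam_sample are hypothetical helpers, and the covariance is built directly from a kernel rather than obtained from a model. The jitter is added to the diagonal before the Cholesky factorisation, and the fixed eps plays the role of \(\epsilon\).

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between point sets of shape [..., B, D]."""
    sq_dist = ((a[..., :, None, :] - b[..., None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def batch_reparam_sample(mean, cov, eps, jitter=1e-6):
    """mean: [..., B], cov: [..., B, B], eps: [B, S] -> samples of shape [..., S, B]."""
    batch_size = cov.shape[-1]
    # Stabilise the factorisation by adding jitter to the diagonal.
    chol = np.linalg.cholesky(cov + jitter * np.eye(batch_size))  # [..., B, B]
    # chol @ eps has shape [..., B, S]; swap the last two axes to get [..., S, B].
    return mean[..., None, :] + np.swapaxes(chol @ eps, -1, -2)


rng = np.random.default_rng(0)
at = rng.uniform(size=(4, 3, 2))         # four batches of B = 3 points, D = 2
cov = rbf_kernel(at, at)                 # [4, 3, 3]
mean = np.zeros((4, 3))
eps = rng.standard_normal((3, 100_000))  # drawn once and reused across calls
samples = batch_reparam_sample(mean, cov, eps)  # [4, 100000, 3]
```

Since eps has B rows, a sampler built this way is tied to one batch size, which is consistent with the requirement that B stays the same across calls to sample().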

Parameters:
  • sample_size – The number of samples for each batch of points. Must be positive.

  • model – The model to sample from.

  • qmc – Whether to use QMC Sobol sampling instead of random normal sampling. QMC sampling more accurately approximates a normal distribution than truly random samples.

  • qmc_skip – Whether to use the skip parameter to ensure the QMC sampler gives different samples whenever it is reset. This is not supported with XLA.

Raises:

ValueError (or InvalidArgumentError) – If sample_size is not positive.

skip: trieste.types.TensorType[source]#

Number of Sobol sequence points to skip. This is incremented for each sampler.

sample(at: trieste.types.TensorType, *, jitter: float = DEFAULTS.JITTER) trieste.types.TensorType[source]#

Return approximate samples from the model specified at __init__(). Multiple calls to sample(), for any given BatchReparametrizationSampler and at, will produce the exact same samples. Calls to sample() on different BatchReparametrizationSampler instances will produce different samples.

Parameters:
  • at – Batches of query points at which to sample the predictive distribution, with shape […, B, D], for batches of size B of points of dimension D. Must have a consistent batch size across all calls to sample() for any given BatchReparametrizationSampler.

  • jitter – The size of the jitter to use when stabilising the Cholesky decomposition of the covariance matrix.

Returns:

The samples, of shape […, S, B, L], where S is the sample_size, B the number of points per batch, and L the dimension of the model’s predictive distribution.

Raises:

ValueError (or InvalidArgumentError) – If any of the following are true:

  • at is a scalar.

  • The batch size B of at is not positive.

  • The batch size B of at differs from that of previous calls.

  • jitter is negative.

class FeatureDecompositionInternalDataModel[source]#

Bases: trieste.models.interfaces.SupportsGetKernel, trieste.models.interfaces.SupportsGetMeanFunction, trieste.models.interfaces.SupportsGetObservationNoise, trieste.models.interfaces.SupportsGetInternalData, typing_extensions.Protocol

A probabilistic model that supports get_kernel, get_mean_function, get_observation_noise and get_internal_data methods.

class FeatureDecompositionInducingPointModel[source]#

Bases: trieste.models.interfaces.SupportsGetKernel, trieste.models.interfaces.SupportsGetMeanFunction, trieste.models.interfaces.SupportsGetInducingVariables, typing_extensions.Protocol

A probabilistic model that supports get_kernel, get_mean_function and get_inducing_variables methods.

class FeatureDecompositionTrajectorySampler(model: FeatureDecompositionTrajectorySamplerModelType, feature_functions: ResampleableRandomFourierFeatureFunctions)[source]#

Bases: trieste.models.interfaces.TrajectorySampler[FeatureDecompositionTrajectorySamplerModelType], abc.ABC

This is a general class to build functions that approximate a trajectory sampled from an underlying Gaussian process model.

In particular, we approximate the Gaussian process’s posterior samples as the finite feature approximation

\[\hat{f}(x) = \sum_{i=1}^m \phi_i(x)\theta_i\]

where \(\phi_i\) are m features and \(\theta_i\) are feature weights sampled from a given distribution.

Achieving consistency (ensuring that the same sample draw is used for all evaluations of a particular trajectory function) for exact sample draws from a GP is prohibitively costly because it scales cubically with the number of query points. However, finite feature representations can be evaluated at a constant cost per query, regardless of the required number of queries.
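The consistency argument can be illustrated with a small NumPy sketch. It assumes an RBF kernel, the cosine form of random Fourier features, and weights drawn from a standard normal prior; the helper names make_rff_features and trajectory are hypothetical and do not appear in Trieste. Once the weights are fixed, the trajectory is an ordinary deterministic function, so evaluating it at new points never requires conditioning on earlier evaluations.

```python
import numpy as np


def make_rff_features(input_dim, num_features, lengthscale, rng):
    """Cosine random Fourier features for an RBF kernel with the given lengthscale."""
    frequencies = rng.standard_normal((input_dim, num_features)) / lengthscale
    phases = rng.uniform(0.0, 2.0 * np.pi, size=num_features)

    def features(x):  # x: [N, D] -> [N, m]
        return np.sqrt(2.0 / num_features) * np.cos(x @ frequencies + phases)

    return features


rng = np.random.default_rng(0)
features = make_rff_features(input_dim=1, num_features=5000, lengthscale=0.5, rng=rng)
theta = rng.standard_normal(5000)  # weights drawn once (here from the prior N(0, I))


def trajectory(x):
    """f_hat(x) = sum_i phi_i(x) * theta_i, with theta held fixed."""
    return features(x) @ theta


x = np.linspace(0.0, 1.0, 10)[:, None]
values = trajectory(x)  # evaluating again, or on a subset, gives matching values
```

The same features also approximate the kernel itself: the inner product of feature vectors at two points is close to the RBF kernel value between them when the number of features is large.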

Parameters:

model – The model to sample from.

Raises:

ValueError – If dataset is empty.

get_trajectory() trieste.models.interfaces.TrajectoryFunction[source]#

Generate an approximate function draw (trajectory) by sampling weights and evaluating the feature functions.

Returns:

A trajectory function representing an approximate trajectory from the Gaussian process, taking an input of shape [N, B, D] and returning shape [N, B, L] where L is the number of outputs of the model.

update_trajectory(trajectory: trieste.models.interfaces.TrajectoryFunction) trieste.models.interfaces.TrajectoryFunction[source]#

Efficiently update a TrajectoryFunction to reflect an update in its underlying ProbabilisticModel and resample accordingly.

For a FeatureDecompositionTrajectorySampler, updating the sampler corresponds to resampling the feature functions (taking into account any changed kernel parameters) and recalculating the weight distribution.

Parameters:

trajectory – The trajectory function to be resampled.

Returns:

The new resampled trajectory function.

resample_trajectory(trajectory: trieste.models.interfaces.TrajectoryFunction) trieste.models.interfaces.TrajectoryFunction[source]#

Efficiently resample a TrajectoryFunction in-place to avoid function retracing with every new sample.

Parameters:

trajectory – The trajectory function to be resampled.

Returns:

The new resampled trajectory function.

abstract _prepare_weight_sampler() Callable[[int], trieste.types.TensorType][source]#

Calculate the posterior of the feature weights for the specified feature functions, returning a function that takes in a batch size B and returns B samples for the weights of each of the F features for L outputs.

class RandomFourierFeatureTrajectorySampler(model: FeatureDecompositionInternalDataModel, num_features: int = 1000)[source]#

Bases: FeatureDecompositionTrajectorySampler[FeatureDecompositionInternalDataModel]

This class builds functions that approximate a trajectory sampled from an underlying Gaussian process model. For tractability, the Gaussian process is approximated with a Bayesian linear model across a set of features sampled from the Fourier feature decomposition of the model’s kernel. See [HernandezLHG14] for details. Currently we do not support models with multiple latent Gaussian processes.

In particular, we approximate the Gaussian process’s posterior samples as the finite feature approximation

\[\hat{f}(x) = \sum_{i=1}^m \phi_i(x)\theta_i\]

where \(\phi_i\) are m Fourier features and \(\theta_i\) are feature weights sampled from a posterior distribution that depends on the feature values at the model’s datapoints.

Our implementation follows [HernandezLHG14], with our calculations differing slightly depending on properties of the problem. In particular, we use different calculation strategies depending on the number of considered features m and the number of data points n.

If \(m<n\) then we follow Appendix A of [HernandezLHG14] and calculate the posterior distribution for \(\theta\) following their Bayesian linear regression motivation, i.e. the computation revolves around an \(O(m^3)\) inversion of a design matrix.

If \(n<m\) then we use the kernel trick to recast computation to revolve around an \(O(n^3)\) inversion of a Gram matrix. As well as being more efficient in early BO steps (where \(n<m\)), this second computation method allows much larger choices of m (as required to approximate very flexible kernels).

Parameters:
  • model – The model to sample from.

  • num_features – The number of features used to approximate the kernel. We use a default of 1000 as it typically performs well for a wide range of kernels. Note that very smooth kernels (e.g. RBF) can be well-approximated with fewer features.

Raises:

ValueError – If dataset is empty.

_prepare_weight_sampler() Callable[[int], trieste.types.TensorType][source]#

Calculate the posterior of theta (the feature weights) for the RFFs, returning a function that takes in a batch size B and returns B samples for the weights of each of the F RFF features for one output.

_prepare_theta_posterior_in_design_space() tensorflow_probability.distributions.MultivariateNormalTriL[source]#

Calculate the posterior of theta (the feature weights) in the design space. This distribution is a Gaussian

\[\theta \sim N(D^{-1}\Phi^Ty,D^{-1}\sigma^2)\]

where the [m,m] design matrix \(D=(\Phi^T\Phi + \sigma^2I_m)\) is defined for the [n,m] matrix of feature evaluations across the training data \(\Phi\) and observation noise variance \(\sigma^2\).

_prepare_theta_posterior_in_gram_space() tensorflow_probability.distributions.MultivariateNormalTriL[source]#

Calculate the posterior of theta (the feature weights) in the Gram space. This distribution is a Gaussian

\[\theta \sim N(\Phi^TG^{-1}y,I_m - \Phi^TG^{-1}\Phi)\]

where the [n,n] Gram matrix \(G=(\Phi\Phi^T + \sigma^2I_n)\) is defined for the [n,m] matrix of feature evaluations across the training data \(\Phi\) and observation noise variance \(\sigma^2\).
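The design-space and Gram-space expressions describe the same Gaussian, which follows from the Woodbury identity. The NumPy sketch below checks this numerically on random data. It is an illustration only: the function names are hypothetical, plain solves and inverses are used instead of Cholesky factors, and a standard normal prior on \(\theta\) is assumed.

```python
import numpy as np


def theta_posterior_design_space(phi, y, noise_var):
    """Mean and covariance via the [m, m] design matrix D = Phi^T Phi + sigma^2 I_m."""
    m = phi.shape[1]
    design = phi.T @ phi + noise_var * np.eye(m)
    mean = np.linalg.solve(design, phi.T @ y)
    cov = noise_var * np.linalg.inv(design)
    return mean, cov


def theta_posterior_gram_space(phi, y, noise_var):
    """Mean and covariance via the [n, n] Gram matrix G = Phi Phi^T + sigma^2 I_n."""
    n, m = phi.shape
    gram = phi @ phi.T + noise_var * np.eye(n)
    mean = phi.T @ np.linalg.solve(gram, y)
    cov = np.eye(m) - phi.T @ np.linalg.solve(gram, phi)
    return mean, cov


rng = np.random.default_rng(0)
n, m, noise_var = 5, 8, 0.1         # n < m, the regime where the Gram route is cheaper
phi = rng.standard_normal((n, m))   # feature evaluations at the training inputs
y = rng.standard_normal((n, 1))     # observations

mean_d, cov_d = theta_posterior_design_space(phi, y, noise_var)
mean_g, cov_g = theta_posterior_gram_space(phi, y, noise_var)
same_posterior = np.allclose(mean_d, mean_g) and np.allclose(cov_d, cov_g)
```

Only the size of the linear system differs between the two routes, so the choice can be made purely on whether m or n is smaller.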

class DecoupledTrajectorySampler(model: FeatureDecompositionInducingPointModel | FeatureDecompositionInternalDataModel, num_features: int = 1000)[source]#

Bases: FeatureDecompositionTrajectorySampler[Union[FeatureDecompositionInducingPointModel, FeatureDecompositionInternalDataModel]]

This class builds functions that approximate a trajectory sampled from an underlying Gaussian process model using decoupled sampling. See [WBT+20] for an introduction to decoupled sampling.

Unlike our RandomFourierFeatureTrajectorySampler, which uses an RFF decomposition to approximate the Gaussian process posterior, a DecoupledTrajectorySampler uses an RFF decomposition only to approximate the Gaussian process prior, and instead uses a canonical decomposition to discretize the effect of updating the prior on the given data.

In particular, we approximate the Gaussian process’s posterior samples as the finite feature approximation

\[\hat{f}(.) = \sum_{i=1}^L w_i\phi_i(.) + \sum_{j=1}^m v_jk(.,z_j)\]

where \(\phi_i(.)\) and \(w_i\) are the Fourier features and their weights that discretize the prior. In contrast, \(k(.,z_j)\) and \(v_j\) are the canonical features and their weights that discretize the data update.

The expression for \(v_j\) depends on whether we are using an exact Gaussian process or a sparse approximation. See eq. (13) in [WBT+20] for details.
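For the exact-GP case, the two-term form can be sketched in NumPy as follows. The sketch assumes the pathwise (Matheron) update in which the canonical weights are \(v = (K + \sigma^2 I)^{-1}(y - \Phi w - \varepsilon)\) with \(\varepsilon \sim \mathcal N(0, \sigma^2 I)\), the training inputs playing the role of \(z_j\), an RBF kernel, and a single output. The function names are hypothetical and this is not Trieste's implementation.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale):
    """Squared-exponential kernel between [N, D] and [M, D] point sets."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def decoupled_trajectory(x_train, y_train, noise_var, lengthscale, num_features, rng):
    n, dim = x_train.shape
    frequencies = rng.standard_normal((dim, num_features)) / lengthscale
    phases = rng.uniform(0.0, 2.0 * np.pi, size=num_features)

    def phi(x):  # Fourier features that discretize the prior
        return np.sqrt(2.0 / num_features) * np.cos(x @ frequencies + phases)

    w = rng.standard_normal(num_features)              # prior weights
    eps = np.sqrt(noise_var) * rng.standard_normal(n)  # simulated observation noise
    gram = rbf_kernel(x_train, x_train, lengthscale) + noise_var * np.eye(n)
    # Canonical weights: correct the prior draw towards the observed data.
    v = np.linalg.solve(gram, y_train - phi(x_train) @ w - eps)

    def trajectory(x):  # prior term plus data-update term
        return phi(x) @ w + rbf_kernel(x, x_train, lengthscale) @ v

    return trajectory


rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 5)[:, None]
y_train = np.sin(6.0 * x_train[:, 0])
f = decoupled_trajectory(x_train, y_train, 1e-6, 0.2, 500, rng)
fitted = f(x_train)  # with tiny noise the trajectory nearly interpolates the data
```

The data-update term uses the exact kernel rather than its Fourier approximation, which is why the trajectory stays close to the observations even with a modest number of features.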

Note that if a model is of both FeatureDecompositionInducingPointModel type and FeatureDecompositionInternalDataModel type, FeatureDecompositionInducingPointModel takes priority and inducing points are used for computations rather than data.

Parameters:
  • model – The model to sample from.

  • num_features – The number of features used to approximate the kernel. We use a default of 1000 as it typically performs well for a wide range of kernels. Note that very smooth kernels (e.g. RBF) can be well-approximated with fewer features.

Raises:

NotImplementedError – If the model is not of valid type.

_prepare_weight_sampler() Callable[[int], trieste.types.TensorType][source]#

Prepare the sampler function that provides samples of the feature weights for both the RFF and canonical feature functions, i.e. we return a function that takes in a batch size B and returns B samples for the weights of each of the F RFF features and M canonical features for L outputs.

class ResampleableRandomFourierFeatureFunctions(model: FeatureDecompositionInducingPointModel | FeatureDecompositionInternalDataModel, n_components: int)[source]#

Bases: gpflux.layers.basis_functions.fourier_features.RandomFourierFeaturesCosine

A wrapper around GPFlux’s random Fourier feature function that allows for efficient in-place updating when generating new decompositions.

In particular, the bias and weights are stored as variables, which can then be updated by calling resample() without triggering expensive graph retracing.
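The in-place update pattern can be mimicked in NumPy by overwriting existing buffers instead of rebinding attributes. This is only an analogy under stated assumptions: NumPy has no graph tracing, the buffers stand in for TensorFlow variables, and the class name ResampleableFeaturesSketch is hypothetical. It shows that anything holding a reference to the weights sees the new values after resample().

```python
import numpy as np


class ResampleableFeaturesSketch:
    def __init__(self, input_dim, n_components, lengthscale=1.0, seed=0):
        self._rng = np.random.default_rng(seed)
        self._lengthscale = lengthscale
        self.n_components = n_components
        # Buffers are allocated once; resample() only overwrites their contents.
        self.weights = np.empty((input_dim, n_components))
        self.bias = np.empty(n_components)
        self.resample()

    def resample(self):
        """Draw new frequencies and phases into the existing buffers."""
        self.weights[...] = self._rng.standard_normal(self.weights.shape) / self._lengthscale
        self.bias[...] = self._rng.uniform(0.0, 2.0 * np.pi, size=self.bias.shape)

    def __call__(self, inputs):  # inputs: [N, D] -> [N, n_components]
        scale = np.sqrt(2.0 / self.n_components)
        return scale * np.cos(inputs @ self.weights + self.bias)


feature_functions = ResampleableFeaturesSketch(input_dim=2, n_components=16)
held_reference = feature_functions.weights  # e.g. captured by previously built code
feature_functions.resample()                # held_reference now holds the new draw
```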

Note that if a model is of both FeatureDecompositionInducingPointModel type and FeatureDecompositionInternalDataModel type, FeatureDecompositionInducingPointModel takes priority and inducing points are used for computations rather than data.

Parameters:
  • model – The model that will be approximated by these feature functions.

  • n_components – The desired number of features.

Raises:

NotImplementedError – If the model is not of valid type.

resample() None[source]#

Resample weights and biases.

call(inputs: trieste.types.TensorType) trieste.types.TensorType[source]#

Evaluate the basis functions at inputs.

class ResampleableDecoupledFeatureFunctions(model: FeatureDecompositionInducingPointModel | FeatureDecompositionInternalDataModel, n_components: int)[source]#

Bases: ResampleableRandomFourierFeatureFunctions

A wrapper around our ResampleableRandomFourierFeatureFunctions which, rather than evaluating just F RFF functions, evaluates the concatenation of F RFF functions with evaluations of the canonical basis functions.

Note that if a model is of both FeatureDecompositionInducingPointModel type and FeatureDecompositionInternalDataModel type, FeatureDecompositionInducingPointModel takes priority and inducing points are used for computations rather than data.

Parameters:
  • model – The model that will be approximated by these feature functions.

  • n_components – The desired number of features.

call(inputs: trieste.types.TensorType) trieste.types.TensorType[source]#

Combine prior basis functions with canonical basis functions.

class feature_decomposition_trajectory(feature_functions: Callable[[trieste.types.TensorType], trieste.types.TensorType], weight_sampler: Callable[[int], trieste.types.TensorType], mean_function: Callable[[trieste.types.TensorType], trieste.types.TensorType], encoder: trieste.space.EncoderFunction | None = None)[source]#

Bases: trieste.models.interfaces.TrajectoryFunctionClass

An approximate sample from a Gaussian process’s posterior, represented as a finite weighted sum of features.

A trajectory is given by

\[\hat{f}(x) = \sum_{i=1}^m \phi_i(x)\theta_i\]

where \(\phi_i\) are m feature functions and \(\theta_i\) are feature weights sampled from a posterior distribution.

The number of trajectories (i.e. batch size) is determined from the first call of the trajectory. In order to change the batch size, a new TrajectoryFunction must be built.
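The call, resample and update behaviour can be sketched in NumPy as follows. This is an illustration for a single output, not the Trieste class: TrajectorySketch is a hypothetical name, the encoder is omitted, and raising ValueError on a changed batch size is an assumption of the sketch rather than documented behaviour.

```python
import numpy as np


class TrajectorySketch:
    def __init__(self, feature_functions, weight_sampler, mean_function):
        self._features = feature_functions
        self._weight_sampler = weight_sampler
        self._mean = mean_function
        self._weights = None  # allocated on the first call, shape [B, F]

    def __call__(self, inputs):  # inputs: [N, B, D] -> [N, B, 1]
        batch_size = inputs.shape[1]
        if self._weights is None:
            self._weights = self._weight_sampler(batch_size)
        elif self._weights.shape[0] != batch_size:
            # Assumption of this sketch: a changed batch size is rejected.
            raise ValueError("batch size is fixed by the first call")
        feature_values = self._features(inputs)  # [N, B, F]
        # Each of the B trajectories uses its own row of weights.
        values = np.einsum("nbf,bf->nb", feature_values, self._weights)[..., None]
        return values + self._mean(inputs)

    def resample(self):
        """Draw new weights into the existing buffer."""
        if self._weights is not None:
            self._weights[...] = self._weight_sampler(self._weights.shape[0])

    def update(self, weight_sampler):
        """Swap in a new weight distribution and resample from it."""
        self._weight_sampler = weight_sampler
        self.resample()


rng = np.random.default_rng(0)
frequencies = rng.standard_normal((2, 8))
phases = rng.uniform(0.0, 2.0 * np.pi, size=8)
trajectory = TrajectorySketch(
    feature_functions=lambda x: np.sqrt(2.0 / 8) * np.cos(x @ frequencies + phases),
    weight_sampler=lambda b: rng.standard_normal((b, 8)),
    mean_function=lambda x: np.zeros(x.shape[:-1] + (1,)),
)
values = trajectory(rng.uniform(size=(5, 3, 2)))  # first call fixes B = 3
```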

Parameters:
  • feature_functions – Set of feature functions.

  • weight_sampler – Sampler that generates feature weight samples.

  • mean_function – The underlying model’s mean function.

  • encoder – Optional encoder with which to transform input points.

__call__(inputs: trieste.types.TensorType) trieste.types.TensorType[source]#

Call trajectory function.

resample() None[source]#

Efficiently resample in-place without retracing.

update(weight_sampler: Callable[[int], trieste.types.TensorType]) None[source]#

Efficiently update the trajectory with a new weight distribution and resample its weights.

Parameters:

weight_sampler – New sampler that generates feature weight samples.