gpflux.layers.gp_layer#

This module provides GPLayer, which implements a Sparse Variational Multioutput Gaussian Process as a Keras Layer.

Module Contents#

class GPLayer(kernel: gpflow.kernels.MultioutputKernel, inducing_variable: gpflow.inducing_variables.MultioutputInducingVariables, num_data: int, mean_function: gpflow.mean_functions.MeanFunction | None = None, *, num_samples: int | None = None, full_cov: bool = False, full_output_cov: bool = False, num_latent_gps: int | None = None, whiten: bool = True, name: str | None = None, verbose: bool = True)[source]#

Bases: tfp.layers.DistributionLambda

A sparse variational multioutput GP layer. This layer holds the kernel, inducing variables, variational distribution, and mean function.

Parameters:
  • kernel – The multioutput kernel for this layer.

  • inducing_variable – The inducing features for this layer.

  • num_data – The number of points in the training dataset (see num_data).

  • mean_function

    The mean function that will be applied to the inputs. Default: Identity.

    Note

    The Identity mean function requires the input and output dimensionality of this layer to be the same. If you want to change the dimensionality in a layer, you may want to provide a Linear mean function instead.

  • num_samples – The number of samples to draw when converting the DistributionLambda into a tf.Tensor, see _convert_to_tensor_fn(). Will be stored in the num_samples attribute. If None (the default), draw a single sample without prefixing the sample shape (see tfp.distributions.Distribution’s sample() method).

  • full_cov – Sets default behaviour of calling this layer (full_cov attribute): If False (the default), only predict marginals (diagonal of covariance) with respect to inputs. If True, predict full covariance over inputs.

  • full_output_cov – Sets default behaviour of calling this layer (full_output_cov attribute): If False (the default), only predict marginals (diagonal of covariance) with respect to outputs. If True, predict full covariance over outputs.

  • num_latent_gps – The number of (latent) GPs in the layer (which can be different from the number of outputs, e.g. with a LinearCoregionalization kernel). This is used to determine the size of the variational parameters q_mu and q_sqrt. If possible, it is inferred from the kernel and inducing_variable.

  • whiten – If True (the default), uses the whitened parameterisation of the inducing variables; see whiten.

  • name – The name of this layer.

  • verbose – The verbosity mode. Set this parameter to True to show debug information.

num_data: int[source]#

The number of points in the training dataset. This information is used to obtain the correct scaling between the data-fit and the KL term in the evidence lower bound (ELBO).
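The scaling can be sketched with plain NumPy; the variable names below are illustrative, not part of the gpflux API:

```python
import numpy as np

num_data = 1000      # size of the full training set
batch_size = 50
# Hypothetical per-point variational expectations E_q[log p(y_n | f_n)]
# for one minibatch, and a KL term for the whole layer.
var_exp = np.full(batch_size, -1.3)
kl = 4.2

# Minibatch ELBO estimate: the data-fit term is rescaled to the full
# dataset, while the KL term appears exactly once.
elbo = (num_data / batch_size) * var_exp.sum() - kl

# Keras-style losses are per datapoint, so the equivalent per-point loss
# averages the data-fit term and divides the KL by num_data.
per_point_loss = -var_exp.mean() + kl / num_data
```

Without the num_data rescaling, training on minibatches would over-weight the KL term relative to the data fit.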

whiten: bool[source]#

This parameter determines the parameterisation of the inducing variables.

If True, this layer uses the whitened (or non-centred) representation, in which (taking inducing-point inducing variables as an example) u = f(Z) = cholesky(Kuu) v, and we parameterise an approximate posterior on v as q(v) = N(q_mu, q_sqrt q_sqrtᵀ). The prior on v is p(v) = N(0, I).

If False, this layer uses the non-whitened (or centred) representation, in which we directly parameterise q(u) = N(q_mu, q_sqrt q_sqrtᵀ). The prior on u is p(u) = N(0, Kuu).
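The relationship between the two parameterisations can be sketched in NumPy; the covariance below is a synthetic stand-in for Kuu, not a real kernel matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
M = 5
A = rng.normal(size=(M, M))
Kuu = A @ A.T + 1e-6 * np.eye(M)   # synthetic stand-in for the prior covariance
L = np.linalg.cholesky(Kuu)

# Whitened parameterisation: q(v) = N(q_mu, q_sqrt q_sqrtᵀ), with u = L v.
q_mu = rng.normal(size=(M, 1))
q_sqrt = np.tril(rng.normal(size=(M, M)))

# The implied (non-whitened) distribution over u follows from the linear map:
u_mean = L @ q_mu
u_cov = L @ (q_sqrt @ q_sqrt.T) @ L.T

# Sanity check: with q(v) = p(v) = N(0, I), q(u) recovers the prior N(0, Kuu).
assert np.allclose(L @ np.eye(M) @ L.T, Kuu)
```

The whitened representation often conditions the optimisation better, because the prior on v is the same standard normal regardless of the kernel hyperparameters.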

num_samples: int | None[source]#

The number of samples drawn when coercing the output distribution of this layer to a tf.Tensor. (See _convert_to_tensor_fn().)
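The effect on sample shapes can be illustrated with a plain NumPy stand-in for the output distribution (gpflux itself draws from a tfp distribution; this only mimics the shape semantics):

```python
import numpy as np

rng = np.random.default_rng(1)
N, Q = 3, 2
mean = np.zeros((N, Q))
std = np.ones((N, Q))

# num_samples set (e.g. 5): the sample shape is prefixed, giving [num_samples, N, Q].
num_samples = 5
samples = rng.normal(mean, std, size=(num_samples, N, Q))

# num_samples is None: a single draw with no prefixed sample dimension, shape [N, Q].
single = rng.normal(mean, std)
```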

full_cov: bool[source]#

This parameter determines the behaviour of calling this layer. If False, only predict or sample marginals (diagonal of covariance) with respect to inputs. If True, predict or sample with the full covariance over the inputs.

full_output_cov: bool[source]#

This parameter determines the behaviour of calling this layer. If False, only predict or sample marginals (diagonal of covariance) with respect to outputs. If True, predict or sample with the full covariance over the outputs.

q_mu: gpflow.Parameter[source]#

The mean of q(v) or q(u) (depending on whether the whitened parameterisation is used).

q_sqrt: gpflow.Parameter[source]#

The lower-triangular Cholesky factor of the covariance of q(v) or q(u) (depending on whether the whitened parameterisation is used).

predict(inputs: gpflow.base.TensorType, *, full_cov: bool = False, full_output_cov: bool = False) Tuple[tf.Tensor, tf.Tensor][source]#

Make a prediction at N test inputs for the Q outputs of this layer, including the mean function contribution.

The covariance and its shape are determined by full_cov and full_output_cov as follows:

(co)variance shape:

                     full_output_cov=False   full_output_cov=True
  full_cov=False     [N, Q]                  [N, Q, Q]
  full_cov=True      [Q, N, N]               [N, Q, N, Q]

Parameters:
  • inputs – The inputs to predict at, with a shape of [N, D], where D is the input dimensionality of this layer.

  • full_cov – Whether to return full covariance (if True) or marginal variance (if False, the default) w.r.t. inputs.

  • full_output_cov – Whether to return full covariance (if True) or marginal variance (if False, the default) w.r.t. outputs.

Returns:

posterior mean (shape [N, Q]) and (co)variance (shape as above) at test points
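The four shape combinations can be summarised in a small helper; this function is illustrative only, not part of the gpflux API:

```python
def predicted_cov_shape(N: int, Q: int, full_cov: bool, full_output_cov: bool) -> tuple:
    """Shape of the (co)variance returned for N inputs and Q outputs."""
    if not full_cov and not full_output_cov:
        return (N, Q)            # marginal variance per input and output
    if not full_cov and full_output_cov:
        return (N, Q, Q)         # covariance across outputs, per input
    if full_cov and not full_output_cov:
        return (Q, N, N)         # covariance across inputs, per output
    return (N, Q, N, Q)          # full joint covariance
```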

call(inputs: gpflow.base.TensorType, *args: List[Any], **kwargs: Dict[str, Any]) tf.Tensor[source]#

The default behaviour upon calling this layer.

This method calls the tfp.layers.DistributionLambda super-class call method, which constructs a tfp.distributions.Distribution for the predictive distributions at the input points (see _make_distribution_fn()). You can pass this distribution to tf.convert_to_tensor, which will return samples from the distribution (see _convert_to_tensor_fn()).

This method also adds a layer-specific loss, given by the KL divergence from this layer's variational distribution to the GP prior, scaled to a per-datapoint contribution.

prior_kl() tf.Tensor[source]#

Returns the KL divergence KL[q(u) ∥ p(u)] between the variational distribution q(u) and the prior p(u). If this layer uses the whitened representation, this is KL[q(v) ∥ p(v)] instead.
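In the whitened case the prior is a standard normal, so the KL has the familiar closed form for Gaussians; a NumPy sketch (illustrative, not the gpflux implementation):

```python
import numpy as np

def kl_to_standard_normal(q_mu: np.ndarray, q_sqrt: np.ndarray) -> float:
    """KL[N(q_mu, S) || N(0, I)] with S = q_sqrt q_sqrtᵀ, q_sqrt lower-triangular."""
    M = q_mu.shape[0]
    S = q_sqrt @ q_sqrt.T
    # log|S| from the triangular factor: 2 * sum(log |diag(q_sqrt)|).
    log_det_S = 2.0 * np.sum(np.log(np.abs(np.diag(q_sqrt))))
    return 0.5 * (np.trace(S) + (q_mu.T @ q_mu) - M - log_det_S).item()

# When q equals the prior N(0, I), the KL is zero.
M = 4
assert abs(kl_to_standard_normal(np.zeros((M, 1)), np.eye(M))) < 1e-12
```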

_make_distribution_fn(previous_layer_outputs: gpflow.base.TensorType) tfp.distributions.Distribution[source]#

Construct the posterior distributions at the output points of the previous layer, depending on full_cov and full_output_cov.

Parameters:

previous_layer_outputs – The output from the previous layer, which should be coercible to a tf.Tensor.

_convert_to_tensor_fn(distribution: tfp.distributions.Distribution) tf.Tensor[source]#

Convert the predictive distributions at the input points (see _make_distribution_fn()) to a tensor of num_samples samples from that distribution. Whether the samples are correlated or marginal (uncorrelated) depends on full_cov and full_output_cov.

sample() gpflux.sampling.sample.Sample[source]#

Todo

Document this.