gpflux.layers.gp_layer#
This module provides `GPLayer`, which implements a Sparse Variational Multioutput Gaussian Process as a Keras Layer.
Module Contents#
- class GPLayer(kernel: gpflow.kernels.MultioutputKernel, inducing_variable: gpflow.inducing_variables.MultioutputInducingVariables, num_data: int, mean_function: gpflow.mean_functions.MeanFunction | None = None, *, num_samples: int | None = None, full_cov: bool = False, full_output_cov: bool = False, num_latent_gps: int | None = None, whiten: bool = True, name: str | None = None, verbose: bool = True)[source]#
Bases: `tfp.layers.DistributionLambda`
A sparse variational multioutput GP layer. This layer holds the kernel, the inducing variables, the variational distribution, and the mean function.
- Parameters:
  kernel – The multioutput kernel for this layer.
  inducing_variable – The inducing features for this layer.
  num_data – The number of points in the training dataset (see `num_data`).
  mean_function – The mean function that will be applied to the inputs. Default: `Identity`.
  Note: The `Identity` mean function requires the input and output dimensionality of this layer to be the same. If you want to change the dimensionality in a layer, you may want to provide a `Linear` mean function instead.
  num_samples – The number of samples to draw when converting the `DistributionLambda` into a `tf.Tensor`; see `_convert_to_tensor_fn()`. Will be stored in the `num_samples` attribute. If `None` (the default), draw a single sample without prefixing the sample shape (see the `sample()` method of `tfp.distributions.Distribution`).
  full_cov – Sets the default behaviour when calling this layer (the `full_cov` attribute): if `False` (the default), only predict marginals (diagonal of covariance) with respect to inputs; if `True`, predict the full covariance over inputs.
  full_output_cov – Sets the default behaviour when calling this layer (the `full_output_cov` attribute): if `False` (the default), only predict marginals (diagonal of covariance) with respect to outputs; if `True`, predict the full covariance over outputs.
  num_latent_gps – The number of (latent) GPs in the layer (which can be different from the number of outputs, e.g. with a `LinearCoregionalization` kernel). This is used to determine the size of the variational parameters `q_mu` and `q_sqrt`. If possible, it is inferred from the kernel and inducing variable.
  whiten – If `True` (the default), uses the whitened parameterisation of the inducing variables; see `whiten`.
  name – The name of this layer.
  verbose – The verbosity mode. Set this parameter to `True` to show debug information.
- num_data: int[source]#
The number of points in the training dataset. This information is used to obtain the correct scaling between the data-fit and the KL term in the evidence lower bound (ELBO).
- whiten: bool[source]#
This parameter determines the parameterisation of the inducing variables.
If `True`, this layer uses the whitened (or non-centred) representation, in which (taking inducing-point inducing variables as an example) `u = f(Z) = cholesky(Kuu) v`, and we parameterise an approximate posterior on `v` as `q(v) = N(q_mu, q_sqrt q_sqrtᵀ)`. The prior on `v` is `p(v) = N(0, I)`.
If `False`, this layer uses the non-whitened (or centred) representation, in which we directly parameterise `q(u) = N(q_mu, q_sqrt q_sqrtᵀ)`. The prior on `u` is `p(u) = N(0, Kuu)`.
- num_samples: int | None[source]#
The number of samples drawn when coercing the output distribution of this layer to a `tf.Tensor` (see `_convert_to_tensor_fn()`).
- full_cov: bool[source]#
This parameter determines the behaviour of calling this layer. If `False`, only predict or sample marginals (diagonal of covariance) with respect to inputs. If `True`, predict or sample with the full covariance over the inputs.
- full_output_cov: bool[source]#
This parameter determines the behaviour of calling this layer. If `False`, only predict or sample marginals (diagonal of covariance) with respect to outputs. If `True`, predict or sample with the full covariance over the outputs.
- q_mu: gpflow.Parameter[source]#
The mean of `q(v)` or `q(u)` (depending on whether the `whiten`ed parameterisation is used).
- q_sqrt: gpflow.Parameter[source]#
The lower-triangular Cholesky factor of the covariance of `q(v)` or `q(u)` (depending on whether the `whiten`ed parameterisation is used).
- predict(inputs: gpflow.base.TensorType, *, full_cov: bool = False, full_output_cov: bool = False) Tuple[tf.Tensor, tf.Tensor] [source]#
Make a prediction at N test inputs for the Q outputs of this layer, including the mean function contribution.
The covariance and its shape are determined by `full_cov` and `full_output_cov` as follows:

| (co)variance shape | full_output_cov=False | full_output_cov=True |
| --- | --- | --- |
| full_cov=False | [N, Q] | [N, Q, Q] |
| full_cov=True | [Q, N, N] | [N, Q, N, Q] |
- Parameters:
  inputs – The inputs to predict at, with a shape of [N, D], where D is the input dimensionality of this layer.
  full_cov – Whether to return the full covariance (if `True`) or the marginal variance (if `False`, the default) w.r.t. inputs.
  full_output_cov – Whether to return the full covariance (if `True`) or the marginal variance (if `False`, the default) w.r.t. outputs.
- Returns:
  The posterior mean (shape [N, Q]) and (co)variance (shape as above) at the test points.
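A NumPy sketch of how the `full_cov=True` shape [Q, N, N] relates to the marginal-variance shape [N, Q] (the covariance values below are synthetic, purely to demonstrate the shape bookkeeping):

```python
import numpy as np

N, Q = 4, 2
rng = np.random.default_rng(0)

# A synthetic full covariance over inputs, shape [Q, N, N]: one symmetric
# positive-semidefinite N x N covariance matrix per output.
A = rng.standard_normal((Q, N, N))
full_cov = A @ np.transpose(A, (0, 2, 1))

# The marginal variances (the full_cov=False case, shape [N, Q]) are the
# diagonals of each per-output covariance matrix, transposed from [Q, N].
marginal_var = np.diagonal(full_cov, axis1=1, axis2=2).T

print(marginal_var.shape)  # (4, 2)
```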
- call(inputs: gpflow.base.TensorType, *args: List[Any], **kwargs: Dict[str, Any]) tf.Tensor [source]#
The default behaviour upon calling this layer.
This method calls the `tfp.layers.DistributionLambda` super-class `call` method, which constructs a `tfp.distributions.Distribution` for the predictive distributions at the input points (see `_make_distribution_fn()`). You can pass this distribution to `tf.convert_to_tensor`, which will return samples from the distribution (see `_convert_to_tensor_fn()`).
This method also adds a layer-specific loss function, given by the KL divergence between this layer and the GP prior (scaled to per-datapoint).
- prior_kl() tf.Tensor [source]#
Returns the KL divergence `KL[q(u) ∥ p(u)]` from the prior `p(u)` to the variational distribution `q(u)`. If this layer uses the `whiten`ed representation, returns `KL[q(v) ∥ p(v)]` instead.
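In the whitened case the KL term has a simple closed form, since the prior is `N(0, I)`. The helper below is an illustrative NumPy sketch of that formula, not gpflux's implementation (which delegates to GPflow):

```python
import numpy as np

def gauss_kl_whitened(q_mu, q_sqrt):
    """KL[q(v) || p(v)] with q(v) = N(q_mu, q_sqrt q_sqrtᵀ) and p(v) = N(0, I).

    q_mu: mean vector, shape [M]; q_sqrt: lower-triangular factor, shape [M, M].
    Closed form: 0.5 * (tr(S) + q_muᵀ q_mu - M - log det S), with S = q_sqrt q_sqrtᵀ.
    """
    M = q_mu.shape[0]
    trace_S = np.sum(q_sqrt ** 2)                     # tr(L Lᵀ) = Σᵢⱼ Lᵢⱼ²
    logdet_S = 2.0 * np.sum(np.log(np.diag(q_sqrt)))  # log det(L Lᵀ) = 2 Σᵢ log Lᵢᵢ
    return 0.5 * (trace_S + q_mu @ q_mu - M - logdet_S)

# When q(v) equals the prior (zero mean, identity covariance), the KL is zero.
print(gauss_kl_whitened(np.zeros(3), np.eye(3)))  # 0.0
```

At initialisation gpflux sets `q_mu = 0` and `q_sqrt = I` (scaled), so the KL contribution to the ELBO starts near zero and grows as the variational posterior moves away from the prior.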
- _make_distribution_fn(previous_layer_outputs: gpflow.base.TensorType) tfp.distributions.Distribution [source]#
Construct the posterior distributions at the output points of the previous layer, depending on `full_cov` and `full_output_cov`.
- Parameters:
  previous_layer_outputs – The output from the previous layer, which should be coercible to a `tf.Tensor`.
- _convert_to_tensor_fn(distribution: tfp.distributions.Distribution) tf.Tensor [source]#
Convert the predictive distributions at the input points (see `_make_distribution_fn()`) to a tensor of `num_samples` samples from that distribution. Whether the samples are correlated or marginal (uncorrelated) depends on `full_cov` and `full_output_cov`.
- sample() gpflux.sampling.sample.Sample [source]#
Todo: Document this.