gpflux.layers.bayesian_dense_layer

This module provides BayesianDenseLayer, which implements a variational Bayesian dense (fully-connected) neural network layer as a Keras Layer.

Module Contents

class BayesianDenseLayer(input_dim: int, output_dim: int, num_data: int, w_mu: numpy.ndarray | None = None, w_sqrt: numpy.ndarray | None = None, activation: Callable | None = None, is_mean_field: bool = True, temperature: float = 0.0001)

Bases: gpflux.layers.trackable_layer.TrackableLayer

A dense (fully-connected) layer for variational Bayesian neural networks.

This layer holds the mean and the square root of the variance of the variational distribution over the weights, together with a temperature for cooling (or heating) the posterior.

Parameters:
  • input_dim – The input dimension (excluding bias) of this layer.

  • output_dim – The output dimension of this layer.

  • num_data – The number of points in the training dataset (used for scaling the KL regulariser).

  • w_mu – Initial value of the variational mean for weights + bias. If not specified, this defaults to xavier_initialization_numpy for the weights and zero for the bias.

  • w_sqrt – Initial value of the variational Cholesky of the (co)variance for weights + bias. If not specified, this defaults to 1e-5 * Identity.

  • activation – The activation function. If not specified, this defaults to the identity.

  • is_mean_field – Determines whether the approximation to the weight posterior is mean field. Must be consistent with the shape of w_sqrt, if specified.

  • temperature – The KL loss will be scaled by this factor. Can be used for cooling (< 1.0) or heating (> 1.0) the posterior. As suggested in “How Good is the Bayes Posterior in Deep Neural Networks Really?” by Wenzel et al. (2020), the default value is a cold 1e-4.
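
For orientation, a minimal construction sketch (the sizes and the tanh activation below are illustrative choices, not defaults):

```python
import tensorflow as tf
from gpflux.layers.bayesian_dense_layer import BayesianDenseLayer

# Illustrative sizes: D = 5 input features, Q = 3 outputs, 1000 training points.
layer = BayesianDenseLayer(
    input_dim=5,
    output_dim=3,
    num_data=1000,          # used to scale the KL regulariser
    activation=tf.nn.tanh,  # any callable; identity if omitted
)
```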

build(input_shape: gpflux.types.ShapeType) → None

Build the variables necessary on the first call.

predict_samples(inputs: gpflow.base.TensorType, *, num_samples: int | None = None) → tf.Tensor

Samples from the approximate posterior at N test inputs, with input_dim = D, output_dim = Q.

Parameters:
  • inputs – The inputs to predict at; shape [N, D].

  • num_samples – The number of samples, S, to draw.

Returns:

Samples, shape [S, N, Q] if S is not None else [N, Q].
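
Continuing the construction sketch above (the explicit build call stands in for the building that Keras would otherwise trigger on the first call):

```python
import numpy as np

X = np.random.randn(10, 5)  # N = 10 test inputs, each of dimension D = 5

layer.build(X.shape)  # create the variational parameters w_mu / w_sqrt

s1 = layer.predict_samples(X)                 # shape [N, Q] = [10, 3]
s2 = layer.predict_samples(X, num_samples=7)  # shape [S, N, Q] = [7, 10, 3]
```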

call(inputs: gpflow.base.TensorType, training: bool | None = False) → tf.Tensor | gpflow.models.model.MeanAndVariance

The default behaviour upon calling this layer: the inputs are propagated through a single sample from the weight posterior. During training, the KL regulariser (scaled by temperature and divided by num_data) is added to the layer's losses.
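
A sketch of this bookkeeping, continuing the example above; the comments reflect the temperature and num_data scaling described in the parameter list:

```python
y = layer(X)  # forward pass: one posterior sample, tf.Tensor of shape [N, Q]

_ = layer(X, training=True)  # training mode also registers the KL term
print(layer.losses)          # contains temperature * KL / num_data as a scalar tensor
```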

prior_kl() → tf.Tensor

Returns the KL divergence KL[q(u)∥p(u)] from the variational distribution q(u) = N(w_mu, w_sqrt²) to the prior p(u) = N(0, I).
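
Since both distributions are Gaussian, this KL has the standard closed form (writing μ = w_mu, L = w_sqrt, Σ = L Lᵀ, and d for the length of the flattened weights-plus-bias vector, i.e. (input_dim + 1) · output_dim):

KL[q(u) ∥ p(u)] = ½ (tr(Σ) + μᵀμ − d − log det Σ)

In the mean-field case L is diagonal, so tr(Σ) = ∑ᵢ Lᵢᵢ² and log det Σ = 2 ∑ᵢ log Lᵢᵢ.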