markovflow.posterior
Module containing posterior processes for GP models.
PosteriorProcess
Bases: tf.Module, abc.ABC
Abstract class for forming a posterior process.
Posteriors that extend this class must implement the sample_state_trajectories(), sample_f(), predict_state() and predict_f() methods.
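As a rough illustration of this contract, here is a minimal sketch of a subclass. It is not library code: the method signatures are inferred from the parameter descriptions on this page and may differ from the actual markovflow API.

```python
import abc
import tensorflow as tf

# Minimal sketch of the abstract interface described above. The signatures
# are inferred from the parameter docs on this page, not taken from the
# markovflow source, so treat them as approximate.
class ToyPosteriorProcess(tf.Module, abc.ABC):
    @abc.abstractmethod
    def sample_state_trajectories(self, new_time_points, sample_shape):
        """Joint state samples at new_time_points and at the conditioning points."""

    @abc.abstractmethod
    def sample_f(self, new_time_points, sample_shape):
        """Function-space samples (projected states) at new_time_points."""

    @abc.abstractmethod
    def predict_state(self, new_time_points):
        """Marginal state mean and covariance at (sorted) new_time_points."""

    @abc.abstractmethod
    def predict_f(self, new_time_points, full_output_cov=False):
        """Marginal function mean and (co)variance at (sorted) new_time_points."""
```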
sample_state
Generate joint state samples at new_time_points.
new_time_points – Time points to generate sample trajectories for, with shape batch_shape + [num_time_points].
sample_shape – A SampleShape that is the shape (or number of) sampled trajectories to draw.
input_data –
A tuple of time points and observations containing the data:
A tensor of inputs with shape batch_shape + [num_data]
A tensor of observations with shape batch_shape + [num_data, observation_dim]
This is an optional argument only passed in for inference with an importance-weighted posterior.
A tuple of tensors containing:
Sampled trajectories at new points, with shape sample_shape + batch_shape + [num_time_points, state_dim]
Sampled trajectories at conditioning points, with shape sample_shape + batch_shape + [num_conditioning_points, state_dim]
sample_state_trajectories
Generate joint sampled state trajectories evaluated both at new_time_points and at some points that we condition on for obtaining the posterior.
NotImplementedError – Must be implemented in derived classes.
sample_f
Generate joint function evaluation samples (projected states) at new_time_points.
A tensor containing sampled trajectories, with shape sample_shape + batch_shape + [num_time_points, output_dim].
predict_state
Predict state at new_time_points. Note these time points should be sorted.
new_time_points – Time points to generate observations for, with shape batch_shape + [num_new_time_points].
predict_f
Predict marginal function values at new_time_points. Note these time points should be sorted.
new_time_points – Time points to generate observations for, with shape batch_shape + [num_new_time_points].
full_output_cov – Either full output covariance (True) or marginal variances (False).
ConditionalProcess
Bases: PosteriorProcess
Represents a posterior process indexed on the real line.
This means \(q(s(.))\) is built by combining the marginals \(q(s(Z))\) and the conditional process \(p(s(.)|s(Z))\) into:
\[q(s(.)) = \int p(s(.)\,|\,s(Z)=u)\, q(u)\, du\]
The marginals at discrete time inputs are available in closed form (see the predict_f() method).
It also includes methods for sampling from the posterior process.
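For intuition, the sketch below (toy matrices, not markovflow objects) shows how a linear-Gaussian conditional combines with the marginals of \(q(s(Z))\) to give such closed-form marginals: if \(s(t)\,|\,s(Z)=u \sim N(Pu, T)\) and \(q(u) = N(m, S)\), then \(q(s(t)) = N(Pm, PSPᵀ + T)\).

```python
import numpy as np

# Toy stand-ins for the conditional p(s(t)|s(Z)=u) = N(P u, T) and the
# marginal q(s(Z)) = N(m, S); none of these are markovflow objects.
rng = np.random.default_rng(0)
state_dim, num_conditioning = 2, 3
d = num_conditioning * state_dim

P = rng.standard_normal((state_dim, d))          # conditional projection
T = 0.1 * np.eye(state_dim)                      # conditional covariance
m = rng.standard_normal(d)                       # mean of q(s(Z))
A = rng.standard_normal((d, d))
S = A @ A.T + np.eye(d)                          # covariance of q(s(Z))

# Marginalising u out of p(s(t)|u) q(u) gives another Gaussian in closed form.
marginal_mean = P @ m
marginal_cov = P @ S @ P.T + T
```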
posterior_dist – The posterior represented by a Gauss-Markov distribution used for inference. For variational models this is the model defined by the variational distribution.
kernel – The kernel of the prior process.
conditioning_time_points – The time points to condition on for inference, with shape batch_shape + [num_time_points].
mean_function – The mean function of the process, added to the latent function values \(f\).
predict_state
Predicted mean and covariance for the new time points, with respective shapes batch_shape + [num_new_time_points, state_dim] and batch_shape + [num_new_time_points, state_dim, state_dim].
Note
new_time_points that are far outside the self.conditioning_time_points specified when instantiating the class will revert to the prior.
predict_f
Predicted mean and covariance for the new time points, with respective shapes batch_shape + [num_new_time_points, output_dim] and either batch_shape + [num_new_time_points, output_dim, output_dim] (full_output_cov=True) or batch_shape + [num_new_time_points, output_dim] (full_output_cov=False).
Generate joint state samples at new_time_points and the self.conditioning_time_points specified when instantiating the class.
See Appendix 2 of “Doubly Sparse Variational Gaussian Processes” for a derivation.
The following notation is used:
\(t\) - a vector of new time points
\(z\) - a vector of the conditioning time points
\(sₚ/uₚ\) - prior state sample at \(t/z\)
\(sₒ/uₒ\) - posterior state sample at \(t/z\)
\(p(.)\) - the prior
\(q(.)\) - the posterior
Jointly sample from the prior at new and conditioning points:
\[sₚ, uₚ \sim p(s(t), s(z))\]
And sample from the posterior at the conditioning points:
\[uₒ \sim q(s(z))\]
A sample from the posterior state is given by:
\[sₒ = sₚ + E[s(t)\,|\,s(z) = uₒ] - E[s(t)\,|\,s(z) = uₚ]\]
Denoting by \(z₋, z₊\) the two points in \(z\) closest to each new point \(tₖ\), and \(vₖ = [s(z₋), s(z₊)]\), the conditional mean satisfies:
\[E[s(tₖ)\,|\,s(z)] = E[s(tₖ)\,|\,vₖ]\]
That is, the conditional mean is local; it depends only on the neighbouring conditioning states.
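The toy sketch below illustrates the three steps with dense matrices and a squared-exponential stand-in for the prior; it is not the library's Markovian implementation, where the conditional mean is computed locally from the two neighbouring conditioning states rather than from the full conditional used here.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 5.0, 20)          # new time points
z = np.linspace(0.25, 4.75, 6)         # conditioning time points

def k(a, b, lengthscale=1.0):
    """Toy squared-exponential kernel standing in for the prior covariance."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

# 1. Jointly sample (s_p, u_p) from the prior p(s(t), s(z)).
all_points = np.concatenate([t, z])
K_all = k(all_points, all_points) + 1e-8 * np.eye(all_points.size)
joint = rng.multivariate_normal(np.zeros(all_points.size), K_all)
s_p, u_p = joint[: t.size], joint[t.size:]

# 2. Sample u_o from the posterior q(s(z)) (toy posterior: the prior shrunk by 0.5).
K_zz = k(z, z) + 1e-8 * np.eye(z.size)
u_o = rng.multivariate_normal(np.zeros(z.size), 0.5 * K_zz)

# 3. Correct the prior sample with the conditional mean (Matheron-style update):
#    s_o = s_p + E[s(t)|s(z) = u_o] - E[s(t)|s(z) = u_p].
s_o = s_p + k(t, z) @ np.linalg.solve(K_zz, u_o - u_p)
```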
sample_shape – A SampleShape that is the shape of sampled trajectories to draw. This can be either an integer or a tuple/list of integers.
Note this argument will be ignored if your posterior is an AnalyticPosteriorProcess.
A tensor containing the sampled trajectories, with shape sample_shape + batch_shape + [num_time_points, output_dim].
AnalyticPosteriorProcess
Bases: ConditionalProcess
Represents the (approximate) posterior process of a GP model.
It inherits the marginal prediction and sampling methods from the parent ConditionalProcess class.
It also includes a method to predict the observations (see predict_y()).
likelihood – Likelihood defining how to project from f-space to an observation.
predict_y
Predict observation marginals at new_time_points. Note these time points should be sorted.
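For a Gaussian likelihood this projection has a simple closed form, sketched below with toy numbers (this is not the library's likelihood API); non-Gaussian likelihoods generally require quadrature or sampling instead.

```python
import numpy as np

# Marginals of f at the new time points (toy values).
f_mean = np.array([0.1, -0.3, 0.7])
f_var = np.array([0.20, 0.15, 0.40])

# For y = f + eps with eps ~ N(0, noise_var), the observation marginals are
# N(f_mean, f_var + noise_var).
noise_var = 0.05
y_mean, y_var = f_mean, f_var + noise_var
```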
ImportanceWeightedPosteriorProcess
Represents the approximate posterior process of a GP model.
The approximate posterior process is inferred via importance-weighted variational inference.
num_importance_samples – The number of importance-weighted samples.
proposal_dist – The proposal represented by a Gauss-Markov distribution, from which we draw samples. This is the model defined by the variational distribution.
conditioning_time_points – Time points to condition on for inference, with shape batch_shape + [num_time_points].
_log_qu_density
Log density of the posterior process evaluated at the conditioning points.
samples_u – State samples at the conditioning time points, with shape sample_shape + [num_conditioning_points, state_dim].
stop_gradient – Whether to stop the gradient flow through the samples. It is useful to do so when optimising the proposal distribution with control variates for reduced variance.
The log density \(\log q(u)\), with shape [num_samples].
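A minimal TensorFlow sketch of the stop_gradient switch (with a toy diagonal Gaussian standing in for the Gauss-Markov density): with reparameterised samples, the log density depends on the variational parameters both through the density and through the samples; stopping the gradient on the samples keeps only the density path, which is what control-variate schemes need.

```python
import tensorflow as tf

mean = tf.Variable(tf.zeros(4))            # a variational parameter of q(u)
eps = tf.random.normal([8, 4])             # fixed noise for reparameterised draws

with tf.GradientTape() as tape:
    samples_u = mean + eps                   # reparameterised samples from q(u)
    samples_u = tf.stop_gradient(samples_u)  # cut the gradient path through the samples
    # Toy diagonal-Gaussian log density standing in for log q(u).
    log_qu = -0.5 * tf.reduce_sum((samples_u - mean) ** 2, axis=-1)
    objective = tf.reduce_mean(log_qu)

grad = tape.gradient(objective, mean)      # gradient flows through the density only
```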
log_importance_weights
Compute the log-importance weights for some state samples.
The importance weights are given by:
\[wₖ = \frac{p(y, sₖ, uₖ)}{q(sₖ, uₖ)}\]
Because it is assumed that \(q(s | u) = p(s | u)\), the weights reduce to:
\[wₖ = \frac{p(y | sₖ)\, p(uₖ)}{q(uₖ)}\]
We evaluate this ratio for some tensors of samples_s and samples_u, which are assumed to have been drawn from \(q(s, u)\). To do this, samples_s are projected to \(f\) before being passed to the likelihood object.
samples_s – A tensor of samples drawn from \(p(s|u)\), with shape sample_shape + batch_shape + [num_data, state_dim].
samples_u – A tensor of samples drawn from \(q(u)\), with shape sample_shape + batch_shape + [num_inducing, state_dim].
stop_gradient – Whether to call stop gradient on \(q(u)\). This is useful for control variate schemes.
A tensor of log-importance weights, with shape sample_shape.
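As a self-contained sketch of the reduced weights, \(\log wₖ = \log p(y | sₖ) + \log p(uₖ) - \log q(uₖ)\), the example below uses univariate Gaussians as stand-ins for each factor; in the library the densities come from the likelihood and the Gauss-Markov prior and proposal.

```python
import numpy as np

def gauss_logpdf(x, mean, var):
    """Elementwise log density of a univariate Gaussian."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

rng = np.random.default_rng(2)
num_samples, num_data, num_inducing = 8, 50, 10

y = rng.standard_normal(num_data)                       # observations
s = rng.standard_normal((num_samples, num_data))        # samples drawn from p(s|u)
u = rng.standard_normal((num_samples, num_inducing))    # samples drawn from q(u)

log_lik = gauss_logpdf(y, s, 0.1).sum(axis=-1)          # log p(y | s_k)
log_p_u = gauss_logpdf(u, 0.0, 1.0).sum(axis=-1)        # log p(u_k) under the prior
log_q_u = gauss_logpdf(u, 0.2, 0.8).sum(axis=-1)        # log q(u_k) under the proposal

log_w = log_lik + log_p_u - log_q_u                     # log-importance weights, shape [num_samples]
```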
_iwvi_samples_and_weights
Sample from q(states) indexed by new_time_points and compute the associated log-weights.
new_time_points – Ordered time inputs at which to sample, with shape batch_shape + [num_new_time_points].
sample_shape – A SampleShape that specifies how many samples to draw, with shape (..., num_importance_samples).
State samples from the posterior and the associated log-weights, with shapes:
Samples at the new time points: sample_shape + batch_shape + [num_new_time_points, state_dim]
Log-weights: sample_shape
Samples at the conditioning points: sample_shape + batch_shape + [num_conditioning_points, state_dim]
Sample the importance-weighted posterior over states.
new_time_points – Ordered time input from which to sample, with shape batch_shape + [num_new_time_points].
The ordered state samples, with shape sample_shape + batch_shape + [num_new_time_points, state_dim].
Sample the importance-weighted (IWVI) posterior over functions.
Note that to compute the expected value of some function under the iwvi posterior, it is likely to be more efficient to use expected_value().
The ordered samples of the latent functions, with shape [num_samples] + batch_shape + [num_new_time_points, num_outputs].
expected_value
Compute the expected value of the function func acting on a random variable \(f\).
\(f\) is represented by a GP in this case, using importance sampling at the times given in new_time_points. That is:
\[E_{qₚ}[func(f)] \approx \frac{Σₖ wₖ\, func(fₖ)}{Σₖ wₖ}\]
…where:
\(qₚ\) is the importance-weighted approximate posterior distribution of \(f\)
\(wₖ\) are the importance weights
For example, to compute the posterior mean we set func = tf.identity.
func – The function to compute the expected value of. func should act on the last dimension of a tensor. That last dimension will have length as specified by the output_dim of the underlying emission model. The return shape of func need not be the same, but we expect all other dimensions to broadcast.
A tensor with shape batch_shape + [num_new_time_points, output_dim].
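A short sketch of the self-normalised estimate above, computed from log-weights with a max shift for numerical stability; f_samples and log_w are placeholders for the outputs of the sampling methods on this page.

```python
import numpy as np

rng = np.random.default_rng(3)
num_samples, num_new_time_points, output_dim = 16, 30, 1

f_samples = rng.standard_normal((num_samples, num_new_time_points, output_dim))
log_w = rng.standard_normal(num_samples)        # unnormalised log-importance weights

# Self-normalised weights: shift by the max before exponentiating to avoid overflow.
w = np.exp(log_w - log_w.max())
w /= w.sum()

func = lambda x: x                              # identity recovers the posterior mean
expected = np.tensordot(w, func(f_samples), axes=1)   # shape [num_new_time_points, output_dim]
```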
predict_state
Not applicable to ImportanceWeightedPosteriorProcess. The marginal state predictions are not available in closed form.
predict_f
Not applicable to ImportanceWeightedPosteriorProcess. The marginal function predictions are not available in closed form.
_correct_mean_shape
Helper function that checks whether the state space model is defined over a StackKernel, so that it can bring the output of the mean function to the right shape. In any other case, the mean is returned unaltered.
mean – The output of a MeanFunction, with shape batch_shape + [num_data, output_dim], where batch_shape[-1] = output_dim in the case of a StackKernel.
kernel – The corresponding kernel of the GaussMarkovDistribution.
The mean value with the correct shape: batch_shape + [num_data, output_dim], or batch_shape[:-1] + [num_data, output_dim] in the case of a StackKernel (since the last dimension of batch_shape is the output_dim).