markovflow.models.spatio_temporal_variational
Module containing a model for sparse spatio-temporal variational inference.
SparseSpatioTemporalKernel
Bases: markovflow.kernels.IndependentMultiOutput
A spatio-temporal kernel k(s,t) can be built from the product of a spatial kernel kₛ(s) and a Markovian temporal kernel kₜ(t), i.e. k(s,t) = kₛ(s) kₜ(t)
A GP f(.) ∈ ℝ^m with kernel k(Z,.) [with space marginalized to locations Z] can be built as f(.) = chol(Kₛ(Z, Z)) @ [H s₁(.),…, H sₘ(.)],
where s₁(.),…,sₘ(.) are iid SDEs from the equivalent state-space representation of the Markovian kernel kₜ(t).
kernel_space – spatial kernel
kernel_time – temporal kernel
inducing_space – spatial inducing points
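The construction above can be checked numerically on a small grid. The following is a minimal NumPy sketch, not markovflow code: the helper functions `rbf` and `exponential` and all input values are illustrative assumptions. It verifies that mixing m iid temporal processes with chol(Kₛ(Z, Z)) yields the separable covariance Kₛ ⊗ Kₜ.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between 1-D input arrays (illustrative spatial kernel)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def exponential(a, b, lengthscale=1.0):
    """Matern-1/2 kernel, a simple example of a Markovian temporal kernel."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / lengthscale)

Z = np.array([0.0, 0.7, 1.5])        # assumed spatial inducing locations (m = 3)
t = np.array([0.0, 0.4, 1.0, 1.1])   # assumed time points

K_s = rbf(Z, Z) + 1e-9 * np.eye(len(Z))   # small jitter for a stable Cholesky
K_t = exponential(t, t)
L_s = np.linalg.cholesky(K_s)

# m iid temporal processes stacked space-major: Cov[g] = I_m ⊗ K_t.
cov_iid = np.kron(np.eye(len(Z)), K_t)

# Mixing with chol(K_s) acts on the output index only: f = (L_s ⊗ I) g.
mix = np.kron(L_s, np.eye(len(t)))
cov_f = mix @ cov_iid @ mix.T

# Separable space-time covariance on the grid Z × t.
cov_separable = np.kron(K_s, K_t)
print(np.allclose(cov_f, cov_separable))
```

The check works on function values directly; in the kernel itself the temporal factor is carried by state-space models rather than dense matrices.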
generate_emission_model
Generate the emission matrix H. This is the direct sum of the shared m child emission matrices H, pre-multiplied by the Cholesky factor of the spatial kernel evaluated at Zₛ.
chol(Kₛ(Zₛ, Zₛ)) @ [H,…, H]
time_points – The time points over which the emission model is defined, with shape batch_shape + [num_data].
The emission model associated with this kernel.
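As a rough illustration of chol(Kₛ(Zₛ, Zₛ)) @ [H,…, H] for a single time point, the sketch below builds the direct sum with `np.kron`. The child emission `H_child = [1, 0]` (state dimension 2) and the values in `K_s` are assumptions made for the example; this is not the library's implementation.

```python
import numpy as np

m = 3                                  # number of spatial inducing points
H_child = np.array([[1.0, 0.0]])       # assumed child emission, state_dim d = 2
K_s = np.array([[1.0, 0.5, 0.2],
                [0.5, 1.0, 0.5],
                [0.2, 0.5, 1.0]])      # illustrative K_s(Z_s, Z_s)
L_s = np.linalg.cholesky(K_s)

# Direct sum of m copies of H: block-diagonal matrix of shape [m, m * d].
H_direct_sum = np.kron(np.eye(m), H_child)

# Emission matrix for one time point: chol(K_s) @ [H, ..., H].
emission = L_s @ H_direct_sum
print(emission.shape)
```

With this choice of `H_child`, every even column of `emission` is a column of the Cholesky factor and every odd column is zero, because only the first component of each child state is observed.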
state_to_space_conditional_projection
Generates the matrix P in the conditional mean E[f(x,t)|s(t)] = P s(t). It is given by combining
E[f(x,t)|f(Zₛ)] = Kₛ(x, Zₛ)Kₛ(Zₛ, Zₛ)⁻¹f(Zₛ)
E[f(Zₛ)|s(t)] = chol(Kₛ(Zₛ, Zₛ)) @ [H,…, H] s(t)
leading to
E[f(x,t)|s(t)] = Kₛ(x, Zₛ)Kₛ(Zₛ, Zₛ)⁻¹ chol(Kₛ(Zₛ, Zₛ)) @ [H,…, H] s(t)
= Kₛ(x, Zₛ) chol(Kₛ(Zₛ, Zₛ))⁻ᵀ @ [H,…, H] s(t)
inputs – Time point and associated spatial dimension to generate observations for, with shape batch_shape + [space_dim + 1, num_new_time_points].
The projection tensor with shape batch_shape + [num_new_time_points, obs_dim, state_dim].
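The simplification in the derivation of P uses Kₛ(Zₛ, Zₛ)⁻¹ chol(Kₛ(Zₛ, Zₛ)) = chol(Kₛ(Zₛ, Zₛ))⁻ᵀ. A small NumPy check of that identity follows; the `rbf` helper, the locations and the child emission `H_child` are illustrative assumptions rather than markovflow objects.

```python
import numpy as np

def rbf(a, b):
    """Squared-exponential kernel between 1-D input arrays (illustrative)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * diff ** 2)

Z = np.array([0.0, 0.7, 1.5])          # assumed spatial inducing locations
x = np.array([0.3, 1.0])               # assumed new spatial locations
H_child = np.array([[1.0, 0.0]])       # assumed child emission, state_dim = 2

K_ZZ = rbf(Z, Z) + 1e-9 * np.eye(len(Z))
K_xZ = rbf(x, Z)
L_s = np.linalg.cholesky(K_ZZ)
H_direct_sum = np.kron(np.eye(len(Z)), H_child)

# Two-step form: K_s(x, Z) K_s(Z, Z)^{-1} chol(K_s(Z, Z)) @ [H, ..., H].
P_two_step = K_xZ @ np.linalg.solve(K_ZZ, L_s @ H_direct_sum)

# Simplified form: K_s(x, Z) chol(K_s(Z, Z))^{-T} @ [H, ..., H].
P_simplified = np.linalg.solve(L_s, K_xZ.T).T @ H_direct_sum

print(np.allclose(P_two_step, P_simplified))
```

The simplified form needs only one triangular solve against the Cholesky factor instead of a solve against the full Kₛ(Zₛ, Zₛ).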
SpatioTemporalBase
Bases: markovflow.models.models.MarkovFlowSparseModel, abc.ABC
Base class for Spatio-temporal GP regression using a factor kernel k_space_time((s,t),(s’,t’)) = k_time(t,t’) * k_space(s,s’)
where k_time is a Markovian kernel.
inducing_space – inducing space points [Ms, D]
kernel_space – Gpflow space kernel
kernel_time – Markovflow time kernel
likelihood – a likelihood object
mean_function – The mean function for the GP. Defaults to no mean function.
space_time_predict_f
Predict marginal function values at inputs. Note the time points should be sorted.
inputs – Time point and associated spatial dimension to generate observations for, with shape batch_shape + [space_dim + 1, num_new_time_points].
Predicted mean and covariance for the new time points, with respective shapes batch_shape + [num_new_time_points, output_dim] and either batch_shape + [num_new_time_points, output_dim, output_dim] or batch_shape + [num_new_time_points, output_dim].
loss
Return the loss, which is the negative evidence lower bound (ELBO).
input_data – A tuple of space-time points and observations containing the data at which to calculate the loss for training the model.
posterior
Posterior
dist_q
Posterior state space model on inducing states
dist_p
Prior state space model on inducing states
elbo
Calculates the evidence lower bound (ELBO) on log p(y).
input_data – A tuple of space-time points and observations containing data at which to calculate the loss for training the model.
A scalar tensor (summed over the batch_shape dimension) representing the ELBO.
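To make the lower-bound property concrete, here is a toy NumPy sketch for a dense (non-sparse) GP with a Gaussian likelihood, where the bound can be compared against the exact log evidence. It assumes the standard decomposition ELBO = E_q[log p(y|f)] - KL[q || p]; the functions `gaussian_elbo` and `log_marginal` and all data values are hypothetical and do not reflect how the model computes its ELBO on inducing states.

```python
import numpy as np

def gaussian_elbo(y, m, S, K, noise_var):
    """ELBO for q(f) = N(m, S), prior p(f) = N(0, K), Gaussian likelihood."""
    n = len(y)
    # Expected log-likelihood under q.
    var_exp = np.sum(
        -0.5 * np.log(2 * np.pi * noise_var)
        - 0.5 * ((y - m) ** 2 + np.diag(S)) / noise_var
    )
    # KL[q || p] between two multivariate Gaussians.
    K_inv = np.linalg.inv(K)
    kl = 0.5 * (
        np.trace(K_inv @ S) + m @ K_inv @ m - n
        + np.linalg.slogdet(K)[1] - np.linalg.slogdet(S)[1]
    )
    return var_exp - kl

def log_marginal(y, K, noise_var):
    """Exact log p(y) for the same conjugate model."""
    n = len(y)
    C = K + noise_var * np.eye(n)
    return -0.5 * (y @ np.linalg.solve(C, y)
                   + np.linalg.slogdet(C)[1] + n * np.log(2 * np.pi))

t = np.array([0.0, 0.5, 1.2, 2.0])
K = np.exp(-np.abs(t[:, None] - t[None, :]))   # Matern-1/2 prior covariance
y = np.array([0.3, -0.1, 0.8, 0.4])
noise_var = 0.1

# Exact posterior: the bound is tight here.
S_opt = np.linalg.inv(np.linalg.inv(K) + np.eye(len(t)) / noise_var)
m_opt = S_opt @ y / noise_var

elbo_opt = gaussian_elbo(y, m_opt, S_opt, K, noise_var)
elbo_prior = gaussian_elbo(y, np.zeros(len(t)), K, K, noise_var)  # q = prior
evidence = log_marginal(y, K, noise_var)
print(elbo_prior, elbo_opt, evidence)
```

Setting q to the exact posterior makes the ELBO equal to log p(y); any other q, such as the prior, gives a strictly smaller value.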
predict_log_density
Compute the log density of the data at the new data points.
kernel
Return the kernel of the GP.
inducing_time
Return the temporal inducing inputs of the model.
inducing_space
Return the spatial inducing inputs of the model.
SpatioTemporalSparseVariational
Bases: SpatioTemporalBase
Model for Variational Spatio-temporal GP regression using a factor kernel k_space_time((s,t),(s’,t’)) = k_time(t,t’) * k_space(s,s’)
The following notation is used:
* X=(x,t) - the space-time points of the training data.
* zₛ - the space inducing/pseudo points.
* zₜ - the time inducing/pseudo points.
* y - observations corresponding to points X.
* f(.,.) - the spatio-temporal process.
* x(.,.) - the SSM formulation of the spatio-temporal process.
* u(.) = x(zₛ,.) - the spatio-temporal SSM marginalized at zₛ.
* p(y | f) - the likelihood.
* p(.) - the prior distribution.
* q(.) - the variational distribution.
This can be seen as the temporal extension of gpflow.SVGP, where instead of fixed inducing variables u, they are now time dependent u(t) and follow a Markov chain.
For a fixed set of spatial inducing inputs zₛ, p(x(zₛ, .)) is a continuous-time process of state dimension Mₛd.
For a fixed time slice t, p(x(.,t)) ~ GP(0, kₛ).
The following conditional independence holds: p(x(s,t) | x(zₛ, .)) = p(x(s,t) | x(zₛ, t)), i.e., prediction at a new point at time t given x(zₛ, .) only depends on x(zₛ, t).
This builds a spatially sparse process as q(x(.,.)) = q(x(zₛ, .)) p(x(.,.) | x(zₛ, .)), where the multi-output temporal process q(x(zₛ, .)) is also sparse: q(x(zₛ, .)) = q(x(zₛ, zₜ)) p(x(zₛ,.) | x(zₛ, zₜ)).
The marginal q(x(zₛ, zₜ)) is a multivariate Gaussian distribution parameterized as a state space model.
inducing_time – inducing time points [Mt,]
num_data – number of observations
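The conditional independence stated above can be illustrated at the level of function values with a separable kernel. In the NumPy sketch below (helper kernels, grid and query point are assumptions for illustration, and the check is on f rather than on the full state x), the conditional-mean weights for f(s, t) given f on the grid Z × T vanish for every time other than t.

```python
import numpy as np

def rbf(a, b):
    """Illustrative spatial kernel."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

def exponential(a, b):
    """Illustrative Markovian (Matern-1/2) temporal kernel."""
    return np.exp(-np.abs(a[:, None] - b[None, :]))

Z = np.array([0.0, 0.7, 1.5])        # assumed spatial inducing locations
T = np.array([0.0, 0.4, 1.0, 1.7])   # assumed time grid
s_new = np.array([0.3])              # new spatial location
j = 2                                # query time is T[j]
t_new = T[j:j + 1]

K_ZZ = rbf(Z, Z) + 1e-9 * np.eye(len(Z))
K_TT = exponential(T, T)

# Covariances on the grid Z × T with space-major ordering.
K_grid = np.kron(K_ZZ, K_TT)
K_cross = np.kron(rbf(s_new, Z), exponential(t_new, T))   # shape [1, 12]

# Conditional-mean weights K_cross K_grid^{-1}, arranged as [space, time].
weights = np.linalg.solve(K_grid, K_cross.T).T.reshape(len(Z), len(T))

# Only the column at time T[j] should be non-zero: K_s(s, Z) K_s(Z, Z)^{-1}.
expected_col = np.linalg.solve(K_ZZ, rbf(Z, s_new))[:, 0]
print(np.round(weights, 6))
```

The zero columns are what allow prediction at time t to use only the inducing state at that time.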
Posterior process
SpatioTemporalSparseCVI
Model for Spatio-temporal GP regression using a factor kernel k_space_time((s,t),(s’,t’)) = k_time(t,t’) * k_space(s,s’)
This can be seen as the spatial extension of markovflow’s SparseCVIGaussianProcess for temporal (only) Gaussian Processes. The inducing variables u(x,t) are now space and time dependent.
For a fixed set of space points zₛ, p(x(zₛ, .)) is a continuous-time process of state dimension Mₛd.
For a fixed time slice t, p(x(.,t)) ~ GP(0, kₛ).
The marginal q(x(zₛ, zₜ)) is parameterized as the product q(x(zₛ, zₜ)) = p(x(zₛ, zₜ)) t(x(zₛ, zₜ)), where p(x(zₛ, zₜ)) is a state space model and t(x(zₛ, zₜ)) are sites.
num_data – The total number of observations (relevant when feeding in external minibatches).
learning_rate – the learning rate.
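The product form q = p × t is easiest to see in natural parameters, where multiplying a Gaussian prior by a Gaussian site adds their natural parameters. The sketch below uses a dense two-dimensional prior instead of a state space model, and the site is written as a pseudo-observation `y_tilde` with covariance `R`; these names and values are assumptions for illustration only.

```python
import numpy as np

P = np.array([[1.0, 0.6],
              [0.6, 1.0]])           # prior covariance, p(u) = N(0, P)
R = np.diag([0.5, 2.0])              # pseudo-noise covariance of the site
y_tilde = np.array([0.4, -1.0])      # pseudo-observation of the site

# Site t(u) ∝ exp(theta_1ᵀ u + uᵀ theta_2 u) in natural parameters.
theta_1 = np.linalg.solve(R, y_tilde)
theta_2 = -0.5 * np.linalg.inv(R)

# q(u) ∝ p(u) t(u): natural parameters add.
post_prec = np.linalg.inv(P) - 2.0 * theta_2
post_cov = np.linalg.inv(post_prec)
post_mean = post_cov @ theta_1

# Cross-check against the usual Gaussian conditioning formulas.
gain = P @ np.linalg.inv(P + R)
ref_mean = gain @ y_tilde
ref_cov = P - gain @ P
print(np.allclose(post_mean, ref_mean), np.allclose(post_cov, ref_cov))
```

Because the sites stay Gaussian, the resulting q remains Gaussian with the same structure as the prior.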
Posterior object to predict outside of the training time points
Computes the variational posterior distribution on the vector of inducing states
Computes the prior distribution on the vector of inducing states
projection_inducing_states_to_observations
Compute the projection matrix of the conditional mean of f(x,t) | s(t).
input_data – Time point and associated spatial dimension to generate observations for, with shape batch_shape + [space_dim + 1, num_time_points].
The projection matrix with shape [num_time_points, obs_dim, num_inducing_time x state_dim].
update_sites
𝜽ₘ ← ρ𝜽ₘ + (1-ρ)𝐠ₘ
Here 𝐠ₘ is the sum of the gradients of the variational expectation for each data point indexed k, projected back to the site vₘ = [uₘ, uₘ₊₁] through the conditional p(fₖ|vₘ).
input_data – A tuple of time points and observations
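The update rule 𝜽ₘ ← ρ𝜽ₘ + (1-ρ)𝐠ₘ is a damped (convex-combination) step. A minimal sketch follows, with a hypothetical helper `damped_site_update` and a gradient `g` held fixed purely for illustration; how ρ relates to the model's learning_rate argument is not stated here and is left open.

```python
import numpy as np

def damped_site_update(theta, g, rho):
    """One damped step: theta <- rho * theta + (1 - rho) * g."""
    return rho * theta + (1.0 - rho) * g

g = np.array([2.0, -0.5])    # illustrative projected gradient, held fixed
rho = 0.9                    # illustrative damping weight

# A single step from zero moves a fraction (1 - rho) of the way towards g.
first_step = damped_site_update(np.zeros(2), g, rho)

# Repeating the step with a fixed g converges geometrically to g.
theta = np.zeros(2)
for _ in range(200):
    theta = damped_site_update(theta, g, rho)
print(first_step, theta)
```

In an actual fit 𝐠ₘ is recomputed from the current posterior at every step, so the fixed point is reached only when the sites and the gradients agree.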
local_objective_and_gradients
Returns the local objective and its gradients with respect to the expectation parameters.
Fmu – means μ […, latent_dim]
Fvar – variances σ² […, latent_dim]
Y – observations Y […, observation_dim]
The local objective and its gradient with respect to [μ, σ² + μ²].
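For intuition about gradients with respect to the expectation parameters [μ, σ² + μ²], the sketch below works through the Gaussian-likelihood case, where the variational expectation has a closed form. The function names and the choice of a Gaussian likelihood with variance `noise_var` are assumptions for the example; the method itself works with whatever likelihood the model holds.

```python
import numpy as np

def gaussian_var_exp(mu, var, y, noise_var):
    """E_{N(f; mu, var)}[log N(y; f, noise_var)], elementwise."""
    return (-0.5 * np.log(2 * np.pi * noise_var)
            - 0.5 * ((y - mu) ** 2 + var) / noise_var)

def grads_wrt_expectation_params(mu, var, y, noise_var):
    """Gradients of the expectation with respect to eta1 = mu, eta2 = var + mu**2."""
    # Gradients with respect to the mean and variance.
    d_mu = (y - mu) / noise_var
    d_var = -0.5 / noise_var * np.ones_like(mu)
    # Chain rule with var = eta2 - eta1**2.
    d_eta1 = d_mu - 2.0 * mu * d_var
    d_eta2 = d_var
    return d_eta1, d_eta2

mu = np.array([0.2, -1.0, 0.5])
var = np.array([0.3, 0.1, 0.8])
y = np.array([0.0, -0.7, 1.5])
noise_var = 0.25

objective = gaussian_var_exp(mu, var, y, noise_var)
d_eta1, d_eta2 = grads_wrt_expectation_params(mu, var, y, noise_var)
print(d_eta1, d_eta2)
```

For this likelihood the gradients reduce to y / noise_var and -0.5 / noise_var, independent of μ and σ², which are the natural parameters of a Gaussian factor in f.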
local_objective
Local loss in CVI.
Fmu – means […, latent_dim]
Fvar – variances […, latent_dim]
Y – observations […, observation_dim]
The local objective […].