markovflow.models.spatio_temporal_variational

Module containing models for sparse spatio-temporal variational inference.

Module Contents

class SparseSpatioTemporalKernel(kernel_space: gpflow.kernels.Kernel, kernel_time: markovflow.kernels.SDEKernel, inducing_space)[source]

Bases: markovflow.kernels.IndependentMultiOutput

A spatio-temporal kernel k(s,t) can be built from the product of a spatial kernel kₛ(s) and a Markovian temporal kernel kₜ(t), i.e. k(s,t) = kₛ(s) kₜ(t)

A GP f(.) ∈ ℝ^m with kernel k(Z,.) [with space marginalized to locations Z] can be built as f(.) = chol(Kₛ(Z, Z)) @ [H s₁(.),…, H sₘ(.)],

where s₁(.),…,sₘ(.) are iid SDEs from the equivalent representation of the Markovian kernel kₜ(t).
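The construction above can be checked numerically. The following is a minimal NumPy sketch, not markovflow code: it assumes a squared-exponential spatial kernel and an exponential (Matérn-1/2) temporal kernel, and the helper names `rbf` and `matern12` are illustrative. It verifies that mixing m independent temporal processes with chol(Kₛ(Z, Z)) gives the product covariance Kₛ(Z, Z) ⊗ Kₜ.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between 1-D location arrays (illustrative helper)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def matern12(a, b, lengthscale=1.0):
    """Exponential (Matern-1/2) kernel, a Markovian temporal kernel (illustrative helper)."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / lengthscale)

Z = np.array([0.0, 0.7, 1.5])          # m = 3 spatial inducing locations
t = np.array([0.0, 0.5, 1.0, 2.0])     # n = 4 time points

K_s = rbf(Z, Z) + 1e-9 * np.eye(len(Z))  # small jitter for a stable Cholesky
K_t = matern12(t, t)
L = np.linalg.cholesky(K_s)

# Stack m iid temporal GPs g_1..g_m (each with covariance K_t) into one vector,
# ordered by process and then by time: cov(g) = I_m (x) K_t.
cov_g = np.kron(np.eye(len(Z)), K_t)

# f = chol(K_s) @ [g_1, ..., g_m] mixes the processes at every time point,
# which is the linear map (L (x) I_n) acting on the stacked vector.
mix = np.kron(L, np.eye(len(t)))
cov_f = mix @ cov_g @ mix.T

# The result matches the product kernel evaluated on the grid Z x t.
product_kernel = np.kron(K_s, K_t)
max_err = np.max(np.abs(cov_f - product_kernel))
```

Here each g_i plays the role of H sᵢ(.), the scalar output of one SDE.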

Parameters
  • kernel_space – spatial kernel

  • kernel_time – temporal kernel

  • inducing_space – spatial inducing points

generate_emission_model(time_points: tf.Tensor) → markovflow.emission_model.EmissionModel[source]

Generate the emission matrix H. This is the direct sum of the m shared child emission matrices H, pre-multiplied by the Cholesky factor of the spatial kernel evaluated at Zₛ:

chol(Kₛ(Zₛ, Zₛ)) @ [H,…, H]
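As a rough illustration of the shapes involved, the sketch below builds this matrix with NumPy. The child emission H = [1, 0] (state dimension 2, reading out the first state component) and the spatial Gram matrix are assumptions chosen for the example, not values taken from the library.

```python
import numpy as np

m, d = 3, 2                       # m inducing locations, child state dimension d
H = np.array([[1.0, 0.0]])        # assumed child emission, shape [1, d]

# Illustrative spatial Gram matrix at the inducing locations.
Z = np.array([0.0, 0.7, 1.5])
K_s = np.exp(-0.5 * (Z[:, None] - Z[None, :]) ** 2) + 1e-9 * np.eye(m)
chol_K_s = np.linalg.cholesky(K_s)

# Direct sum of m copies of H: block-diagonal matrix of shape [m, m * d].
H_blocks = np.kron(np.eye(m), H)

# Emission matrix of the combined model: chol(K_s) @ [H, ..., H], shape [m, m * d].
emission = chol_K_s @ H_blocks

# Applying it to a stacked state vector s = [s_1, ..., s_m] gives f at Z.
state = np.arange(m * d, dtype=float)
f_at_Z = emission @ state
```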

Parameters

time_points – The time points over which the emission model is defined, with shape batch_shape + [num_data].

Returns

The emission model associated with this kernel.

state_to_space_conditional_projection(inputs)[source]

Generates the matrix P in the conditional mean E[f(x,t)|s(t)] = P s(t). It is given by combining

E[f(x,t)|f(Zₛ)] = Kₛ(x, Zₛ) Kₛ(Zₛ, Zₛ)⁻¹ f(Zₛ)

E[f(Zₛ)|s(t)] = chol(Kₛ(Zₛ, Zₛ)) @ [H,…, H] s(t)

leading to

E[f(x,t)|s(t)] = Kₛ(x, Zₛ) Kₛ(Zₛ, Zₛ)⁻¹ chol(Kₛ(Zₛ, Zₛ)) @ [H,…, H] s(t)

= Kₛ(x, Zₛ) chol(Kₛ(Zₛ, Zₛ))⁻ᵀ @ [H,…, H] s(t)
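The last simplification uses Kₛ⁻¹ chol(Kₛ) = chol(Kₛ)⁻ᵀ. A small NumPy check of that identity follows, assuming a squared-exponential spatial kernel and a child emission H = [1, 0]; both are illustrative choices, and `rbf` is a hypothetical helper.

```python
import numpy as np

def rbf(a, b):
    """Squared-exponential kernel between 1-D location arrays (illustrative helper)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * d ** 2)

m = 3
Z = np.array([0.0, 0.7, 1.5])       # spatial inducing locations
x = np.array([0.3, 1.1])            # two new spatial locations
H_blocks = np.kron(np.eye(m), np.array([[1.0, 0.0]]))   # direct sum [H, ..., H]

K_zz = rbf(Z, Z) + 1e-9 * np.eye(m)
K_xz = rbf(x, Z)
L = np.linalg.cholesky(K_zz)

# Direct form: K_s(x, Z) K_s(Z, Z)^{-1} chol(K_s(Z, Z)) @ [H, ..., H]
P_direct = K_xz @ np.linalg.solve(K_zz, L) @ H_blocks

# Simplified form: K_s(x, Z) chol(K_s(Z, Z))^{-T} @ [H, ..., H].
# solve(L, K_xz.T).T equals K_xz @ L^{-T} without forming an explicit inverse.
P_simplified = np.linalg.solve(L, K_xz.T).T @ H_blocks
```

The simplified form needs only one triangular system per evaluation, which is the usual reason to prefer it.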

Parameters

inputs – Time point and associated spatial dimension to generate observations for, with shape batch_shape + [space_dim + 1, num_new_time_points].

Returns

The projection tensor with shape batch_shape + [num_new_time_points, obs_dim, state_dim].

class SpatioTemporalBase(inducing_space, kernel_space: gpflow.kernels.Kernel, kernel_time: markovflow.kernels.SDEKernel, likelihood: gpflow.likelihoods.Likelihood, mean_function: Optional[markovflow.mean_function.MeanFunction] = None)[source]

Bases: markovflow.models.models.MarkovFlowSparseModel, abc.ABC

Base class for Spatio-temporal GP regression using a factor kernel k_space_time((s,t),(s’,t’)) = k_time(t,t’) * k_space(s,s’)

where k_time is a Markovian kernel.

Parameters
  • inducing_space – inducing space points [Ms, D]

  • kernel_space – GPflow space kernel

  • kernel_time – Markovflow time kernel

  • likelihood – a likelihood object

  • mean_function – The mean function for the GP. Defaults to no mean function.

space_time_predict_f(inputs)[source]

Predict marginal function values at inputs. Note the time points should be sorted.

Parameters

inputs – Time point and associated spatial dimension to generate observations for, with shape batch_shape + [space_dim + 1, num_new_time_points].

Returns

Predicted mean and covariance for the new time points, with respective shapes batch_shape + [num_new_time_points, output_dim] and either batch_shape + [num_new_time_points, output_dim, output_dim] or batch_shape + [num_new_time_points, output_dim].
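Because the time points should be sorted, it can help to sort the query points before prediction and undo the permutation afterwards. The sketch below shows only that bookkeeping with NumPy; how space and time are packed into `inputs` should follow the shape documented above and is deliberately not shown. The array names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
space = rng.uniform(size=(5, 2))             # 5 points with 2 spatial coordinates
time = np.array([0.9, 0.1, 0.5, 0.3, 0.7])   # unsorted time stamps

# Stable sort keeps points with equal time stamps in their input order.
order = np.argsort(time, kind="stable")
space_sorted, time_sorted = space[order], time[order]

# Results computed in sorted order can be mapped back with the inverse permutation.
inverse = np.argsort(order)
restored_time = time_sorted[inverse]
restored_space = space_sorted[inverse]
```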

loss(input_data: Tuple[tf.Tensor, tf.Tensor]) → tf.Tensor[source]

Return the loss, which is the negative evidence lower bound (ELBO).

Parameters

input_data – A tuple of space-time points and observations containing the data at which to calculate the loss for training the model.

property posterior: markovflow.posterior.PosteriorProcess[source]

Posterior process

property dist_q: markovflow.state_space_model.StateSpaceModel[source]

Posterior state space model on inducing states

property dist_p: markovflow.state_space_model.StateSpaceModel[source]

Prior state space model on inducing states

elbo(input_data: Tuple[tf.Tensor, tf.Tensor]) → tf.Tensor[source]

Calculates the evidence lower bound (ELBO), a lower bound on the log marginal likelihood log p(y).

Parameters

input_data – A tuple of space-time points and observations containing data at which to calculate the loss for training the model.

Returns

A scalar tensor (summed over the batch_shape dimension) representing the ELBO.
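For intuition, an ELBO of this kind is a sum of variational expectations minus a KL term. The sketch below is a generic NumPy illustration for a Gaussian likelihood, not the library's implementation: the function names are hypothetical, the KL value is a placeholder number, and the `num_data / batch_size` rescaling is the standard minibatch estimator, assumed here rather than taken from this page.

```python
import numpy as np

def gaussian_variational_expectation(y, mean, var, noise_var):
    """Closed form of E_{N(f; mean, var)}[log N(y; f, noise_var)]."""
    return -0.5 * np.log(2.0 * np.pi * noise_var) - 0.5 * ((y - mean) ** 2 + var) / noise_var

def elbo_sketch(y, mean, var, noise_var, kl, num_data=None):
    """Sum of variational expectations (rescaled for minibatches) minus the KL term."""
    scale = 1.0 if num_data is None else num_data / len(y)
    return scale * np.sum(gaussian_variational_expectation(y, mean, var, noise_var)) - kl

y = np.array([0.2, -0.4, 1.0])
mean = np.array([0.1, -0.3, 0.8])      # marginal posterior means at the data
var = np.array([0.05, 0.10, 0.20])     # marginal posterior variances at the data
noise_var = 0.5
value = elbo_sketch(y, mean, var, noise_var, kl=0.3)   # kl=0.3 is a placeholder

# Cross-check the closed form with Gauss-Hermite quadrature.
nodes, weights = np.polynomial.hermite.hermgauss(20)
f = mean[:, None] + np.sqrt(2.0 * var)[:, None] * nodes[None, :]
log_lik = -0.5 * np.log(2.0 * np.pi * noise_var) - 0.5 * (y[:, None] - f) ** 2 / noise_var
quadrature = (log_lik @ weights) / np.sqrt(np.pi)
```

The loss used for training is then the negative of this quantity.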

predict_log_density(input_data: Tuple[tf.Tensor, tf.Tensor], full_output_cov: bool = False) → tf.Tensor[source]

Compute the log density of the data at the new data points.

property kernel: markovflow.kernels.SDEKernel[source]

Return the kernel of the GP.

property inducing_time: tf.Tensor[source]

Return the temporal inducing inputs of the model.

property inducing_space: tf.Tensor[source]

Return the spatial inducing inputs of the model.

class SpatioTemporalSparseVariational(inducing_space, inducing_time, kernel_space: gpflow.kernels.Kernel, kernel_time: markovflow.kernels.SDEKernel, likelihood: gpflow.likelihoods.Likelihood, mean_function: Optional[markovflow.mean_function.MeanFunction] = None, num_data=None)[source]

Bases: SpatioTemporalBase

Model for Variational Spatio-temporal GP regression using a factor kernel k_space_time((s,t),(s’,t’)) = k_time(t,t’) * k_space(s,s’)

where k_time is a Markovian kernel.

The following notation is used:
  • X=(x,t) – the space-time points of the training data

  • zₛ – the space inducing/pseudo points

  • zₜ – the time inducing/pseudo points

  • y – observations corresponding to points X

  • f(.,.) – the spatio-temporal process

  • x(.,.) – the SSM formulation of the spatio-temporal process

  • u(.) = x(zₛ,.) – the spatio-temporal SSM marginalized at zₛ

  • p(y | f) – the likelihood

  • p(.) – the prior distribution

  • q(.) – the variational distribution

This can be seen as the temporal extension of gpflow.SVGP: instead of being fixed, the inducing variables u are now time dependent, u(t), and follow a Markov chain.

For a fixed set of spatial inducing inputs zₛ, p(x(zₛ, .)) is a continuous-time process of state dimension Mₛd. For a fixed time slice t, p(x(.,t)) ~ GP(0, kₛ).

The following conditional independence holds: p(x(s,t) | x(zₛ, .)) = p(x(s,t) | s(zₛ, t)), i.e., prediction at a new point at time t given x(zₛ, .) only depends on s(zₛ, t)

This builds a spatially sparse process as q(x(.,.)) = q(x(zₛ, .)) p(x(.,.) | x(zₛ, .)), where the multi-output temporal process q(x(zₛ, .)) is also sparse: q(x(zₛ, .)) = q(x(zₛ, zₜ)) p(x(zₛ,.) | x(zₛ, zₜ)).

The marginal q(x(zₛ, zₜ)) is a multivariate Gaussian distribution parameterized as a state space model.
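To see why a state space parameterization is natural for a Gaussian over Markovian states, the NumPy sketch below uses the simplest case: a scalar exponential (Matérn-1/2) kernel, so the state dimension is 1 rather than Mₛd. The kernel choice and parameter values are assumptions for illustration. It shows that the dense Gram matrix is reproduced by a chain of transitions and that its precision is tridiagonal.

```python
import numpy as np

z_t = np.array([0.0, 0.4, 1.0, 1.5, 2.5])   # temporal inducing points
lengthscale, variance = 1.0, 2.0

# Dense view: Gram matrix of the exponential (Matern-1/2) kernel.
K = variance * np.exp(-np.abs(z_t[:, None] - z_t[None, :]) / lengthscale)

# State space view: x_{k+1} = A_k x_k + noise_k, with noise_k ~ N(0, Q_k).
A = np.exp(-np.diff(z_t) / lengthscale)
Q = variance * (1.0 - A ** 2)
stationary = A ** 2 * variance + Q          # marginal variance is preserved at each step

# Rebuild the covariance from the chain: cov(x_i, x_j) = variance * prod(A between i and j).
n = len(z_t)
K_chain = np.empty((n, n))
for i in range(n):
    for j in range(n):
        lo, hi = min(i, j), max(i, j)
        K_chain[i, j] = variance * np.prod(A[lo:hi])

# The Markov property makes the precision matrix tridiagonal.
precision = np.linalg.inv(K)
off_band = precision - np.triu(np.tril(precision, 1), -1)
max_off_band = np.max(np.abs(off_band))
```

With a vector-valued state the same structure holds blockwise, so the number of parameters grows linearly in the number of inducing time points.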

Parameters
  • inducing_space – inducing space points [Ms, D]

  • inducing_time – inducing time points [Mt,]

  • kernel_space – GPflow space kernel

  • kernel_time – Markovflow time kernel

  • likelihood – a likelihood object

  • mean_function – The mean function for the GP. Defaults to no mean function.

  • num_data – number of observations

property dist_q: markovflow.state_space_model.StateSpaceModel[source]

Posterior state space model on inducing states

property dist_p: markovflow.state_space_model.StateSpaceModel[source]

Prior state space model on inducing states

property posterior: markovflow.posterior.PosteriorProcess[source]

Posterior process

class SpatioTemporalSparseCVI(inducing_space, inducing_time, kernel_space: gpflow.kernels.Kernel, kernel_time: markovflow.kernels.SDEKernel, likelihood: gpflow.likelihoods.Likelihood, mean_function: Optional[markovflow.mean_function.MeanFunction] = None, num_data=None, learning_rate=0.1)[source]

Bases: SpatioTemporalBase

Model for Spatio-temporal GP regression using a factor kernel k_space_time((s,t),(s’,t’)) = k_time(t,t’) * k_space(s,s’)

where k_time is a Markovian kernel.

The following notation is used:
  • X=(x,t) – the space-time points of the training data

  • zₛ – the space inducing/pseudo points

  • zₜ – the time inducing/pseudo points

  • y – observations corresponding to points X

  • f(.,.) – the spatio-temporal process

  • x(.,.) – the SSM formulation of the spatio-temporal process

  • u(.) = x(zₛ,.) – the spatio-temporal SSM marginalized at zₛ

  • p(y | f) – the likelihood

  • p(.) – the prior distribution

  • q(.) – the variational distribution

This can be seen as the spatial extension of markovflow’s SparseCVIGaussianProcess for temporal (only) Gaussian Processes. The inducing variables u(x,t) are now space and time dependent.

For a fixed set of space points zₛ, p(x(zₛ, .)) is a continuous-time process of state dimension Mₛd. For a fixed time slice t, p(x(.,t)) ~ GP(0, kₛ).

The following conditional independence holds: p(x(s,t) | x(zₛ, .)) = p(x(s,t) | s(zₛ, t)), i.e., prediction at a new point at time t given x(zₛ, .) only depends on s(zₛ, t)

This builds a spatially sparse process as q(x(.,.)) = q(x(zₛ, .)) p(x(.,.) | x(zₛ, .)), where the multi-output temporal process q(x(zₛ, .)) is also sparse: q(x(zₛ, .)) = q(x(zₛ, zₜ)) p(x(zₛ,.) | x(zₛ, zₜ)).

The marginal q(x(zₛ, zₜ)) is parameterized as the product q(x(zₛ, zₜ)) = p(x(zₛ, zₜ)) t(x(zₛ, zₜ)), where p(x(zₛ, zₜ)) is a state space model and t(x(zₛ, zₜ)) are sites.
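The prior-times-sites form means the natural parameters of q are those of the prior plus those of the sites. The NumPy sketch below illustrates this in a simplified setting that is an assumption of the example: scalar states, a dense prior covariance instead of a state space model, and one site per inducing point rather than per pair of consecutive states. With a Gaussian likelihood observed directly at the inducing points, choosing the sites equal to the likelihood terms recovers the exact posterior.

```python
import numpy as np

z = np.array([0.0, 0.5, 1.2, 2.0])
K = np.exp(-np.abs(z[:, None] - z[None, :]))     # prior p(x) = N(0, K)
y = np.array([0.3, -0.1, 0.8, 0.4])
noise_var = 0.25

# Gaussian sites t(x_k) proportional to exp(theta1_k * x_k + theta2_k * x_k**2).
# Setting them to the natural parameters of N(y_k; x_k, noise_var) is the conjugate case.
theta1 = y / noise_var
theta2 = -0.5 * np.ones_like(y) / noise_var

# q(x) is proportional to p(x) t(x), so natural parameters add.
q_precision = np.linalg.inv(K) - 2.0 * np.diag(theta2)
q_cov = np.linalg.inv(q_precision)
q_mean = q_cov @ theta1

# Reference: exact GP regression posterior at the same points.
noisy_K = K + noise_var * np.eye(len(z))
exact_mean = K @ np.linalg.solve(noisy_K, y)
exact_cov = K - K @ np.linalg.solve(noisy_K, K)
```

For non-Gaussian likelihoods the sites are no longer known in closed form and have to be learned iteratively.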

Parameters
  • inducing_space – inducing space points [Ms, D]

  • inducing_time – inducing time points [Mt,]

  • kernel_space – GPflow space kernel

  • kernel_time – Markovflow time kernel

  • likelihood – a likelihood object

  • mean_function – The mean function for the GP. Defaults to no mean function.

  • num_data – The total number of observations. (relevant when feeding in external minibatches).

  • learning_rate – the learning rate.

property posterior: markovflow.posterior.PosteriorProcess[source]

Posterior object to predict outside of the training time points

property dist_q: markovflow.state_space_model.StateSpaceModel[source]

Computes the variational posterior distribution on the vector of inducing states

property dist_p: markovflow.state_space_model.StateSpaceModel[source]

Computes the prior distribution on the vector of inducing states

projection_inducing_states_to_observations(input_data: tf.Tensor) → tf.Tensor[source]

Compute the projection matrix of the conditional mean of f(x,t) | s(t).

Parameters

input_data – Time point and associated spatial dimension to generate observations for, with shape batch_shape + [space_dim + 1, num_time_points].

Returns

The projection matrix with shape [num_time_points, obs_dim, num_inducing_time x state_dim].

update_sites(input_data: Tuple[tf.Tensor, tf.Tensor]) → None[source]

Perform one joint update of the Gaussian sites:

𝜽ₘ ← ρ𝜽ₘ + (1-ρ)𝐠ₘ

Here 𝐠ₘ is the sum of the gradients of the variational expectation for each data point indexed k, projected back to the site vₘ = [uₘ, uₘ₊₁] through the conditional p(fₖ|vₘ).

Parameters

input_data – A tuple of time points and observations.
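The damped update can be illustrated in isolation. The NumPy sketch below applies the rule 𝜽 ← ρ𝜽 + (1-ρ)𝐠 to a single scalar site with a Gaussian likelihood, for which the gradient 𝐠 does not depend on q, so the iteration converges geometrically to 𝐠. The function `update_site` is hypothetical, the projection through p(fₖ|vₘ) is omitted, and ρ is passed directly because its relation to the `learning_rate` argument is not stated here.

```python
import numpy as np

def update_site(theta, grad, rho):
    """One damped step: theta <- rho * theta + (1 - rho) * grad."""
    return rho * theta + (1.0 - rho) * grad

y, noise_var = 0.8, 0.25
# For a Gaussian likelihood the gradient of E_q[log p(y | f)] with respect to the
# expectation parameters [mu, sigma^2 + mu^2] is constant in q:
grad = np.array([y / noise_var, -0.5 / noise_var])

theta = np.zeros(2)                 # start from uninformative sites
num_steps, rho = 50, 0.9
for _ in range(num_steps):
    theta = update_site(theta, grad, rho)

# After n steps from zero, theta = (1 - rho**n) * grad.
expected = (1.0 - rho ** num_steps) * grad
```

Smaller ρ moves faster towards the current gradient; ρ = 0 replaces the site with the gradient in a single step.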

local_objective_and_gradients(Fmu: tf.Tensor, Fvar: tf.Tensor, Y: tf.Tensor) → tf.Tensor[source]

Returns the local objective and its gradients with respect to the expectation parameters.

Parameters
  • Fmu – means μ […, latent_dim]

  • Fvar – variances σ² […, latent_dim]

  • Y – observations Y […, observation_dim]

Returns

The local objective and its gradient with respect to [μ, σ² + μ²].
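Gradients with respect to the expectation parameters [μ, σ² + μ²] can be obtained from gradients with respect to (μ, σ²) by the chain rule. The NumPy sketch below works this out for a Gaussian likelihood, which is an assumption of the example, and checks the result by finite differences; the function names are illustrative and the library's own computation may differ.

```python
import numpy as np

def local_objective(mu, var, y, noise_var):
    """E_{N(f; mu, var)}[log N(y; f, noise_var)], the Gaussian-likelihood case."""
    return -0.5 * np.log(2.0 * np.pi * noise_var) - 0.5 * ((y - mu) ** 2 + var) / noise_var

def grads_wrt_expectation_params(mu, var, y, noise_var):
    """Gradients with respect to [m1, m2] = [mu, var + mu**2] via the chain rule."""
    d_mu = (y - mu) / noise_var                     # d objective / d mu
    d_var = -0.5 / noise_var * np.ones_like(mu)     # d objective / d var
    # Since mu = m1 and var = m2 - m1**2:
    d_m1 = d_mu - 2.0 * mu * d_var
    d_m2 = d_var
    return d_m1, d_m2

mu, var, y, noise_var = np.array([0.3]), np.array([0.2]), np.array([0.8]), 0.25
d_m1, d_m2 = grads_wrt_expectation_params(mu, var, y, noise_var)

# Finite-difference check in the (m1, m2) parameterization.
def objective_from_moments(m1, m2):
    return local_objective(m1, m2 - m1 ** 2, y, noise_var)

eps = 1e-6
m1, m2 = mu, var + mu ** 2
fd_m1 = (objective_from_moments(m1 + eps, m2) - objective_from_moments(m1 - eps, m2)) / (2 * eps)
fd_m2 = (objective_from_moments(m1, m2 + eps) - objective_from_moments(m1, m2 - eps)) / (2 * eps)
```

In this conjugate case the gradients reduce to y/σₙ² and -1/(2σₙ²), the natural parameters of the likelihood term, where σₙ² is the noise variance.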

local_objective(Fmu: tf.Tensor, Fvar: tf.Tensor, Y: tf.Tensor) → tf.Tensor[source]

Local loss in CVI.

Parameters
  • Fmu – means […, latent_dim]

  • Fvar – variances […, latent_dim]

  • Y – observations […, observation_dim]

Returns

The local objective […].