markovflow.kernels.sde_kernel
Module containing Stochastic Differential Equation (SDE) kernels.
SDEKernel
Bases: markovflow.kernels.kernel.Kernel, abc.ABC
Abstract class representing kernels defined by the Stochastic Differential Equation:
\(dx(t)/dt = F(t) x(t) + L(t) w(t),\qquad f(t) = H(t) x(t)\)
For most kernels \(F, L, H\) are not time varying; these have the more restricted form:
\(dx(t)/dt = F x(t) + L w(t),\qquad f(t) = H x(t)\)
…where \(w(t)\) is a white noise process with spectral density \(Q_c\).
See the documentation for the StationaryKernel class.
Usually:
\(x(t) = \begin{bmatrix} a(t) & \dfrac{da(t)}{dt} & \dfrac{d²a(t)}{dt²} & ⋯ \end{bmatrix}ᵀ\)
…for some \(a(t)\), so the state dimension represents the degree of the stochastic differential equation in terms of \(a(t)\). Writing it in the above form is a standard trick for converting a higher order linear differential equation into a first order linear one.
Since \(F, L, H\) are constant matrices, the solution can be written analytically. For a given set of time points \(tₖ\), we can solve this SDE and define a state space model of the form:
\(x(tₖ₊₁) = Aₖ x(tₖ) + qₖ,\qquad qₖ ∼ 𝒩(0, Qₖ)\)
…where:
If \(Δtₖ = tₖ₊₁ - tₖ\), then the transition matrix \(Aₖ\) between states \(x(tₖ)\) and \(x(tₖ₊₁)\) is given by:
\(Aₖ = \exp(F Δtₖ)\)
The process noise covariance matrix \(Qₖ\) between states \(x(tₖ)\) and \(x(tₖ₊₁)\) is given by:
\(Qₖ = ∫₀^{Δtₖ} \exp(F τ)\, L Q_c Lᵀ\, \exp(F τ)ᵀ\, dτ\)
We can write this in terms of the steady state covariance \(P∞\) as:
\(Qₖ = P∞ − Aₖ P∞ Aₖᵀ\)
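These formulas can be illustrated with a minimal NumPy/SciPy sketch (this is not the markovflow API; the Matérn-1/2 / Ornstein–Uhlenbeck parameterisation and values below are hypothetical):

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical Matern-1/2 (Ornstein-Uhlenbeck) kernel parameters.
sigma2 = 1.0   # variance
ell = 2.0      # lengthscale

F = np.array([[-1.0 / ell]])   # feedback matrix
P_inf = np.array([[sigma2]])   # steady state covariance

dt = 0.5                       # time delta between adjacent time points
A = expm(F * dt)               # transition matrix A_k = exp(F * dt_k)
Q = P_inf - A @ P_inf @ A.T    # process noise Q_k = P_inf - A_k P_inf A_k^T

print(A[0, 0])  # exp(-0.25) ≈ 0.7788
print(Q[0, 0])  # 1 - exp(-0.5) ≈ 0.3935
```

For a one-dimensional state the matrix exponential is a scalar exponential, which makes the steady-state identity for \(Qₖ\) easy to verify by hand.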
We also define an emission model for a given output dimension:
\(fₖ = H x(tₖ)\)
output_dim – The output dimension of the kernel.
jitter – A small non-negative number to add into a matrix’s diagonal to maintain numerical stability during inversion.
output_dim
Return the output dimension of the kernel.
build_finite_distribution
Return the GaussMarkovDistribution that this kernel represents on the provided time points.
Note
Currently the only representation we can use is StateSpaceModel.
time_points – The times between which to define the distribution, with shape batch_shape + [num_data].
state_space_model
Return the StateSpaceModel that this kernel represents on the provided time points.
time_points – The times between which to define the state space model, with shape batch_shape + [num_data]. This must be strictly increasing.
generate_emission_model
Generate the EmissionModel associated with this kernel that maps from the latent StateSpaceModel to the observations.
For any SDEKernel, the state representation is usually:
\(x(t) = \begin{bmatrix} a(t) & \dfrac{da(t)}{dt} & ⋯ \end{bmatrix}ᵀ\)
In this case, we are interested only in the first element of \(x\). That is, the output \(f(t)\) is given by \(f(t) = a(t)\), so \(H\) is given by \([1, 0, 0, ...]\).
If different behaviour is required, this method should be overridden.
time_points – The time points over which the emission model is defined, with shape batch_shape + [num_data].
initial_mean
Return the initial mean of the generated StateSpaceModel.
This will usually be zero, but can be overridden if necessary.
batch_shape – Leading dimensions for the initial mean.
A tensor of zeros with shape batch_shape + [state_dim].
state_dim
Return the state dimension of the generated StateSpaceModel.
initial_covariance
Return the initial covariance of the generated StateSpaceModel.
For stationary kernels this is typically the covariance \(P∞\) of the stationary distribution of \(x\).
In the general case the initial covariance depends on time, so we need the initial_time_point to generate it.
initial_time_point – The time_point associated with the first state, with shape batch_shape + [1,].
A tensor with shape batch_shape + [state_dim, state_dim].
transition_statistics_from_time_points
Generate the transition statistics, where the time deltas are those between adjacent time_points.
time_points – A tensor of times at which to produce matrices, with shape batch_shape + [num_transitions + 1].
A tuple of two tensors, each with shape batch_shape + [num_transitions, state_dim, state_dim].
transition_statistics
Return the state_transitions() and process_covariances() together to save having to compute them twice.
transition_times – A tensor of times at which to produce matrices, with shape batch_shape + [num_transitions].
time_deltas – A tensor of time gaps for which to produce matrices, with shape batch_shape + [num_transitions].
A tuple of two tensors, each with shape batch_shape + [num_transitions, state_dim, state_dim].
state_offsets
Return the state offsets \(bₖ\) of the generated StateSpaceModel.
A tensor with shape batch_shape + [num_transitions, state_dim].
state_transitions
Return the state transition matrices of the generated StateSpaceModel \(Aₖ = exp(FΔtₖ)\).
transition_times – Time points at which to produce matrices, with shape batch_shape + [num_transitions].
time_deltas – Time gaps for which to produce matrices, with shape batch_shape + [num_transitions].
A tensor with shape batch_shape + [num_transitions, state_dim, state_dim].
process_covariances
Return the process covariance matrices of the generated StateSpaceModel.
The process covariance at time \(k\) is calculated as:
\(Qₖ = P∞ − Aₖ P∞ Aₖᵀ\)
This method can be overridden for more specific use cases if necessary.
jitter_matrix
Jitter to add to the output of process_covariances() and initial_covariance().
A tensor with shape [state_dim, state_dim].
__add__
Operator for combining kernel objects by summing them.
__mul__
Operator for combining kernel objects by multiplying them.
StationaryKernel
Bases: SDEKernel, abc.ABC
Abstract class representing stationary kernels defined by the Stochastic Differential Equation:
\(dx(t)/dt = F (x(t) − m) + L w(t)\)
…where \(m\) is the state mean.
For most kernels \(H\) will not be time varying; that is, \(f(t) = H x(t)\).
state_mean – A tensor with shape [state_dim,].
set_state_mean
Sets the state mean for the kernel.
trainable – Whether the state mean should be trainable.
For stationary kernels this is the covariance \(P∞\) of the stationary distribution of \(x\), and is independent of the time passed in.
initial_time_point – The time point associated with the first state, with shape batch_shape + [1,].
Return state_transitions() and process_covariances() together to save having to compute them twice.
By default this uses the state transitions to calculate the process covariance:
\(Qₖ = P∞ − Aₖ P∞ Aₖᵀ\)
feedback_matrix
Return the feedback matrix \(F\). This is where:
\(dx = F (x - m)\,dt \;⟹\; x(t) = A x(0) + (I - A) m\)
A tensor with shape [state_dim, state_dim].
steady_state_covariance
Return the steady state covariance \(P∞\), given implicitly by:
\(F P∞ + P∞ Fᵀ + L Q_c Lᵀ = 0\)
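As a sketch of what "given implicitly" means (plain SciPy, not the markovflow API; the OU kernel values below are hypothetical), \(P∞\) is the solution of a continuous-time Lyapunov equation:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical OU kernel: F = -1/ell, L = 1, Qc = 2 * sigma2 / ell.
sigma2, ell = 1.0, 2.0
F = np.array([[-1.0 / ell]])
L = np.array([[1.0]])
Qc = np.array([[2.0 * sigma2 / ell]])

# Solve F P + P F^T = -L Qc L^T for the steady state covariance P_inf.
P_inf = solve_continuous_lyapunov(F, -L @ Qc @ L.T)
print(P_inf[0, 0])  # ≈ sigma2 = 1.0
```

With this choice of \(Q_c\), the steady state covariance recovers the kernel variance, as expected for the OU process.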
state_mean
Return the state mean.
A tensor with shape [state_dim,].
NonStationaryKernel
Abstract class representing non-stationary kernels defined by the Stochastic Differential Equation:
\(dx(t)/dt = F(t) x(t) + L w(t)\)
feedback_matrices
The non-stationary feedback matrix \(F(t)\) at times \(t\), where:
\(dx(t)/dt = F(t) x(t) + L w(t)\)
time_points – The times at which the feedback matrix is evaluated, with shape batch_shape + [num_time_points].
A tensor with shape batch_shape + [num_time_points, state_dim, state_dim].
This will usually be zero, but can be overridden if necessary.
transition_times – A tensor of times at which to produce matrices, with shape batch_shape + [num_transitions].
ConcatKernel
Bases: StationaryKernel, abc.ABC
Abstract class implementing the state space model of multiple kernels that have been combined together. Combined with differing emission models this can give rise to the Sum kernel or to a multi-output kernel.
The state space of any ConcatKernel consists of all the state spaces of the child kernels concatenated (in the tensorflow.concat sense) together:
\(x(t) = [x⁽¹⁾(t)ᵀ, …, x⁽ⁿ⁾(t)ᵀ]ᵀ\)
So the SDE of the kernel becomes:
\(\dfrac{dx(t)}{dt} = \begin{bmatrix} F⁽¹⁾ & & \\ & ⋱ & \\ & & F⁽ⁿ⁾ \end{bmatrix} x(t) + \begin{bmatrix} L⁽¹⁾ & & \\ & ⋱ & \\ & & L⁽ⁿ⁾ \end{bmatrix} w(t)\)
kernels – A list of child kernels that will have their state spaces concatenated together.
kernels
Return a list of child kernels.
The state transition matrix is the block diagonal matrix of the child state transition matrices.
The combined mean is the child means concatenated together:
…to form a longer mean vector.
batch_shape – A tuple of leading dimensions for the initial mean.
Return the feedback matrix. This is the block diagonal matrix of child feedback matrices.
Return the steady state covariance. This is the block diagonal matrix of child steady state covariance matrices.
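The block-diagonal combination described above can be sketched with plain NumPy/SciPy (not the markovflow API; the child feedback matrices below are hypothetical):

```python
import numpy as np
from scipy.linalg import block_diag

# Hypothetical child feedback matrices: an OU kernel (1x1) and a
# Matern-3/2-style companion form (2x2).
F1 = np.array([[-0.5]])
F2 = np.array([[0.0, 1.0],
               [-3.0, -2.0]])

# The ConcatKernel feedback matrix is the block diagonal of the children,
# so the combined state_dim is the sum of the child state_dims.
F = block_diag(F1, F2)
print(F.shape)  # (3, 3)
```

The same construction applies to the state transitions and steady state covariances, block by block.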
Sum
Bases: ConcatKernel
Sums a list of child kernels.
There are two ways to implement this kernel: Stacked and Concatenated.
This class implements the Concatenated version, where the state space of the Sum kernel includes covariance terms between the child kernels.
Generate the emission matrix \(H\). This is the concatenation:
\(H = \begin{bmatrix} H₁ & H₂ & ⋯ & Hₙ \end{bmatrix}\)
…where \(\{Hᵢ\}ₙ\) are the emission matrices of the child kernels. Thus the state dimension for this kernel is the sum of the state dimension of the child kernels.
The emission model associated with this kernel, with emission matrix with shape batch_shape + [num_data, output_dim, state_dim].
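The concatenated emission can be sketched with plain NumPy (not the markovflow API; the child emission matrices below are hypothetical):

```python
import numpy as np

# Hypothetical child emission matrices, each with output_dim 1.
H1 = np.array([[1.0, 0.0]])   # child with state_dim 2
H2 = np.array([[1.0]])        # child with state_dim 1

# Sum kernel emission: concatenate along the state dimension, so the
# output is the sum of the child outputs, f(t) = H1 x1(t) + H2 x2(t).
H = np.concatenate([H1, H2], axis=-1)   # shape [1, 3]

x = np.array([0.3, -1.0, 0.4])          # concatenated state
f = H @ x
print(f)  # [0.7]
```

Applying the concatenated \(H\) to the concatenated state reproduces the sum of the individual child outputs.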
Product
Bases: StationaryKernel
Multiplies a list of child kernels.
The feedback matrix is the Kronecker sum of the feedback matrices from the child kernels. We will use a product kernel with two child kernels as an example. Let \(A\) and \(B\) be the feedback matrices of these two child kernels. The feedback matrix \(F\) of the product kernel is:
\(F = A ⊗ I + I ⊗ B\)
…where \(⊗\) is the Kronecker product operator. This is the form for which \(\exp(F Δt) = \exp(A Δt) ⊗ \exp(B Δt)\), consistent with the state transitions below.
The state transition matrix is the Kronecker product of the state transition matrices from the child kernels. Let \(Aₖ\) and \(Bₖ\) be the state transition matrices of these two child kernels at time step \(k\). The state transition matrix \(Sₖ\) of the product kernel is:
\(Sₖ = Aₖ ⊗ Bₖ\)
The steady state covariance matrix is the Kronecker product of the steady state covariance matrices from the child kernels. Let \(A∞\) and \(B∞\) be the steady state covariance matrices of these two child kernels. The steady state covariance matrix \(P∞\) of the product kernel is:
\(P∞ = A∞ ⊗ B∞\)
The process covariance matrix \(Qₖ\) at time step \(k\) is calculated using the same formula as defined in the parent class SDEKernel:
\(Qₖ = P∞ − Sₖ P∞ Sₖᵀ\)
…where the steady state matrix \(P∞\) and the state transition \(Sₖ\) are defined above.
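These Kronecker-product formulas can be sketched with plain NumPy (not the markovflow API; the child matrices below are hypothetical):

```python
import numpy as np

# Hypothetical child state transitions at one time step.
A_k = np.array([[0.9]])                 # child 1 (state_dim 1)
B_k = np.array([[0.8, 0.1],
                [0.0, 0.7]])            # child 2 (state_dim 2)

# Hypothetical child steady state covariances.
A_inf = np.array([[1.0]])
B_inf = np.array([[2.0, 0.0],
                  [0.0, 1.0]])

S_k = np.kron(A_k, B_k)                 # product state transition S_k = A_k ⊗ B_k
P_inf = np.kron(A_inf, B_inf)           # product steady state covariance
Q_k = P_inf - S_k @ P_inf @ S_k.T       # process covariance, as in SDEKernel
print(S_k.shape, Q_k.shape)  # (2, 2) (2, 2)
```

The product kernel's state dimension is the product of the child state dimensions, as the Kronecker product shapes show.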
kernels – An iterable over the kernels to be multiplied together.
Return the state transition. This is the Kronecker product of the child state transitions.
transition_times – A tensor of times at which to produce matrices, shape batch_shape + [num_transitions].
time_deltas – A tensor of time gaps for which to produce matrices, shape batch_shape + [num_transitions].
Return the feedback matrix. This is the Kronecker sum of the child feedback matrices.
Return the steady state covariance. This is the Kronecker product of the child steady state covariances.
Generate the emission matrix. This is the Kronecker product of all the child emission matrices.
IndependentMultiOutput
Takes a concatenated state space model consisting of multiple child kernels and projects the state space associated with each kernel into a separate observation vector.
The result is similar to training several kernels on the same data separately, except that because of the covariance terms in the state space there can be correlation between the separate observation vectors.
kernels – An iterable over child kernels which will have their state spaces concatenated together.
Generate the emission matrix \(H\). This is the direct sum of the child emission matrices, for example:
\(H = \begin{bmatrix} H₁ & 0 \\ 0 & H₂ \end{bmatrix}\)
…where \(\{Hᵢ\}ₙ\) are the emission matrices of the child kernels.
The emission model associated with this kernel.
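The direct-sum projection can be sketched with plain NumPy/SciPy (not the markovflow API; the child emission matrices below are hypothetical):

```python
import numpy as np
from scipy.linalg import block_diag

# Hypothetical child emission matrices: a Matern-3/2 child (state_dim 2)
# and an OU child (state_dim 1), each with output_dim 1.
H1 = np.array([[1.0, 0.0]])
H2 = np.array([[1.0]])

# Direct sum: each output row reads only its own child's state block.
H = block_diag(H1, H2)
print(H)
# [[1. 0. 0.]
#  [0. 0. 1.]]
```

Each row of \(H\) projects one child's state into its own observation, giving one output per child kernel.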
FactorAnalysisKernel
Produces an emission model which performs a linear mixing of Gaussian processes according to a known time-varying weight function and a learnable loading matrix:
\(f(t) = A(t)\, B\, g(t)\)
\(\{fᵢ\}ₙ\) are the observable processes
\(\{gₖ\}ₘ\) are the latent GPs
\(A^{n × m}\) is a known, possibly time-dependent, weight matrix
\(B^{m × m}\) is either the identity or a trainable loading matrix
weight_function – A function that, given TensorType time points with shape batch_shape + [num_data, ], returns a weight matrix with the relative mixing of the tensors, with shape batch_shape + [num_data, output_dim, n_latents].
kernels – An iterable over child kernels that will have their state spaces concatenated together, with shape [n_latents, ].
output_dim – The output dimension of the kernel. This should match the output_dim of the weight matrix returned by the weight function.
trainable – Whether the loading matrix \(B\) should be trainable.
Generate the emission matrix \(WH\), where \(H\) is the emission matrix of the underlying multi-output kernel and \(W = AB\).
time_points – The time points over which the emission model is defined, with shape batch_shape + [num_data, ].
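The \(WH\) construction can be sketched with plain NumPy/SciPy (not the markovflow API; the weight matrix, loading matrix, and child emissions below are hypothetical):

```python
import numpy as np
from scipy.linalg import block_diag

# Hypothetical setup: n = 3 observed processes, m = 2 latent GPs.
A = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])   # known weight matrix, shape [n, m]
B = np.eye(2)                # loading matrix (identity when not trainable)

# Emission matrix H of the underlying multi-output kernel: direct sum of
# hypothetical child emissions with state_dims 2 and 1.
H = block_diag(np.array([[1.0, 0.0]]), np.array([[1.0]]))   # shape [m, state_dim]

W = A @ B                    # combined mixing, shape [n, m]
WH = W @ H                   # emission matrix of the kernel, [n, state_dim]
print(WH.shape)  # (3, 3)
```

Each row of \(WH\) mixes the latent states into one observable process, so the second output here is an equal blend of the two latent GPs.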
StackKernel
Implements the state space model of multiple kernels that have been combined together. Unlike a ConcatKernel, it manages the multiple kernels by introducing a leading dimension (stacking), rather than forming a block diagonal form of each parameter explicitly.
The prior of both a StackKernel and a ConcatKernel is the same (independent). However, posterior state space models built upon a StackKernel will maintain this independence, in contrast to the posteriors built upon a ConcatKernel, which model correlations between the processes.
Combined with different emission models this can give rise to a multi-output stack kernel, and perhaps in the future an additive kernel.
The state space of this kernel consists of all the state spaces of the child kernels stacked (in the tensorflow.stack sense) together, with zero padding when the state space of one of the kernels is larger than any of the others:
\(x(t) = \mathrm{stack}\left(\begin{bmatrix} x₁⁽¹⁾(t) \\ ⋮ \\ 0 \end{bmatrix}, …, \begin{bmatrix} x₁⁽ᵐ⁾(t) \\ x₂⁽ᵐ⁾(t) \\ ⋮ \end{bmatrix}\right)\)
…where \(m\) is the number of kernels / outputs.
\(\dfrac{dx(t)}{dt} = \begin{bmatrix} F⁽¹⁾ x⁽¹⁾(t) \\ ⋮ \\ F⁽ᵐ⁾ x⁽ᵐ⁾(t) \end{bmatrix} + \begin{bmatrix} L⁽¹⁾ w⁽¹⁾(t) \\ ⋮ \\ L⁽ᵐ⁾ w⁽ᵐ⁾(t) \end{bmatrix},\qquad f(t) = \begin{bmatrix} H⁽¹⁾ x⁽¹⁾(t) \\ ⋮ \\ H⁽ᵐ⁾ x⁽ᵐ⁾(t) \end{bmatrix}\)
kernels – A list of child kernels that will have their state spaces stacked together. Since we model each output independently, the length of the kernel list defines the number of outputs. Note that each child kernel must have an output_dim of 1.
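The zero-padded stacking can be sketched with plain NumPy (not the markovflow API; the child state transitions and the pad_to helper below are hypothetical):

```python
import numpy as np

# Hypothetical child state transitions for one time step: an OU child
# (state_dim 1) and a Matern-3/2-style child (state_dim 2).
A1 = np.array([[0.9]])
A2 = np.array([[0.8, 0.1],
               [0.0, 0.7]])

max_dim = 2   # largest state_dim across the child kernels


def pad_to(A, d):
    """Zero-pad a [k, k] matrix into the top-left corner of a [d, d] matrix."""
    out = np.zeros((d, d))
    out[: A.shape[0], : A.shape[1]] = A
    return out


# Stack along a new leading kernel dimension, rather than forming a
# block diagonal as a ConcatKernel would.
stacked = np.stack([pad_to(A1, max_dim), pad_to(A2, max_dim)])
print(stacked.shape)  # (2, 2, 2), i.e. [num_kernels, state_dim, state_dim]
```

The leading num_kernels dimension is what the batch_shape compatibility check below enforces.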
_check_batch_shape_is_compatible
Helper method to check the compatibility of batch_shape. For the StackKernel the batch_shape must have the following shape:
(…, num_kernels)
In any other case this method raises a tf.errors.InvalidArgumentError.
batch_shape – A tuple with the shape to check.
The state transition matrix is the stacked matrix of the child state transition matrices, padded with zeros (if necessary) to match the largest state dim across kernels.
time_deltas – A tensor of time gaps for which to produce matrices, with shape batch_shape + [num_transitions] where batch_shape = (..., num_kernels).
A tensor with shape batch_shape + [num_transitions, state_dim, state_dim] where batch_shape = (..., num_kernels).
We override SDEKernel.initial_mean() from the parent class to check there is a compatible batch_shape.
batch_shape – A tuple of leading dimensions for the initial mean, where batch_shape can be (..., num_kernels).
A tensor of zeros with shape batch_shape + [state_dim], where batch_shape = (..., num_kernels).
We override SDEKernel.state_offsets() from the parent class to check there is a compatible batch_shape.
Return the feedback matrix. This is the stacked matrix of child feedback matrices, padded with zeros to have matching state dims.
A tensor with shape [num_kernels, state_dim, state_dim].
Return the steady state covariance. This is the stacked matrix of child steady state covariance matrices, padded with the identity (if necessary) to have matching state dims.
Note that we further append a singleton dimension after the num_kernels dimension so it can broadcast across the number of data points.
A tensor with shape [num_kernels, 1, state_dim, state_dim].
This is typically the covariance \(P∞\) of the stationary distribution of \(x\).
We override SDEKernel.initial_covariance() from the parent class to check there is a compatible batch_shape.
initial_time_point – The time point associated with the first state, shape batch_shape + [1,].
A tensor with shape batch_shape + [state_dim, state_dim], where batch_shape = (..., num_kernels).
IndependentMultiOutputStack
Bases: StackKernel
Takes a stacked state space model consisting of multiple child kernels and projects the state space associated with each kernel into a separate observation vector.
The result is similar to training several kernels on the same data separately. There will be no correlations between the processes, in the prior or the posterior.
kernels – An iterable over child kernels which will have their state spaces stacked together. Since we model each output independently, the length of the kernel list defines the number of outputs.
Generate the emission matrix \(H\). This is a stacking of the child emission matrices, which are first augmented (if necessary) so that they have the same state_dim.
time_points – The time points over which the emission model is defined, with shape batch_shape + [num_data] where batch_shape = (..., num_kernels).
Overrides the base class SDEKernel.__add__() method.
Overrides the base class SDEKernel.__mul__() method.