markovflow.ssm_gaussian_transformations
Module transforming identities to and from expectation and natural parameters.
ssm_to_expectations
Transform a StateSpaceModel to the expectation parameters of the equivalent Gaussian distribution.
The expectation parameters are defined as the expected value of the sufficient statistics \(𝔼[φ(x)]\), where \(φ(x)\) are the sufficient statistics. For a Gaussian distribution described via a state space model, the sufficient statistics are \(x\) and the block-tridiagonal part of \(xxᵀ\).
The expectation parameters \(η\) and \(Η\) are therefore given by:
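\[
\eta = \begin{bmatrix} \mu_0 \\ \mu_1 \\ \vdots \\ \mu_{n-1} \\ \mu_n \end{bmatrix},
\qquad
\mathrm{H} = \begin{bmatrix}
\Sigma_0 + \mu_0 \mu_0^\top & \Sigma_0 A_1^\top + \mu_0 \mu_1^\top & & \\
A_1 \Sigma_0 + \mu_1 \mu_0^\top & \Sigma_1 + \mu_1 \mu_1^\top & \Sigma_1 A_2^\top + \mu_1 \mu_2^\top & \\
& \ddots & \ddots & \Sigma_{n-1} A_n^\top + \mu_{n-1} \mu_n^\top \\
& & A_n \Sigma_{n-1} + \mu_n \mu_{n-1}^\top & \Sigma_n + \mu_n \mu_n^\top
\end{bmatrix},
\]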
…where:
\(μᵢ\) and \(Σᵢ\) are the marginal means and covariances at each time step \(i\)
\(Aᵢ\) are the transition matrices of the state space model
ssm – The object to transform to expectation parameters.
A tuple containing the 3 expectation parameters:
eta_linear corresponds to \(η\) with shape [..., N+1, D]
eta_diag corresponds to the block diagonal part of \(Η\) with shape [..., N+1, D, D]
eta_subdiag corresponds to the lower block sub-diagonal of \(Η\) with shape [..., N, D, D]
Note each returned object in the tuple is a TensorType.
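The definitions above can be illustrated with a minimal NumPy sketch. It is not the markovflow implementation: the helper name `expectations_from_ssm` is hypothetical, it takes raw arrays instead of a StateSpaceModel, it is unbatched, and it assumes the recursion \(μᵢ = Aᵢμᵢ₋₁ + bᵢ\), \(Σᵢ = AᵢΣᵢ₋₁Aᵢᵀ + Qᵢ\) for the marginals.

```python
import numpy as np

def expectations_from_ssm(As, offsets, P0, Qs, mu0):
    """Hypothetical helper (not the markovflow API), unbatched.

    As: [N, D, D], offsets: [N, D], P0: [D, D], Qs: [N, D, D], mu0: [D].
    """
    N, D = offsets.shape
    mus = np.zeros((N + 1, D))
    Sigmas = np.zeros((N + 1, D, D))
    mus[0], Sigmas[0] = mu0, P0
    # Forward recursion for the marginal means and covariances (assumed model).
    for i in range(N):
        mus[i + 1] = As[i] @ mus[i] + offsets[i]
        Sigmas[i + 1] = As[i] @ Sigmas[i] @ As[i].T + Qs[i]

    def outer(a, b):
        # Batched outer product a_n b_n^T.
        return np.einsum("ni,nj->nij", a, b)

    eta_linear = mus                                            # [N+1, D]
    eta_diag = Sigmas + outer(mus, mus)                         # [N+1, D, D]
    eta_subdiag = As @ Sigmas[:-1] + outer(mus[1:], mus[:-1])   # [N, D, D]
    return eta_linear, eta_diag, eta_subdiag

# Scalar example with one transition (N = 1, D = 1).
As = np.array([[[0.5]]])
offsets = np.array([[1.0]])
P0 = np.array([[2.0]])
Qs = np.array([[[1.0]]])
mu0 = np.array([1.0])
eta_linear, eta_diag, eta_subdiag = expectations_from_ssm(As, offsets, P0, Qs, mu0)
print(eta_linear.shape, eta_diag.shape, eta_subdiag.shape)  # (2, 1) (2, 1, 1) (1, 1, 1)
```

The three outputs follow the shapes listed above, with the batch dimensions `...` dropped for brevity.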
expectations_to_ssm_params
Transform the expectation parameters to parameters of a StateSpaceModel.
The covariance of the joint distribution is given by:
…which results in:
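\[
\Sigma = \begin{bmatrix}
\Sigma_0 & \Sigma_0 A_1^\top & \Sigma_0 A_1^\top A_2^\top & \cdots & & \\
A_1 \Sigma_0 & \Sigma_1 & \Sigma_1 A_2^\top & \Sigma_1 A_2^\top A_3^\top & \cdots & \\
A_2 A_1 \Sigma_0 & A_2 \Sigma_1 & \Sigma_2 & \Sigma_2 A_3^\top & \cdots & \\
\vdots & \vdots & \ddots & \ddots & \ddots & \Sigma_{n-1} A_n^\top \\
\cdots & & & & A_n \Sigma_{n-1} & \Sigma_n
\end{bmatrix},
\]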
…where:
\(Σᵢ\) are the marginal covariances at each time step \(i\)
\(Aᵢ\) are the transition matrices of the state space model
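As an illustration of this block structure, the following NumPy sketch assembles the dense joint covariance for a small model from the marginal covariances and the transition matrices. The model sizes, the random values, and the helper names `cross_cov` and `blk` are assumptions made for the example; nothing here is part of the markovflow API.

```python
import numpy as np

# Small illustrative model: N = 2 transitions, state dimension D = 2.
N, D = 2, 2
rng = np.random.default_rng(0)
As = 0.5 * rng.standard_normal((N, D, D))     # As[k] plays the role of A_{k+1}
P0 = 2.0 * np.eye(D)                          # initial covariance
Qs = np.stack([np.eye(D), 0.5 * np.eye(D)])   # process noise covariances

# Marginal covariances: Sigma_0 = P_0, Sigma_{k+1} = A_{k+1} Sigma_k A_{k+1}^T + Q_{k+1}.
Sigmas = [P0]
for k in range(N):
    Sigmas.append(As[k] @ Sigmas[k] @ As[k].T + Qs[k])

def cross_cov(i, j):
    """Block (i, j) of the joint covariance for i >= j: A_i ... A_{j+1} Sigma_j."""
    block = Sigmas[j]
    for k in range(j, i):
        block = As[k] @ block
    return block

def blk(i):
    """Index range of the i-th D x D block in the dense matrix."""
    return slice(i * D, (i + 1) * D)

# Fill the lower triangle block by block and mirror it to the upper triangle.
Sigma = np.zeros(((N + 1) * D, (N + 1) * D))
for i in range(N + 1):
    for j in range(i + 1):
        Sigma[blk(i), blk(j)] = cross_cov(i, j)
        Sigma[blk(j), blk(i)] = cross_cov(i, j).T
```

The first block sub-diagonal of `Sigma` holds \(AᵢΣᵢ₋₁\) and the block diagonal holds \(Σᵢ\), which is all that is needed to recover the state space model parameters.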
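If we denote by \(Σᵢᵢ₋₁\) the lower block sub-diagonal of the joint covariance, and by \(Σᵢᵢ\) its block diagonal, then we can get the state space model parameters using the following identities, which follow from the block structure of \(Σ\) and from \(μᵢ = Aᵢμᵢ₋₁ + bᵢ\):
\[
A_i = \Sigma_{i,i-1} \Sigma_{i-1,i-1}^{-1}, \qquad
Q_i = \Sigma_{i,i} - A_i \Sigma_{i-1,i-1} A_i^\top, \qquad
P_0 = \Sigma_{0,0}, \qquad
b_i = \mu_i - A_i \mu_{i-1}.
\]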
eta_linear – Corresponds to \(η\) with shape [..., N+1, D].
eta_diag – Corresponds to the block diagonal part of \(Η\) with shape [..., N+1, D, D].
eta_subdiag – Corresponds to the lower block sub-diagonal of \(Η\) with shape [..., N, D, D].
A tuple containing the 5 parameters of the state space model in the following order:
As corresponds to the transition matrices \(Aᵢ\) with shape [..., N, D, D]
offsets corresponds to the state offset vectors \(bᵢ\) with shape [..., N, D]
chol_initial_covariance corresponds to the Cholesky of \(P₀\) with shape [..., D, D]
chol_process_covariances corresponds to the Cholesky of \(Qᵢ\) with shape [..., N, D, D]
initial_mean corresponds to the mean of the initial distribution \(μ₀\) with shape [..., D]
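A minimal NumPy sketch of this transformation is given below. It is an unbatched illustration, not the markovflow code: the name `ssm_params_from_expectations` is hypothetical, and it assumes that the sub-diagonal blocks of the joint covariance are \(AᵢΣᵢ₋₁\), so that \(Aᵢ = Σᵢᵢ₋₁(Σᵢ₋₁ᵢ₋₁)⁻¹\), \(Qᵢ = Σᵢᵢ - AᵢΣᵢ₋₁ᵢ₋₁Aᵢᵀ\) and \(bᵢ = μᵢ - Aᵢμᵢ₋₁\).

```python
import numpy as np

def ssm_params_from_expectations(eta_linear, eta_diag, eta_subdiag):
    """Hypothetical helper (not the markovflow API), unbatched.

    eta_linear: [N+1, D], eta_diag: [N+1, D, D], eta_subdiag: [N, D, D].
    """
    mus = eta_linear
    # Remove the mean outer products to obtain covariance blocks.
    Sigma_diag = eta_diag - np.einsum("ni,nj->nij", mus, mus)
    Sigma_sub = eta_subdiag - np.einsum("ni,nj->nij", mus[1:], mus[:-1])
    # A_i = Sigma_{i,i-1} Sigma_{i-1,i-1}^{-1}; since the diagonal blocks are
    # symmetric, solve Sigma_{i-1,i-1} A_i^T = Sigma_{i,i-1}^T and transpose.
    As_T = np.linalg.solve(Sigma_diag[:-1], np.swapaxes(Sigma_sub, -1, -2))
    As = np.swapaxes(As_T, -1, -2)
    Qs = Sigma_diag[1:] - As @ Sigma_diag[:-1] @ As_T
    offsets = mus[1:] - np.einsum("nij,nj->ni", As, mus[:-1])
    chol_initial_covariance = np.linalg.cholesky(Sigma_diag[0])
    chol_process_covariances = np.linalg.cholesky(Qs)
    return As, offsets, chol_initial_covariance, chol_process_covariances, mus[0]

# Scalar example with one transition (N = 1, D = 1).
eta_linear = np.array([[1.0], [1.5]])
eta_diag = np.array([[[3.0]], [[3.75]]])
eta_subdiag = np.array([[[2.5]]])
As, offsets, chol_P0, chol_Qs, mu0 = ssm_params_from_expectations(
    eta_linear, eta_diag, eta_subdiag
)
```

The return order mirrors the tuple described above. The sketch uses a linear solve instead of an explicit inverse for the transition matrices, which is a common numerical choice and not a claim about how the library computes them.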
ssm_to_naturals
Transform a StateSpaceModel to the natural parameters of the equivalent Gaussian distribution.
The natural parameters \(θ\) and \(Θ\) are given by:
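\[
\theta = \begin{bmatrix}
P_0^{-1} \mu_0 - A_1^\top Q_1^{-1} b_1 \\
Q_1^{-1} b_1 - A_2^\top Q_2^{-1} b_2 \\
\vdots \\
Q_{n-1}^{-1} b_{n-1} - A_n^\top Q_n^{-1} b_n \\
Q_n^{-1} b_n
\end{bmatrix},
\qquad
\Theta = \begin{bmatrix}
-\tfrac{1}{2}(P_0^{-1} + A_1^\top Q_1^{-1} A_1) & A_1^\top Q_1^{-1} & & \\
Q_1^{-1} A_1 & -\tfrac{1}{2}(Q_1^{-1} + A_2^\top Q_2^{-1} A_2) & A_2^\top Q_2^{-1} & \\
& \ddots & \ddots & A_n^\top Q_n^{-1} \\
& & Q_n^{-1} A_n & -\tfrac{1}{2} Q_n^{-1}
\end{bmatrix}
\]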
…where:
\(bᵢ\), \(Aᵢ\) and \(Qᵢ\) are the state offsets, transition matrices and covariances of the state space model
\(μ₀\) and \(P₀\) are the mean and covariance of the initial state
ssm – The object to transform to natural parameters.
A tuple containing the 3 natural parameters:
theta_linear corresponds to \(θ\) with shape [..., N+1, D].
theta_diag corresponds to the block diagonal part of \(Θ\) with shape [..., N+1, D, D].
theta_subdiag corresponds to the lower block sub-diagonal of \(Θ\) with shape [..., N, D, D]
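The block formulas for \(θ\) and \(Θ\) can be written out directly in NumPy. The sketch below is a hedged, unbatched illustration that takes raw arrays instead of a StateSpaceModel; the helper name `naturals_from_ssm` is hypothetical and explicit inverses are used only for readability.

```python
import numpy as np

def naturals_from_ssm(As, offsets, P0, Qs, mu0):
    """Hypothetical helper (not the markovflow API), unbatched.

    As: [N, D, D], offsets: [N, D], P0: [D, D], Qs: [N, D, D], mu0: [D].
    """
    N, D = offsets.shape
    Q_inv = np.linalg.inv(Qs)                      # Q_i^{-1}, [N, D, D]
    P0_inv = np.linalg.inv(P0)
    At_Qinv = np.swapaxes(As, -1, -2) @ Q_inv      # A_i^T Q_i^{-1}

    # Precisions on the diagonal: [P_0^{-1}, Q_1^{-1}, ..., Q_n^{-1}].
    prec = np.concatenate([P0_inv[None], Q_inv], axis=0)
    # Smoothing terms A_{i+1}^T Q_{i+1}^{-1} A_{i+1}; none for the last block.
    smooth = np.concatenate([At_Qinv @ As, np.zeros((1, D, D))], axis=0)
    theta_diag = -0.5 * (prec + smooth)            # [N+1, D, D]
    theta_subdiag = Q_inv @ As                     # Q_i^{-1} A_i, [N, D, D]

    # Linear part: [P_0^{-1} mu_0, Q_i^{-1} b_i] minus A_{i+1}^T Q_{i+1}^{-1} b_{i+1}.
    forward = np.concatenate(
        [(P0_inv @ mu0)[None], np.einsum("nij,nj->ni", Q_inv, offsets)], axis=0
    )
    backward = np.concatenate(
        [np.einsum("nij,nj->ni", At_Qinv, offsets), np.zeros((1, D))], axis=0
    )
    theta_linear = forward - backward              # [N+1, D]
    return theta_linear, theta_diag, theta_subdiag

# Scalar example with one transition (N = 1, D = 1).
theta_linear, theta_diag, theta_subdiag = naturals_from_ssm(
    np.array([[[0.5]]]), np.array([[1.0]]), np.array([[2.0]]),
    np.array([[[1.0]]]), np.array([1.0]),
)
```

For this scalar example the joint precision is `[[0.75, -0.5], [-0.5, 1.0]]` and the joint mean is `[1.0, 1.5]`, so `theta_linear` equals the precision applied to the mean.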
ssm_to_naturals_no_smoothing
This is similar to ssm_to_naturals(), but in this case the natural parameters do not contain information from the future (smoothing). The updates regarding the smoothing have been pushed into the partition function, as described in:
@inproceedings{pmlr-v97-lin19b,
    title = {Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations},
    author = {Lin, Wu and Khan, Mohammad Emtiyaz and Schmidt, Mark},
    booktitle = {Proceedings of the 36th International Conference on Machine Learning},
    pages = {3992--4002},
    year = {2019},
    url = {http://proceedings.mlr.press/v97/lin19b.html},
}
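The natural parameters \(θ\) and \(Θ\) are given by:
\[
\theta = \begin{bmatrix}
P_0^{-1} \mu_0 \\
Q_1^{-1} b_1 \\
\vdots \\
Q_{n-1}^{-1} b_{n-1} \\
Q_n^{-1} b_n
\end{bmatrix},
\qquad
\Theta = \begin{bmatrix}
-\tfrac{1}{2} P_0^{-1} & A_1^\top Q_1^{-1} & & \\
Q_1^{-1} A_1 & -\tfrac{1}{2} Q_1^{-1} & A_2^\top Q_2^{-1} & \\
& \ddots & \ddots & A_n^\top Q_n^{-1} \\
& & Q_n^{-1} A_n & -\tfrac{1}{2} Q_n^{-1}
\end{bmatrix}
\]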
theta_linear corresponds to \(θ\) with shape [..., N+1, D]
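Without the smoothing terms, each block depends on a single time step only, which the following unbatched NumPy sketch makes explicit. The helper name `naturals_from_ssm_no_smoothing` is hypothetical and the code works on raw arrays, not on a StateSpaceModel.

```python
import numpy as np

def naturals_from_ssm_no_smoothing(As, offsets, P0, Qs, mu0):
    """Hypothetical helper (not the markovflow API), unbatched.

    As: [N, D, D], offsets: [N, D], P0: [D, D], Qs: [N, D, D], mu0: [D].
    """
    Q_inv = np.linalg.inv(Qs)
    P0_inv = np.linalg.inv(P0)
    # Diagonal precisions [P_0^{-1}, Q_1^{-1}, ..., Q_n^{-1}].
    prec = np.concatenate([P0_inv[None], Q_inv], axis=0)
    theta_diag = -0.5 * prec                        # no A^T Q^{-1} A terms
    theta_subdiag = Q_inv @ As                      # Q_i^{-1} A_i
    theta_linear = np.concatenate(
        [(P0_inv @ mu0)[None], np.einsum("nij,nj->ni", Q_inv, offsets)], axis=0
    )                                               # no A^T Q^{-1} b terms
    return theta_linear, theta_diag, theta_subdiag

# Scalar example with one transition (N = 1, D = 1).
theta_linear, theta_diag, theta_subdiag = naturals_from_ssm_no_smoothing(
    np.array([[[0.5]]]), np.array([[1.0]]), np.array([[2.0]]),
    np.array([[[1.0]]]), np.array([1.0]),
)
```

Compared with the smoothed form, only the block diagonal and the linear part change; the sub-diagonal blocks \(Qᵢ⁻¹Aᵢ\) are the same.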
naturals_to_ssm_params
Transform the natural parameters to parameters of a StateSpaceModel.
The precision of the joint distribution is given by:
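\[
P = \begin{bmatrix}
-2\Theta_{00} & -\Theta_{10}^\top & & \\
-\Theta_{10} & -2\Theta_{11} & -\Theta_{21}^\top & \\
& \ddots & \ddots & -\Theta_{n,n-1}^\top \\
& & -\Theta_{n,n-1} & -2\Theta_{nn}
\end{bmatrix},
\]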
…where \(Θᵢᵢ\) and \(Θᵢᵢ₋₁\) are the block diagonal and block sub-diagonal of the natural parameter \(Θ\):
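\[
\Theta = \begin{bmatrix}
-\tfrac{1}{2}(P_0^{-1} + A_1^\top Q_1^{-1} A_1) & A_1^\top Q_1^{-1} & & \\
Q_1^{-1} A_1 & -\tfrac{1}{2}(Q_1^{-1} + A_2^\top Q_2^{-1} A_2) & A_2^\top Q_2^{-1} & \\
& \ddots & \ddots & A_n^\top Q_n^{-1} \\
& & Q_n^{-1} A_n & -\tfrac{1}{2} Q_n^{-1}
\end{bmatrix},
\]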
…and where:
\(Aᵢ\) and \(Qᵢ\) are the state transition matrices and covariances of the state space model
\(P₀\) is the covariance of the initial state
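Inverting the precision gives us the joint covariance matrix \(Σ = P⁻¹\), which has the same block structure as the joint covariance shown for expectations_to_ssm_params.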
If we denote by \(Σᵢᵢ₋₁\) the lower block sub-diagonal of the joint covariance, and by \(Σᵢᵢ\) its block diagonal, we can get the state transition matrices from \(Aᵢ = Σᵢᵢ₋₁(Σᵢ₋₁ᵢ₋₁)⁻¹\), because the sub-diagonal blocks of the joint covariance are \(AᵢΣᵢ₋₁ᵢ₋₁\).
We then follow the SpInGP paper and create the matrices:
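\[
A^{-1} = \begin{bmatrix}
I & & & \\
-A_1 & I & & \\
& \ddots & \ddots & \\
& & -A_n & I
\end{bmatrix},
\qquad
Q = \begin{bmatrix}
P_0 & & & \\
& Q_1 & & \\
& & \ddots & \\
& & & Q_n
\end{bmatrix}
\]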
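…so that \(P = A⁻ᵀQ⁻¹A⁻¹\).
If we solve \((A⁻ᵀ)⁻¹ P\) we get:
\[
(A^{-\top})^{-1} P = Q^{-1} A^{-1},
\qquad
Q^{-1} A^{-1} = \begin{bmatrix}
P_0^{-1} & & & \\
-Q_1^{-1} A_1 & Q_1^{-1} & & \\
& \ddots & \ddots & \\
& & -Q_n^{-1} A_n & Q_n^{-1}
\end{bmatrix},
\]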
…where the block diagonal of \(Q⁻¹A⁻¹\) holds the process noise precisions \(Qᵢ⁻¹\) and the precision of the initial state \(P₀⁻¹\).
To get the offsets we follow a similar strategy but solve against \(θ\). First we write:
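\[
\theta = \begin{bmatrix}
P_0^{-1} \mu_0 - A_1^\top Q_1^{-1} b_1 \\
Q_1^{-1} b_1 - A_2^\top Q_2^{-1} b_2 \\
\vdots \\
Q_n^{-1} b_n
\end{bmatrix}
= \begin{bmatrix}
I & -A_1^\top & & \\
& I & -A_2^\top & \\
& & \ddots & \ddots \\
& & & I
\end{bmatrix}
\begin{bmatrix}
P_0^{-1} & & & \\
& Q_1^{-1} & & \\
& & \ddots & \\
& & & Q_n^{-1}
\end{bmatrix}
\begin{bmatrix} \mu_0 \\ b_1 \\ \vdots \\ b_n \end{bmatrix}.
\]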
Then we solve \((A⁻ᵀ)⁻¹θ\) to get:
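\[
(A^{-\top})^{-1} \theta = \begin{bmatrix}
P_0^{-1} & & & \\
& Q_1^{-1} & & \\
& & \ddots & \\
& & & Q_n^{-1}
\end{bmatrix}
\begin{bmatrix} \mu_0 \\ b_1 \\ \vdots \\ b_n \end{bmatrix}.
\]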
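Finally, multiplying by \(Q\) recovers the initial mean and the state offsets:
\[
\begin{bmatrix} \mu_0 \\ b_1 \\ \vdots \\ b_n \end{bmatrix} = Q (A^{-\top})^{-1} \theta.
\]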
theta_linear – Corresponds to \(θ\) with shape [..., N+1, D].
theta_diag – Corresponds to the block diagonal part of \(Θ\) with shape [..., N+1, D, D].
theta_subdiag – Corresponds to the lower block sub-diagonal of \(Θ\) with shape [..., N, D, D].
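The steps of this derivation can be traced in a small dense NumPy sketch. It is meant only as an illustration under stated assumptions: the helper name `ssm_params_from_naturals` is hypothetical, the code is unbatched, and it builds and inverts the full joint precision, which a practical implementation would avoid by exploiting the banded structure.

```python
import numpy as np

def ssm_params_from_naturals(theta_linear, theta_diag, theta_subdiag):
    """Hypothetical dense, unbatched helper (not the markovflow API)."""
    n_blocks, D = theta_linear.shape
    N = n_blocks - 1

    def blk(i):
        return slice(i * D, (i + 1) * D)

    # Joint precision P assembled from the blocks of Theta.
    P = np.zeros((n_blocks * D, n_blocks * D))
    for i in range(n_blocks):
        P[blk(i), blk(i)] = -2.0 * theta_diag[i]
    for i in range(N):
        P[blk(i + 1), blk(i)] = -theta_subdiag[i]
        P[blk(i), blk(i + 1)] = -theta_subdiag[i].T

    # Joint covariance and transitions A_i = Sigma_{i,i-1} Sigma_{i-1,i-1}^{-1}.
    Sigma = np.linalg.inv(P)
    As = np.stack([
        Sigma[blk(i + 1), blk(i)] @ np.linalg.inv(Sigma[blk(i), blk(i)])
        for i in range(N)
    ])

    # A^{-1}: identity blocks on the diagonal, -A_i on the sub-diagonal.
    A_inv = np.eye(n_blocks * D)
    for i in range(N):
        A_inv[blk(i + 1), blk(i)] = -As[i]

    # (A^{-T})^{-1} P = Q^{-1} A^{-1}; its block diagonal holds the precisions.
    Qinv_Ainv = np.linalg.solve(A_inv.T, P)
    covs = np.stack([np.linalg.inv(Qinv_Ainv[blk(i), blk(i)]) for i in range(n_blocks)])

    # [mu_0, b_1, ..., b_n] = Q (A^{-T})^{-1} theta.
    scaled = np.linalg.solve(A_inv.T, theta_linear.reshape(-1)).reshape(n_blocks, D)
    means = np.einsum("nij,nj->ni", covs, scaled)

    return (As, means[1:], np.linalg.cholesky(covs[0]),
            np.linalg.cholesky(covs[1:]), means[0])

# Scalar example with one transition (N = 1, D = 1).
As, offsets, chol_P0, chol_Qs, mu0 = ssm_params_from_naturals(
    np.array([[0.0], [1.0]]),
    np.array([[[-0.375]], [[-0.5]]]),
    np.array([[[0.5]]]),
)
```

The scalar inputs correspond to a model with \(A₁ = 0.5\), \(b₁ = 1\), \(P₀ = 2\), \(Q₁ = 1\) and \(μ₀ = 1\), which the sketch recovers.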
naturals_to_ssm_params_no_smoothing
This is similar to naturals_to_ssm_params() but in this case the natural parameters do not contain information from the future (smoothing). The updates regarding the smoothing have been pushed into the partition function.
We know that the natural parameters have the following form:
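\[
\Theta = \begin{bmatrix}
-\tfrac{1}{2} P_0^{-1} & A_1^\top Q_1^{-1} & & \\
Q_1^{-1} A_1 & -\tfrac{1}{2} Q_1^{-1} & A_2^\top Q_2^{-1} & \\
& \ddots & \ddots & A_n^\top Q_n^{-1} \\
& & Q_n^{-1} A_n & -\tfrac{1}{2} Q_n^{-1}
\end{bmatrix},
\qquad
\theta = \begin{bmatrix}
P_0^{-1} \mu_0 \\
Q_1^{-1} b_1 \\
\vdots \\
Q_n^{-1} b_n
\end{bmatrix}
= \begin{bmatrix}
P_0^{-1} & & & \\
& Q_1^{-1} & & \\
& & \ddots & \\
& & & Q_n^{-1}
\end{bmatrix}
\begin{bmatrix} \mu_0 \\ b_1 \\ \vdots \\ b_n \end{bmatrix}.
\]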
Since the block diagonal of \(Θ\) holds \(-½P₀⁻¹\) and \(-½Qᵢ⁻¹\), inverting the block diagonal of \(-2Θ\) gives the initial covariance and the process noise covariance matrices. Solving \(-2Θᵢᵢ\) against the sub-diagonal block \(Θᵢᵢ₋₁\) yields the state transition matrices \(Aᵢ\). Solving the block diagonal of \(-2Θ\) against \(θ\) yields the initial mean and the state offsets.
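Because every block can be handled independently, this inverse transformation reduces to a few batched solves. The following NumPy sketch illustrates it under the assumption that the inputs have the no-smoothing form shown above; the helper name `ssm_params_from_naturals_no_smoothing` is hypothetical and the code is unbatched.

```python
import numpy as np

def ssm_params_from_naturals_no_smoothing(theta_linear, theta_diag, theta_subdiag):
    """Hypothetical helper (not the markovflow API), unbatched.

    theta_linear: [N+1, D], theta_diag: [N+1, D, D], theta_subdiag: [N, D, D].
    """
    # Block-diagonal precisions [P_0^{-1}, Q_1^{-1}, ..., Q_n^{-1}].
    prec = -2.0 * theta_diag
    covs = np.linalg.inv(prec)                      # [P_0, Q_1, ..., Q_n]
    # Q_i (Q_i^{-1} A_i) = A_i.
    As = np.linalg.solve(prec[1:], theta_subdiag)
    # P_0 (P_0^{-1} mu_0) = mu_0 and Q_i (Q_i^{-1} b_i) = b_i.
    means = np.linalg.solve(prec, theta_linear[..., None])[..., 0]
    return (As, means[1:], np.linalg.cholesky(covs[0]),
            np.linalg.cholesky(covs[1:]), means[0])

# Scalar example with one transition (N = 1, D = 1).
As, offsets, chol_P0, chol_Qs, mu0 = ssm_params_from_naturals_no_smoothing(
    np.array([[0.5], [1.0]]),
    np.array([[[-0.25]], [[-0.5]]]),
    np.array([[[0.5]]]),
)
```

Unlike the smoothed case, no joint precision has to be assembled, so the cost is linear in the number of time steps.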