# State space models and the Kalman filter

Linear state-space models are used in time-series analysis for filtering, prediction, and smoothing problems. They assume that the observations are generated linearly from a latent linear dynamical system. Although many real-world processes are non-linear, linearity makes the model easy to analyze and efficient to estimate. In addition, many non-linear systems can be approximated by linear models, which makes the linear state-space model an important tool for time-series applications.

Consider the basic structural model with a *local level* term and a *trend* term:

\[ y_t=\mu_t+\lambda_t w_t+\epsilon_t \]

\[ \mu_{t+1}=\mu_t+v_t+W_{1t} \]

\[ v_{t+1}=v_t+W_{2t} \]

\[ \lambda_{t+1}=\lambda_t+W_{3t} \]

where \(\epsilon_t \sim N(0,\sigma_{y}^{2})\), \(W_{1t} \sim N(0,\sigma_{\mu}^{2})\), \(W_{2t} \sim N(0,\sigma_{v}^{2})\), and \(W_{3t}\) is the analogous disturbance on the intervention weight \(\lambda_t\). Here we allow the local level (intercept) and the trend (slope) to vary in time. The term *local* stands in contrast to *global*, where the level \(\mu\) is fixed (\(\sigma_{\mu}^{2}=0\)) and therefore constant across time.

The model also includes an intervention term \(\lambda_t w_t\), where \(\lambda_t\) is a weighting term and \(w_t\) is a dummy variable whose value is zero before the intervention and unity after the intervention.
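To see how the pieces of the model fit together, it can be simulated directly. The following is a minimal sketch in Python with NumPy, not a reference implementation. The function name `simulate_structural`, the intervention time `t_int`, and the standard deviation `sigma_lam` of \(W_{3t}\) are assumptions made for illustration. In particular, a zero-mean Gaussian is assumed for \(W_{3t}\), and \(w_t\) is taken to be one from `t_int` onward.

```python
import numpy as np

def simulate_structural(n, mu0, v0, lam0, t_int,
                        sigma_y, sigma_mu, sigma_v, sigma_lam, seed=0):
    """Simulate y_1..y_n from the structural model with an intervention term.

    sigma_lam (std. dev. of W_3t) and t_int are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(1, n + 1)
    w = (t >= t_int).astype(float)    # step dummy: 0 before t_int, 1 from t_int on
    mu, v, lam = mu0, v0, lam0        # states at t = 0
    y = np.empty(n)
    for i in range(n):
        # advance the states by one step; mu uses the previous slope v
        mu = mu + v + rng.normal(0.0, sigma_mu)
        v = v + rng.normal(0.0, sigma_v)
        lam = lam + rng.normal(0.0, sigma_lam)
        # observation equation
        y[i] = mu + lam * w[i] + rng.normal(0.0, sigma_y)
    return y, w
```

With all standard deviations set to zero the output reduces to a straight line plus a step of height `lam0` at `t_int`, which is a quick sanity check on the recursion.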

Setting all the noise terms \((\epsilon_t, W_{1t}, W_{2t})\) to zero and leaving out the intervention term yields the simple equation of a line with constant intercept and slope. At \(t=1\):

\[ y_1=\mu_1 \]

\[ \mu_1=\mu_0+v_0 \]

\[ v_1=v_0 \]

\[ y_1=\mu_0 + v_0 \]

At \(t=2\):

\[ y_2=\mu_2 \]

\[ \mu_2=\mu_1+v_1=\mu_0+v_0+v_0 \]

\[ v_2=v_1=v_0 \]

\[ y_2=\mu_0+2v_0 \]

At \(t=3\):

\[ y_3=\mu_3 \]

\[ \mu_3=\mu_2+v_2=\mu_0+v_0+v_0+v_0 \]

\[ v_3=v_2=v_1=v_0 \]

\[ y_3=\mu_0+3v_0 \]

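Therefore, with the state disturbances set to zero and the observation noise \(\epsilon_t\) restored, the linear trend model simplifies to the classical regression line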

\[ y_t=\mu_0+v_0g_t+\epsilon_t \]

where \(g_t=t\) for \(t=1,\ldots,n\) is simply the time index, and \(\mu_0\) and \(v_0\) are the initial values of the level and the slope.
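The step-by-step derivation can be checked numerically. The sketch below iterates the noise-free recursion and compares it with the closed form \(\mu_0+v_0 t\). The function name `deterministic_trend` and the example values are hypothetical.

```python
import numpy as np

def deterministic_trend(mu0, v0, n):
    """Iterate mu_{t+1} = mu_t + v_t, v_{t+1} = v_t with no noise; return y_1..y_n."""
    mu, v = mu0, v0
    y = []
    for _ in range(n):
        mu, v = mu + v, v   # state update with all disturbances set to zero
        y.append(mu)        # noise-free observation y_t = mu_t
    return np.array(y)

mu0, v0, n = 2.0, 0.5, 5
y = deterministic_trend(mu0, v0, n)
g = np.arange(1, n + 1)                  # g_t = t
print(np.allclose(y, mu0 + v0 * g))      # compares the recursion with the closed form
```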

The state space model above can be expressed algebraically in one unified formulation. Using matrix algebra, these models can be written in the following general format:

\[ y_t=Z_{t}^{T}\alpha_t+\epsilon_t \]

\[ \alpha_{t+1}=T_t \alpha_t + R_t \eta_t \]

The first equation is the *observation* or *measurement* equation because it links the observed data with the unobserved latent state \(\alpha_t\). The second equation is the *transition* or *state* equation because it defines how the latent state evolves over time. Here \(\alpha_t\) is the *state vector*, \(Z_t\) is the *observation* or *design vector*, and \(T_t\) is the *transition matrix*. \(R_t\) is usually an identity matrix; when it is not, it is called the *selection matrix*. Finally, \(\eta_t\) is the vector of *state disturbances*, with covariance matrix \(Q_t\).

We can express the *local linear trend* model in state space form:

\[ \alpha_t=\begin{pmatrix}\mu_t\\v_t\end{pmatrix}, \quad \eta_t=\begin{pmatrix}W_{1t}\\W_{2t}\end{pmatrix}, \quad T_t=\begin{bmatrix}1 & 1\\0 & 1\end{bmatrix}, \quad Z_t=\begin{pmatrix}1\\0\end{pmatrix} \]

\[ Q_t=\begin{bmatrix}\sigma_{\mu}^2 & 0\\0 & \sigma_{v}^2\end{bmatrix}, \quad R_t=\begin{bmatrix}1 & 0\\0 & 1\end{bmatrix} \]
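As an illustration, these system matrices can be written down in NumPy and used to simulate from the general form. This is a sketch under stated assumptions: the function names are hypothetical, the matrices are taken to be time-invariant, and \(Q_t\) is treated as the covariance of the state disturbances.

```python
import numpy as np

def local_linear_trend_matrices(sigma_mu, sigma_v):
    """System matrices of the local linear trend model (time-invariant case)."""
    T = np.array([[1.0, 1.0],
                  [0.0, 1.0]])               # transition matrix
    Z = np.array([1.0, 0.0])                 # design vector: y_t picks out the level
    R = np.eye(2)                            # selection matrix (identity here)
    Q = np.diag([sigma_mu**2, sigma_v**2])   # covariance of the state disturbances
    return T, Z, R, Q

def simulate_state_space(T, Z, R, Q, sigma_y, alpha0, n, seed=0):
    """Draw y_1..y_n and alpha_1..alpha_n from the general linear Gaussian form."""
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha0, dtype=float)
    ys, states = [], []
    for _ in range(n):
        eta = rng.multivariate_normal(np.zeros(Q.shape[0]), Q)
        alpha = T @ alpha + R @ eta                      # state equation
        ys.append(Z @ alpha + rng.normal(0.0, sigma_y))  # observation equation
        states.append(alpha)
    return np.array(ys), np.array(states)
```

Multiplying `T` by a state `(mu, v)` returns `(mu + v, v)`, which is the level and slope recursion written as a single matrix product.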

The primary tool for fitting state space models to data is the *Kalman filter*, which recursively computes the predictive distribution \(p(\alpha_{t+1}\mid y_{1:t})\) by combining \(p(\alpha_{t}\mid y_{1:t-1})\) with \(y_t\) using a standard set of formulas that is logically equivalent to linear regression.
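A minimal sketch of this recursion for a scalar observation is given below. It uses the textbook one-step-ahead prediction form of the Kalman filter and assumes time-invariant system matrices, a known observation variance `sigma_y2`, and a Gaussian prior on the first state with mean `a1` and covariance `P1`. The function and argument names are hypothetical.

```python
import numpy as np

def kalman_filter(y, T, Z, R, Q, sigma_y2, a1, P1):
    """Return the predictive means and covariances of alpha_1..alpha_{n+1}.

    a_pred[t], P_pred[t] describe p(alpha_{t+1} | y_1..y_t); index 0 is the prior.
    """
    n, m = len(y), len(a1)
    a = np.asarray(a1, dtype=float)
    P = np.asarray(P1, dtype=float)
    a_pred = np.empty((n + 1, m))
    P_pred = np.empty((n + 1, m, m))
    a_pred[0], P_pred[0] = a, P
    for t in range(n):
        v = y[t] - Z @ a                    # one-step-ahead prediction error
        F = Z @ P @ Z + sigma_y2            # variance of the prediction error
        K = T @ P @ Z / F                   # Kalman gain
        a = T @ a + K * v                   # predicted state mean
        P = T @ P @ (T - np.outer(K, Z)).T + R @ Q @ R.T   # predicted covariance
        a_pred[t + 1], P_pred[t + 1] = a, P
    return a_pred, P_pred
```

For example, with the local linear trend matrices, `Q` set to zero, and a very diffuse prior such as `P1 = 1e6 * np.eye(2)`, running the filter on data that lie exactly on a line yields predicted levels and slopes that approach that line, which gives a concrete sense of the connection to regression.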

*Intervention variables* can be added to assess the influence of an external change or stimulus on the development of a time series. Three possible interventions are the *level shift*, the *slope shift*, and the *pulse*, where the value suddenly changes at the moment of the intervention and then immediately returns to the value it had before the intervention took place. Changes in the value of the level shift and the slope shift are permanent after the intervention. A level shift can be expressed as follows:

\[ y_t=\mu_t+\lambda_t w_t+\epsilon_t \]

\[ \mu_{t+1}=\mu_t+W_{1t} \]

\[ \lambda_{t+1}=\lambda_t+W_{3t} \]

The dummy variable \(w_t\) equals zero at all time points before the intervention and equals unity at the time points after the intervention.
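The dummy variables for the three intervention types can be constructed as below. This is a sketch under assumptions: the intervention time `tau` and the function name are hypothetical, the level shift is taken to be one from `tau` onward, and the slope shift uses one common encoding (a ramp counting 1, 2, 3, ... from `tau`), which the text above does not spell out.

```python
import numpy as np

def intervention_dummies(n, tau):
    """Build w_t for t = 1..n for a level shift, a pulse, and a slope shift at tau."""
    t = np.arange(1, n + 1)
    level_shift = (t >= tau).astype(float)    # 0 before tau, 1 from tau onward
    pulse = (t == tau).astype(float)          # 1 only at tau, 0 elsewhere
    slope_shift = np.where(t >= tau, t - tau + 1, 0).astype(float)  # ramp from tau
    return level_shift, pulse, slope_shift

level_shift, pulse, slope_shift = intervention_dummies(n=6, tau=4)
```

Any of these arrays can play the role of \(w_t\) in the observation equation; only the shape of the regressor changes.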

The state space equations can also be cast as a probabilistic model such that for the measurement model we have \(y_t\sim p(y_t \mid \alpha_t)\) and for the latent state model we have \(\alpha_t\sim p(\alpha_t \mid \alpha_{t-1})\).
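For the linear Gaussian model in this section, and with \(Q_t\) read as the covariance matrix of the state disturbances \(\eta_t\), these two densities can be sketched explicitly as:

\[ y_t \mid \alpha_t \sim N\left(Z_{t}^{T}\alpha_t,\ \sigma_{y}^{2}\right) \]

\[ \alpha_t \mid \alpha_{t-1} \sim N\left(T_{t-1}\alpha_{t-1},\ R_{t-1}Q_{t-1}R_{t-1}^{T}\right) \]

The means come directly from the observation and transition equations, and the variances are those of \(\epsilon_t\) and \(R_{t-1}\eta_{t-1}\) respectively.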

Resources: