probabilistic machine learning for civil engineers 193
ascent method such as the Newton-Raphson algorithm presented in
§5.2. Note that because Newton-Raphson is a convex optimization
method, and because SSMs typically involve a non-convex (see
chapter 5) and non-identifiable (see §6.3.4) likelihood function, the
optimization is particularly sensitive to the initial values θ0. It is thus
essential to initialize parameters using engineering knowledge and
to try several starting locations.
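The multi-start strategy above can be sketched as follows. Note that the objective here is a hypothetical multimodal toy function standing in for the negative log-likelihood of an SSM, which in practice would be evaluated by running a Kalman filter over the data; the optimizer and starting-point ranges are likewise illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical stand-in for the negative log-likelihood of an SSM as a
# function of its parameters theta. A real implementation would run a
# Kalman filter and return the negative prediction-error log-likelihood.
def neg_log_likelihood(theta):
    # Toy multimodal objective: several local minima, like a
    # non-convex, non-identifiable SSM likelihood surface.
    return (theta[0] - 1.0) ** 2 + np.sin(5.0 * theta[0]) ** 2

# Try several starting locations theta_0 and keep the best local optimum.
rng = np.random.default_rng(0)
starts = rng.uniform(-2.0, 3.0, size=(10, 1))
results = [minimize(neg_log_likelihood, x0, method="BFGS") for x0 in starts]
best = min(results, key=lambda r: r.fun)
print(best.x, best.fun)
```

Different starting points converge to different local minima; only by comparing the final objective values across restarts can the best local optimum be retained.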
Note that a closed-form expectation maximization method exists
for estimating the parameters of SSMs.6 However, even if it is
mathematically elegant, in the context of civil engineering applications
such as those presented in the following sections, it is still sensitive
to the selection of the initial values and is thus easily trapped in
local maxima.

6. Ghahramani, Z. and G. E. Hinton (1996). Parameter estimation for linear dynamical systems. Technical Report CRG-TR-96-2, University of Toronto, Dept. of Computer Science.
12.1.5 Limitations and Practical Considerations
There are some limitations to the Kalman filter formulation. First,
it is sensitive to numerical errors when the covariance is rapidly
reduced in the measurement step, for example, when accurate
measurements are used or after a period with missing data, or
when there are orders of magnitude separating the eigenvalues
of the covariance matrix. Two alternative formulations that are
numerically more robust are the UD filter and the square-root filter.
The reader interested in these methods should consult specialized
textbooks such as the one by Simon7 or Gibbs.8

7. Simon, D. (2006). Optimal state estimation: Kalman, H infinity, and nonlinear approaches. Wiley.
8. Gibbs, B. P. (2011). Advanced Kalman filtering, least-squares and modeling: A practical handbook. Wiley.
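The ill-conditioning problem can be illustrated with a small sketch. The Joseph-form measurement update shown below is a standard numerically stable variant of the covariance update; it is simpler than the UD and square-root filters cited above, which go further by propagating a factor of the covariance matrix rather than the matrix itself. The state dimensions and noise values are illustrative assumptions:

```python
import numpy as np

# Kalman measurement update for the covariance, written two ways.
def update_naive(P, H, R):
    # Standard form: (I - K H) P. Can lose symmetry / positive
    # semi-definiteness under round-off in ill-conditioned cases.
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    return (np.eye(P.shape[0]) - K @ H) @ P

def update_joseph(P, H, R):
    # Joseph form: A P A' + K R K' with A = I - K H. Guaranteed
    # symmetric and positive semi-definite for any gain K.
    n = P.shape[0]
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    A = np.eye(n) - K @ H
    return A @ P @ A.T + K @ R @ K.T

# Eigenvalues of P separated by many orders of magnitude, plus a very
# accurate measurement: exactly the situation described in the text.
P = np.diag([1e8, 1e-8])
H = np.array([[1.0, 0.0]])
R = np.array([[1e-6]])
P_joseph = update_joseph(P, H, R)
print(np.allclose(P_joseph, P_joseph.T))
```

The Joseph form costs a few extra matrix products but keeps the updated covariance symmetric and positive semi-definite, which is why numerically robust filter implementations prefer it (or a factored UD/square-root variant) over the naive update.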
A second key limiting aspect is that the formulations of the
Kalman filter and smoother are only applicable to linear observation
and transition models. When the problems at hand require nonlinear
models, there are two main types of alternatives. First, there
are the unscented Kalman filter9 and extended Kalman filter,10
which allow for an approximation of the Kalman algorithm while
still describing the hidden states using multivariate Normal random
variables. The second option is to employ sampling methods such
as the particle filter,11 where the hidden-state variables are described
by particles (i.e., samples) that are propagated through the
transition model. The weight of each particle is updated using the
likelihood computed from the observation model. Particle filtering is
suited to estimating PDFs that are not well represented by the multivariate
Normal. The reader interested in these nonlinear-compatible
formulations can consult Murphy's12 textbook for further details.

9. Wan, E. A. and R. van der Merwe (2000). The unscented Kalman filter for nonlinear estimation. In Adaptive Systems for Signal Processing, Communications, and Control Symposium, 153–158. IEEE.
10. Ljung, L. (1979). Asymptotic behavior of the extended Kalman filter as a parameter estimator for linear systems. IEEE Transactions on Automatic Control 24(1), 36–50.
11. Del Moral, P. (1997). Non-linear filtering: interacting particle resolution. Comptes Rendus de l'Académie des Sciences, Series I, Mathematics 325(6), 653–658.
12. Murphy, K. P. (2012). Machine learning: A probabilistic perspective. MIT Press.
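The propagate-weight-resample cycle described above can be sketched as a minimal bootstrap particle filter. The nonlinear transition and observation models below are toy assumptions chosen for illustration, not models from this book:

```python
import numpy as np

# Assumed toy nonlinear SSM:
#   x_t = 0.5 x_{t-1} + sin(x_{t-1}) + process noise
#   y_t = x_t^2 / 2 + measurement noise
rng = np.random.default_rng(1)

def transition(x):
    # Propagate each particle through the (nonlinear) transition model.
    return 0.5 * x + np.sin(x) + rng.normal(0.0, 0.5, size=x.shape)

def log_likelihood(y, x):
    # Gaussian observation model with sigma = 0.3 (unnormalized).
    return -0.5 * ((y - x**2 / 2.0) / 0.3) ** 2

# Simulate a short sequence of observations from the true model.
n_particles, T = 500, 20
x_true, ys = 1.0, []
for _ in range(T):
    x_true = 0.5 * x_true + np.sin(x_true) + rng.normal(0.0, 0.5)
    ys.append(x_true**2 / 2.0 + rng.normal(0.0, 0.3))

particles = rng.normal(1.0, 1.0, n_particles)
estimates = []
for y in ys:
    particles = transition(particles)             # propagate particles
    logw = log_likelihood(y, particles)           # weight by observation model
    w = np.exp(logw - logw.max())
    w /= w.sum()
    estimates.append(np.sum(w * particles))       # weighted posterior mean
    idx = rng.choice(n_particles, n_particles, p=w)
    particles = particles[idx]                    # multinomial resampling
print(estimates[-1])
```

Because the particles are arbitrary samples rather than the mean and covariance of a Normal, the filter can represent skewed or multimodal hidden-state PDFs; here, for instance, the squared observation makes the posterior bimodal in the sign of the state.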