probabilistic machine learning for civil engineers 121
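The likelihood function $f(\mathcal{D}_y|\mathcal{D}_x, \boldsymbol{\theta})$ describes the joint prior probability density of observations $\mathcal{D}_y$, given a set of parameters $\boldsymbol{\theta}$,

$$f(\mathcal{D}_y|\mathcal{D}_x, \boldsymbol{\theta}) = \int f(\mathcal{D}_y|\mathbf{g}) \cdot f(\mathbf{g}|\mathcal{D}_x, \boldsymbol{\theta})\, d\mathbf{g} = \mathcal{N}(\mathcal{D}_y; \boldsymbol{\mu}_Y, \boldsymbol{\Sigma}_Y).$$

Note: Theoretically, here, we should refer to $f(\mathcal{D}_y|\mathcal{D}_x, \boldsymbol{\theta})$ as the marginal likelihood because the hidden variables $\mathbf{g}$ are marginalized.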
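To make the marginalization concrete, the following Python sketch checks the integral numerically for a single scalar observation. It assumes an additive Gaussian observation model, $y = g + v$ with $v \sim \mathcal{N}(0, \sigma_V^2)$, and a Gaussian prior on $g$; the numerical values and the variable names are illustrative assumptions and are not taken from the text. Under these assumptions, the integral should agree with the closed form $\mathcal{N}(y; \mu_G, \sigma_G^2 + \sigma_V^2)$.

```python
import numpy as np
from scipy import integrate, stats

# Illustrative values (assumptions, not from the text)
mu_g, sigma_g = 1.0, 2.0   # prior mean and standard deviation of the hidden variable g
sigma_v = 0.5              # standard deviation of the observation error v
y = 2.3                    # one scalar observation

def integrand(g):
    # f(y | g) * f(g): density of y given g, times the prior density of g
    return (stats.norm.pdf(y, loc=g, scale=sigma_v)
            * stats.norm.pdf(g, loc=mu_g, scale=sigma_g))

# Marginalize the hidden variable g by numerical integration
marginal_numeric, _ = integrate.quad(integrand, -np.inf, np.inf)

# Closed form for the scalar case: the variances of g and v add up
marginal_closed = stats.norm.pdf(
    y, loc=mu_g, scale=np.sqrt(sigma_g**2 + sigma_v**2)
)

print(marginal_numeric, marginal_closed)
```

The two printed values should agree up to the integration tolerance, which illustrates why the marginal likelihood remains Gaussian in this setting.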
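The log-likelihood for a vector of observations $\mathbf{y} = [y_1\ y_2\ \cdots\ y_D]^\intercal$ is given by

$$\begin{aligned}
\ln f(\mathcal{D}_y|\mathcal{D}_x, \boldsymbol{\theta}) &= \ln \mathcal{N}(\mathbf{y}; \boldsymbol{\mu}_Y, \boldsymbol{\Sigma}_Y) \\
&= -\frac{1}{2}(\mathbf{y}-\boldsymbol{\mu}_Y)^\intercal \boldsymbol{\Sigma}_Y^{-1} (\mathbf{y}-\boldsymbol{\mu}_Y) - \frac{1}{2}\ln|\boldsymbol{\Sigma}_Y| - \frac{D}{2}\ln 2\pi.
\end{aligned}$$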
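As a minimal sketch, the log-likelihood above can be evaluated in Python with a Cholesky factorization, which avoids forming $\boldsymbol{\Sigma}_Y^{-1}$ explicitly. The function name `gaussian_log_likelihood`, the squared-exponential covariance, and all numerical values are assumptions chosen for illustration; the result is compared with `scipy.stats.multivariate_normal.logpdf` as a sanity check.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.stats import multivariate_normal

def gaussian_log_likelihood(y, mu_Y, Sigma_Y):
    """Evaluate ln N(y; mu_Y, Sigma_Y) using a Cholesky factorization."""
    D = y.shape[0]
    r = y - mu_Y
    c, lower = cho_factor(Sigma_Y, lower=True)
    alpha = cho_solve((c, lower), r)              # Sigma_Y^{-1} (y - mu_Y)
    log_det = 2.0 * np.sum(np.log(np.diag(c)))    # ln |Sigma_Y|
    return -0.5 * r @ alpha - 0.5 * log_det - 0.5 * D * np.log(2.0 * np.pi)

# Illustrative covariance (assumption): squared-exponential kernel plus noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 5)
Sigma_Y = (np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.3**2)
           + 0.1**2 * np.eye(5))
mu_Y = np.zeros(5)
y = rng.multivariate_normal(mu_Y, Sigma_Y)   # synthetic observations

ll = gaussian_log_likelihood(y, mu_Y, Sigma_Y)
ll_ref = multivariate_normal.logpdf(y, mean=mu_Y, cov=Sigma_Y)
print(ll, ll_ref)
```

The log-determinant is obtained from the diagonal of the Cholesky factor, since $\ln|\boldsymbol{\Sigma}_Y| = 2\sum_i \ln L_{ii}$ when $\boldsymbol{\Sigma}_Y = \mathbf{L}\mathbf{L}^\intercal$.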
Figure 8.22: Log-likelihood maximization.
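The optimal parameters $\boldsymbol{\theta}^*$ correspond to the parameter values for which the derivative of the likelihood equals zero, as illustrated in figure 8.22. For $\boldsymbol{\mu}_Y = \mathbf{0}$, the derivative of the log-likelihood is

$$\frac{\partial}{\partial \theta_j} \ln f(\mathcal{D}_y|\mathcal{D}_x, \boldsymbol{\theta}) = \frac{1}{2}\mathbf{y}^\intercal \boldsymbol{\Sigma}_Y^{-1} \frac{\partial \boldsymbol{\Sigma}_Y}{\partial \theta_j} \boldsymbol{\Sigma}_Y^{-1} \mathbf{y} - \frac{1}{2}\operatorname{tr}\left(\boldsymbol{\Sigma}_Y^{-1} \frac{\partial \boldsymbol{\Sigma}_Y}{\partial \theta_j}\right).$$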
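The derivative above can be checked numerically. The following Python sketch assumes a covariance of the form $\boldsymbol{\Sigma}_Y = \sigma_f^2 \exp\!\big(-\tfrac{1}{2} d^2/\ell^2\big) + \sigma_V^2 \mathbf{I}$ with parameters $\boldsymbol{\theta} = [\ell\ \sigma_f\ \sigma_V]^\intercal$; this kernel, the function names, and the data are illustrative assumptions, not the model used in the text. The analytic gradient is compared with central finite differences.

```python
import numpy as np

def covariance_and_derivatives(theta, x):
    """Sigma_Y and its partial derivatives w.r.t. theta = [ell, sf, sv]."""
    ell, sf, sv = theta
    n = len(x)
    d2 = (x[:, None] - x[None, :]) ** 2
    K = np.exp(-0.5 * d2 / ell**2)
    Sigma = sf**2 * K + sv**2 * np.eye(n)
    dSigma = [
        sf**2 * K * d2 / ell**3,   # d Sigma / d ell
        2.0 * sf * K,              # d Sigma / d sf
        2.0 * sv * np.eye(n),      # d Sigma / d sv
    ]
    return Sigma, dSigma

def log_likelihood(theta, x, y):
    """ln N(y; 0, Sigma_Y(theta))."""
    Sigma, _ = covariance_and_derivatives(theta, x)
    _, log_det = np.linalg.slogdet(Sigma)
    return (-0.5 * y @ np.linalg.solve(Sigma, y)
            - 0.5 * log_det - 0.5 * len(y) * np.log(2.0 * np.pi))

def log_likelihood_gradient(theta, x, y):
    """Analytic gradient: 0.5 a' dS a - 0.5 tr(S^-1 dS), with a = S^-1 y."""
    Sigma, dSigma = covariance_and_derivatives(theta, x)
    Sigma_inv = np.linalg.inv(Sigma)
    alpha = Sigma_inv @ y
    return np.array([
        0.5 * alpha @ dS @ alpha - 0.5 * np.trace(Sigma_inv @ dS)
        for dS in dSigma
    ])

# Synthetic data and parameter values (assumptions for illustration)
x = np.linspace(0.0, 5.0, 8)
y = np.sin(x)
theta = np.array([1.0, 1.0, 0.3])

grad_analytic = log_likelihood_gradient(theta, x, y)

# Central finite differences, one parameter at a time
eps = 1e-6
grad_fd = np.zeros(3)
for j in range(3):
    step = np.zeros(3)
    step[j] = eps
    grad_fd[j] = (log_likelihood(theta + step, x, y)
                  - log_likelihood(theta - step, x, y)) / (2.0 * eps)

print(grad_analytic)
print(grad_fd)
```

Explicitly inverting $\boldsymbol{\Sigma}_Y$ keeps the sketch close to the formula; for large datasets a Cholesky-based solve would normally be preferred.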
Efficient gradient-based optimization algorithms for learning parameters by maximizing the log-likelihood are already implemented in open-source packages such as the GPML,³ GPstuff,⁴ and pyGPs⁵ toolboxes.

³ Rasmussen, C. E. and H. Nickisch (2010). Gaussian processes for machine learning (GPML) toolbox. Journal of Machine Learning Research 11, 3011–3015.
⁴ Vanhatalo, J., J. Riihimäki, J. Hartikainen, P. Jylänki, V. Tolvanen, and A. Vehtari (2013). GPstuff: Bayesian modeling with Gaussian processes. Journal of Machine Learning Research 14, 1175–1179.
⁵ Neumann, M., S. Huang, D. E. Marthaler, and K. Kersting (2015). pyGPs: A python library for Gaussian process regression and classification. Journal of Machine Learning Research 16 (1), 2611–2616.
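The toolboxes cited above each have their own interface, which is not reproduced here. As a generic stand-in, the following Python sketch maximizes the log-likelihood with SciPy's L-BFGS-B optimizer by minimizing its negative and supplying the analytic gradient. The squared-exponential covariance, the parameterization $\boldsymbol{\theta} = [\ell\ \sigma_f\ \sigma_V]^\intercal$, the bounds, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood_and_grad(theta, x, y):
    """Negative log-likelihood and its gradient for mu_Y = 0."""
    ell, sf, sv = theta
    n = len(x)
    d2 = (x[:, None] - x[None, :]) ** 2
    K = np.exp(-0.5 * d2 / ell**2)
    Sigma = sf**2 * K + sv**2 * np.eye(n)
    Sigma_inv = np.linalg.inv(Sigma)
    alpha = Sigma_inv @ y
    _, log_det = np.linalg.slogdet(Sigma)
    ll = -0.5 * y @ alpha - 0.5 * log_det - 0.5 * n * np.log(2.0 * np.pi)
    # Partial derivatives of Sigma w.r.t. ell, sf, and sv
    dSigma = [sf**2 * K * d2 / ell**3, 2.0 * sf * K, 2.0 * sv * np.eye(n)]
    grad = np.array([
        0.5 * alpha @ dS @ alpha - 0.5 * np.trace(Sigma_inv @ dS)
        for dS in dSigma
    ])
    return -ll, -grad   # signs flipped because the optimizer minimizes

# Synthetic data drawn from a zero-mean Gaussian with known parameters
rng = np.random.default_rng(1)
n = 30
x = np.linspace(0.0, 10.0, n)
theta_true = np.array([1.5, 1.0, 0.2])
d2 = (x[:, None] - x[None, :]) ** 2
Sigma_true = (theta_true[1]**2 * np.exp(-0.5 * d2 / theta_true[0]**2)
              + theta_true[2]**2 * np.eye(n))
y = rng.multivariate_normal(np.zeros(n), Sigma_true)

theta0 = np.array([0.5, 0.5, 0.5])   # initial guess
bounds = [(1e-2, 1e1)] * 3           # keep all parameters positive
nll0, _ = neg_log_likelihood_and_grad(theta0, x, y)

res = minimize(neg_log_likelihood_and_grad, theta0, args=(x, y),
               jac=True, method="L-BFGS-B", bounds=bounds)
theta_star = res.x
print(theta_star, -res.fun)
```

Because the log-likelihood is generally not concave in $\boldsymbol{\theta}$, a gradient-based search of this kind returns a local maximum that can depend on the initial guess `theta0`.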
8.2.5 Example: Soil Contamination Characterization
This section presents an applied example of GPR for the characterization of contaminant concentration in soils. Take, for example, a set of observations $y_i$ of soil contaminant concentrations,

$$\mathcal{D} = \{\mathcal{D}_x, \mathcal{D}_y\} = \{(\mathbf{l}_i, y_i),\ i \in \{1:116\}\},$$

where $\mathbf{l}_i$ is a covariate vector associated with each concentration observation. These covariates denote the geographic coordinates (longitude, latitude, and depth) of each observation,

$$\underbrace{\mathbf{l}_i = [x\ y\ z]_i^\intercal}_{\text{geodesic coordinates}}.$$
Each observation is represented in figure 8.23 by a circle whose size is proportional to the contaminant concentration. The goal is to employ Gaussian process regression in order to model the contaminant concentration across the whole soil volume, given the available observations for 116 discrete locations.
Figure 8.23: Example of application of GPR for soil contamination characterization (axes: x [m] and y [m]). The iso-contours correspond to $\Pr([\ ] > 55\ \text{mg/kg}) = 0.5$, and the size of each circle is proportional to the observed contaminant concentration. (Adapted from Quach et al. (2017).)
Because a concentration must be a positive number, and because several of the observations available are close to zero, the observation model is defined in the log-space,

$$\underbrace{\ln y}_{\text{observation}} = \overbrace{g(\mathbf{l})}^{\text{true contaminant [ ] in log space}} + \underbrace{v}_{\text{meas. error in log space}},$$