13 Model Calibration
Figure 13.1: Example of a hard-coded finite-element model for which we want to infer the parameter values $\boldsymbol{\theta} = [\theta_1 \; \theta_2]^\intercal$ from sets of observations $(\mathbf{x}_i, y_i)$.
In civil engineering, model calibration is employed for the task of estimating the parameters of hard-coded physics-based models using empirical observations $\mathcal{D} = \{(\mathbf{x}_i, y_i), \forall i \in \{1{:}D\}\} = \{\mathcal{D}_x, \mathcal{D}_y\}$.
The term hard-coded physics-based refers to mathematical models made of constitutive laws and rules that are themselves based on physics or domain-specific knowledge. In contrast, the models presented in chapters 8, 9, and 12 are referred to as empirical models because they are built on empirical data with little or no hard-coded rules associated with the underlying physics behind the phenomena studied. The hard-coded model of a system is described by a function $g(\boldsymbol{\theta}, \mathbf{x})$, which depends on the covariates $\mathbf{x} = [x_1 \; x_2 \; \cdots \; x_X]^\intercal \in \mathbb{R}^X$ as well as on its model parameters $\boldsymbol{\theta} = [\theta_1 \; \theta_2 \; \cdots \; \theta_P]^\intercal$. The observed system responses $\mathcal{D}_y = \{y_1, y_2, \cdots, y_D\}$ depend on the covariates $\mathcal{D}_x = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_D\}$. Figure 13.1 presents an example of a model where the parameters $\boldsymbol{\theta}$ describe the boundary conditions and the initial tension in cables. For this example, the covariates $\mathbf{x}$ and $\mathbf{x}^*$ describe the locations where predictions are made, as well as the type of predictions, that is, displacement, rotation, strain, and so on. The vector $\mathbf{x}$ describes covariates for observed locations, and $\mathbf{x}^*$ those for the unobserved locations.
Nomenclature
Observed system responses: $\mathcal{D}_y = \{y_1, y_2, \cdots, y_D\}$
Covariates: $\mathcal{D}_x = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_D\}$
Parameter values: $\boldsymbol{\theta} = [\theta_1 \; \theta_2 \; \cdots \; \theta_P]^\intercal$
Model predictions: $g(\boldsymbol{\theta}, \mathbf{x}) = [g(\boldsymbol{\theta}, \mathbf{x}_1) \; \cdots \; g(\boldsymbol{\theta}, \mathbf{x}_D)]^\intercal$
Model calibration is not in itself a subfield of machine learning. Nevertheless, in this chapter, we explore how the machine learning methods presented in previous chapters can be employed to address the challenges associated with the probabilistic inference of model parameters for hard-coded models. This application is classified under the umbrella of unsupervised learning because, as we will see, the main task consists in inferring hidden-state variables and parameters that are themselves not observed.
Problem setups There are several typical setups where we want to employ empirical observations in conjunction with hard-coded
models. For the example presented in figure 13.1, if we have prior knowledge for the joint distribution of model parameters $f(\boldsymbol{\theta})$, we can propagate this prior knowledge through the model in order to quantify the uncertainty associated with predictions. This uncertainty propagation can be performed using, for instance, either Monte Carlo sampling (see §6.5) or first-order linearization (see §3.4.2). When the prior knowledge $f(\boldsymbol{\theta})$ is weakly informative, it may lead to a large variability in model predictions. In that situation, it becomes interesting to employ empirical observations to reduce the uncertainty related to the prior knowledge of model parameters. The key with model calibration is that model parameters are typically not directly observable. For instance, in figure 13.1 it is often not possible to directly measure the boundary-condition properties or to measure the cable internal tension. These properties have to be inferred from the observed structural responses $\mathcal{D}_y$. Another key aspect is that we typically build physics-based models $g(\boldsymbol{\theta}, \mathbf{x})$ because they can predict quantities that cannot be observed. For example, we may want to learn about model parameters $\boldsymbol{\theta}$ using observations of the static response of a structure defined by the covariates $\mathbf{x}$ and then employ the model to predict the unobserved responses $g(\boldsymbol{\theta}, \mathbf{x}^*)$, defined by other covariates $\mathbf{x}^*$. The vector $\mathbf{x}^*$ may describe unobserved locations or quantities such as the dynamic behavior instead of the static one employed for parameter estimation. One last possible problem setup is associated with the selection of model classes for describing a phenomenon.
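To make this concrete, here is a minimal sketch of Monte Carlo uncertainty propagation for a one-parameter model. The closed-form cantilever deflection used as a stand-in for $g(\boldsymbol{\theta}, \mathbf{x})$ and the Gaussian prior on $E$ are assumptions for illustration, not the chapter's exact setup; only the parameter values of figure 13.2 are taken from the text.

```python
import numpy as np

# Stand-in hard-coded model (assumed closed form): deflection [mm] at
# position x [mm] of a cantilever with tip load P and a rotational
# spring K at the support; E is in MPa, all other properties fixed.
L, P, I, K = 10_000.0, 5_000.0, 6.66e9, 1.75e11

def g(E, x):
    # Bending term + rigid-body rotation (P*L/K) contributed by the spring
    return P * x**2 * (3 * L - x) / (6 * E * I) + (P * L / K) * x

# Hypothetical weakly informative prior on E: N(35 GPa, (5 GPa)^2)
rng = np.random.default_rng(0)
E_samples = rng.normal(35_000.0, 5_000.0, size=10_000)  # MPa

# Propagate the prior through the model (Monte Carlo, cf. section 6.5)
w = g(E_samples, 5_000.0)  # predicted mid-length deflection per sample
print(f"deflection at x = L/2: mean {w.mean():.2f} mm, std {w.std():.2f} mm")
```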
There are three main challenges associated with model calibration: observation errors, prediction errors, and model complexity. The first challenge arises because the available observations $\mathcal{D}_y$ are in most cases contaminated by observation errors. The second is due to the discrepancy between the predictions of hard-coded physics-based models $g(\boldsymbol{\theta}, \mathbf{x})$ and the real system responses; that is, even when parameter values are known, the model remains an approximation of reality. The third challenge is related to the difficulties associated with choosing a model structure having the right complexity for the task at hand. In §13.1, we explore the impact of these challenges on the least-squares model calibration approach, which is still extensively used in practice. Then, the subsequent sections explore how to address some of these challenges using a hierarchical Bayesian approach combining concepts presented in chapters 6 and 8.
13.1 Least-Squares Model Calibration
Warning: What we will see in §13.1 is an example of what not to do when calibrating a hard-coded physics-based model, as well as why one should avoid it.
This section presents the pitfalls and limitations of common least-squares deterministic model calibration. With least-squares model calibration, the goal is to find the model parameters $\boldsymbol{\theta}^*$ that minimize the sum of the squares of the differences between predicted and measured values. The difference between predicted and observed values at one observed location $\mathbf{x}_i$ is defined as the residual

$$r(\boldsymbol{\theta}, (\mathbf{x}_i, y_i)) = y_i - g(\boldsymbol{\theta}, \mathbf{x}_i).$$
For the entire data set containing $D$ data points, the least-squares loss function is defined as

$$J(\boldsymbol{\theta}, \mathcal{D}) = \sum_{i=1}^{D} r(\boldsymbol{\theta}, (\mathbf{x}_i, y_i))^2 = \|\mathbf{r}(\boldsymbol{\theta}, \mathcal{D})\|^2.$$
Because of the square exponent in the loss function, its value is nonnegative: $J(\boldsymbol{\theta}, \mathcal{D}) \geq 0$. The task of identifying the optimal parameters $\boldsymbol{\theta}^*$ is formulated as a minimization problem where the goal is to minimize the sum of the squares of the residuals at each observed location,

$$\boldsymbol{\theta}^* = \underbrace{\arg\min_{\boldsymbol{\theta}} J(\boldsymbol{\theta}, \mathcal{D}) \equiv \arg\max_{\boldsymbol{\theta}} -J(\boldsymbol{\theta}, \mathcal{D})}_{\text{Optimization problem}}.$$

Gradient-based optimization methods such as those presented in chapter 5 can be employed to identify $\boldsymbol{\theta}^*$.
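As a sketch of how this minimization can be set up in practice, the following uses SciPy's general-purpose minimize on the loss $J(\boldsymbol{\theta}, \mathcal{D})$. The beam closed form and numerical values are the same illustrative assumptions as in the previous sketch, and the optimizer choice (BFGS with finite-difference gradients) is one option among many.

```python
import numpy as np
from scipy.optimize import minimize

# Same assumed beam model as in the previous sketch (E in MPa)
L, P, I, K = 10_000.0, 5_000.0, 6.66e9, 1.75e11

def g(E, x):
    return P * x**2 * (3 * L - x) / (6 * E * I) + (P * L / K) * x

# D = {D_x, D_y}: noise-free responses generated with E = 35 GPa
D_x = np.array([5_000.0, 10_000.0])  # x1 = L/2, x2 = L
D_y = g(35_000.0, D_x)

def J(theta):
    # Least-squares loss; theta[0] is E in GPa for better conditioning
    r = D_y - g(theta[0] * 1_000.0, D_x)  # residuals r_i = y_i - g(theta, x_i)
    return np.sum(r**2)

# Gradient-based minimization (gradients approximated by finite differences)
res = minimize(J, x0=np.array([20.0]), method="BFGS")
print(f"E* = {res.x[0]:.1f} GPa, J(theta*) = {res.fun:.2e}")  # expect E* ≈ 35
```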
Figure 13.2: The reference model $\check{g}(\check{E}, x)$ that is employed to generate simulated observations, with parameter values $K = 1.75 \times 10^{11}$ N·mm/rad, $L = 10$ m, $P = 5$ kN, $I = 6.66 \times 10^{9}$ mm$^4$, and $\check{E} = 35$ GPa.
13.1.1 Illustrative Examples
This section presents three examples based on simulated data using the reference beam model $\check{g}(\check{E}, x)$ presented in figure 13.2, along with the values for its parameters. Observations consist in displacement measurements made at positions $x_1 = L/2$ and $x_2 = L$. The first example corresponds to the idealized context where there are no observation or prediction errors because the model employed to infer parameter values is the same as the reference model employed to generate synthetic data. The second example employs a simplified model to infer parameters. Finally, the third example employs an overcomplex model to infer parameters. Note that the last two examples include observation errors. For all three cases, there is a single unknown parameter: the elastic modulus $E$, which characterizes the flexural stiffness of the beam.
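Since the chapter's closed-form expressions are not reproduced here, the sketch below assumes one plausible pair of models consistent with figure 13.2: a cantilever with tip load $P$ and a rotational support spring $K$ as the reference model, and the same beam without the spring as the simplified model of example #2. The function names and closed forms are assumptions for illustration.

```python
import numpy as np

# Parameter values from figure 13.2, in N and mm
L, P, I, K = 10_000.0, 5_000.0, 6.66e9, 1.75e11
E_ref = 35_000.0  # Ě = 35 GPa, in MPa

def g_ref(E, x):
    # Assumed reference model: bending + rigid rotation from the spring
    return P * x**2 * (3 * L - x) / (6 * E * I) + (P * L / K) * x

def g_simplified(E, x):
    # Assumed interpretation model of example #2: the spring is omitted
    return P * x**2 * (3 * L - x) / (6 * E * I)

x_obs = np.array([L / 2, L])  # measurement positions x1 = L/2, x2 = L
print("true deflections [mm]:", g_ref(E_ref, x_obs))  # ≈ [3.7, 10.0]
```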
Example #1: Exact model without observation errors The first example illustrates an idealized context where there are no observation or prediction errors because the model employed to infer parameter values $g(E, x)$ is the same as the one employed to generate synthetic data, $\check{g}(\check{E}, x)$. Figure 13.3 compares the predicted and true deflection curves along with the observed values. Here, because of the absence of model or measurement errors, the optimal parameter also coincides with its true value, $\theta^* = E^* = \check{E}$, and the predicted deflection obtained using the optimal parameter value $\theta^*$ coincides with the true deflection. Note that the loss function evaluated for $\theta^*$ equals zero. Even if least-squares model calibration worked perfectly in this example, this good performance cannot be generalized to real case studies because in this oversimplified example, there are no model or measurement errors.

Figure: Example #1 setup, comparing the reference model $\check{g}(\check{E}, x)$ with the interpretation model $g(E, x)$.
Figure 13.3: Least-squares model calibration where the interpretation model is identical to the reference model, $\check{g}(\check{E}, x) = g(\check{E}, x)$, and where there are no measurement errors. (Left: deflection [mm] as a function of position [m], comparing the predicted response, the true deflection, and the observations. Right: the loss function $J(\theta, \mathcal{D})$ as a function of the parameter value [GPa], with its minimum at $E^* = \check{E}$.)
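The right panel of figure 13.3 can be emulated with a one-dimensional scan of the loss. Under the same assumed closed form as in the earlier sketches, generating noise-free data from the model itself should recover $\check{E}$ exactly, with $J(\theta^*) = 0$:

```python
import numpy as np

L, P, I, K = 10_000.0, 5_000.0, 6.66e9, 1.75e11

def g(E, x):
    # Assumed beam closed form (same as previous sketches), E in MPa
    return P * x**2 * (3 * L - x) / (6 * E * I) + (P * L / K) * x

x_obs = np.array([L / 2, L])
y_obs = g(35_000.0, x_obs)  # exact model, no observation errors

# Scan J(E) over a grid of parameter values (step of 0.1 GPa)
E_grid = np.linspace(10_000.0, 45_000.0, 351)
J = np.array([np.sum((y_obs - g(E, x_obs)) ** 2) for E in E_grid])
print(f"E* = {E_grid[J.argmin()] / 1_000:.1f} GPa, min J = {J.min():.1e}")
```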
Figure: Example #2 setup, comparing the reference model $\check{g}(\check{E}, x)$ with the simplified interpretation model $g(E, x)$.
Example #2: Simplified model with observation errors The second example presents a more realistic case, where the model $g(E, x)$ employed to infer parameters contains a simplification in comparison with the real system described by the reference model $\check{g}(\check{E}, x)$. In this case, the simplification consists in omitting the rotational spring that is responsible for the nonzero initial rotation at the support. In addition, observations are affected by observation errors $v_i : V \sim \mathcal{N}(v; 0, 1)$ mm, where the observation model is defined as

$$y_i = g(\check{E}, x_i) + v_i, \quad V_i \perp\!\!\!\perp V_j, \; \forall i \neq j.$$
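In code, this observation model amounts to adding independent zero-mean Gaussian noise with a 1 mm standard deviation to the reference model's responses. The sketch below keeps the assumed closed form from the earlier sketches; the seed is arbitrary.

```python
import numpy as np

L, P, I, K = 10_000.0, 5_000.0, 6.66e9, 1.75e11

def g_ref(E, x):
    # Assumed reference model, including the support spring
    return P * x**2 * (3 * L - x) / (6 * E * I) + (P * L / K) * x

rng = np.random.default_rng(1)
x_obs = np.array([L / 2, L])
v = rng.normal(0.0, 1.0, size=x_obs.size)  # v_i ~ N(0, 1 mm^2), independent
y_obs = g_ref(35_000.0, x_obs) + v         # y_i = g(Ě, x_i) + v_i
print("simulated observations [mm]:", y_obs)
```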
Figure 13.4 compares the predicted and true deflection curves along with the observed values. We see that because of the combination of model simplification and observation errors, we obtain a higher loss for the correct parameter value $\check{E} = 35$ GPa than for the optimal value $E^* = 22$ GPa. As a result, the model predictions are also biased for any prediction location $x$. This second example illustrates how deterministic least-squares model calibration may identify wrong parameter values as well as lead to inaccurate predictions for measured and unmeasured locations.
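To see this bias mechanism end to end, the sketch below calibrates the simplified (spring-free) model against noisy data simulated from the reference model. With the assumed closed forms, $E^*$ lands well below $\check{E} = 35$ GPa; the exact value depends on the noise realization and on the assumed beam formulas, so it will not match the chapter's 22 GPa exactly.

```python
import numpy as np
from scipy.optimize import minimize

L, P, I, K = 10_000.0, 5_000.0, 6.66e9, 1.75e11

def g_ref(E, x):    # assumed reference model: includes the support spring
    return P * x**2 * (3 * L - x) / (6 * E * I) + (P * L / K) * x

def g_simpl(E, x):  # assumed interpretation model: spring omitted
    return P * x**2 * (3 * L - x) / (6 * E * I)

# Simulated noisy observations from the reference model (Ě = 35 GPa)
rng = np.random.default_rng(1)
x_obs = np.array([L / 2, L])
y_obs = g_ref(35_000.0, x_obs) + rng.normal(0.0, 1.0, size=x_obs.size)

def J(theta):
    # Least-squares loss for the simplified model; theta[0] is E in GPa
    return np.sum((y_obs - g_simpl(theta[0] * 1_000.0, x_obs)) ** 2)

res = minimize(J, x0=np.array([35.0]), method="Nelder-Mead")
print(f"E* = {res.x[0]:.1f} GPa (true Ě = 35 GPa)")  # biased low
```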