11 Bayesian Networks

Bayesian networks were introduced by Pearl[1] and are also known as belief networks and directed graphical models. They are the result of the combination of the probability theory covered in chapter 3 with graph theory, which employs graphs defined by nodes and links.

[1] Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. Morgan Kaufmann.

We saw in §3.2 that the chain rule allows formulating the joint probability for a set of random variables using conditional and marginal probabilities, for example,

p(x_1, x_2, x_3) = p(x_3|x_2, x_1) · p(x_2|x_1) · p(x_1).

Bayesian networks (BNs) employ nodes to represent random variables and directed links to describe the dependencies between them. Bayesian networks are probabilistic models where the goal is to learn the joint probability defined over the entire network. The joint probability for the structure encoded by these nodes and links is formulated using the chain rule. The key feature of Bayesian networks is that they allow building sparse models for which efficient variable-elimination algorithms exist to estimate any conditional probability from the joint probability.

BNs can be categorized as unsupervised learning,[2] where the goal is to estimate the joint probability density function (PDF) for a set of observed variables. In its most general form, we may seek to learn the structure of the BN itself. In this chapter, we restrict ourselves to the case where we know the graph structure, and the goal is to learn to predict unobserved quantities given some observed ones.

[2] Ghahramani, Z. (2004). Unsupervised learning. In Advanced Lectures on Machine Learning, Volume 3176, pp. 72–112. Springer.

[Figure 11.1: Example of a Bayesian network for representing the relationships between temperature, the presence of the flu virus, and being sick from the flu virus. Nodes T → V → F, with Temperature: t ∈ {cold, hot}; Virus: v ∈ {yes, no}; Flu: f ∈ {sick, ¬sick}.]

Flu virus example. Section 3.3.5 presented the distinction between correlation and causality using the flu virus example. A Bayesian network can be employed to model the dependence between the temperature, the presence of the flu virus, and being sick from the flu. We model our knowledge of these three quantities using discrete random variables that are represented by the nodes in figure 11.1. The arrows represent the dependencies between variables: the temperature T affects the virus prevalence V, which in turn affects


the probability of catching the virus and being sick F. The absence of a link between T and F indicates that the temperature and being sick from the flu are conditionally independent from each other. In the context of this example, conditional independence implies that T and F are independent when V is known. The joint probability for T, V, and F,

p(f, v, t) = p(f|v) · p(v|t) · p(t),   where p(f|v) = p(f|v, t) by conditional independence and p(v|t) · p(t) = p(v, t),

is obtained using the chain rule where, for each arrow, the conditional probabilities are described in a conditional probability table.

Virus example: marginal and conditional probability tables

p(t) = {p(cold), p(hot)} = {0.4, 0.6}

p(v|t):
            t = cold   t = hot
  v = yes     0.8        0.1
  v = no      0.2        0.9

p(f|v):
            v = yes    v = no
  f = sick    0.7        0
  f = ¬sick   0.3        1

Joint probability using the chain rule

p(v, t) = p(v|t) · p(t):
            t = cold      t = hot
  v = yes   0.8 × 0.4     0.1 × 0.6      =   0.32   0.06
  v = no    0.2 × 0.4     0.9 × 0.6          0.08   0.54

p(f, v, t) = p(f|v) · p(v, t):
  f = sick:    t = cold      t = hot
    v = yes    0.32 × 0.7    0.06 × 0.7   =   0.224   0.042
    v = no     0.08 × 0      0.54 × 0         0       0
  f = ¬sick:   t = cold      t = hot
    v = yes    0.32 × 0.3    0.06 × 0.3   =   0.096   0.018
    v = no     0.08 × 1      0.54 × 1         0.08    0.54

Variable elimination: marginalization

p(f, t) = Σ_v p(f, v, t):
              t = cold   t = hot
  f = sick      0.224      0.042
  f = ¬sick     0.176      0.558

p(f) = Σ_t p(f, t) = {f = sick: 0.266, f = ¬sick: 0.734}

p(t|f) = p(f, t)/p(f):
              t = cold   t = hot
  f = sick      0.84       0.16
  f = ¬sick     0.24       0.76

In the case where we observe F = f, we can employ the marginalization operation in order to obtain a conditional probability quantifying how the observation f ∈ {sick, ¬sick} changes the probability for the temperature T,

p(t|f) = p(f, t)/p(f) = Σ_v p(f, v, t) / Σ_t Σ_v p(f, v, t).
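As a check on these calculations, the chain rule and variable elimination above can be reproduced numerically. The following Python/NumPy sketch (array names and axis ordering are our own choices, not from the text) rebuilds the tables in the margin and recovers p(t|f):

```python
import numpy as np

# Axis order: f in {sick, ¬sick}, v in {yes, no}, t in {cold, hot}
p_t = np.array([0.4, 0.6])                   # p(t)
p_v_given_t = np.array([[0.8, 0.1],          # p(v|t): rows v, columns t
                        [0.2, 0.9]])
p_f_given_v = np.array([[0.7, 0.0],          # p(f|v): rows f, columns v
                        [0.3, 1.0]])

# Chain rule: p(f, v, t) = p(f|v) · p(v|t) · p(t)
p_vt = p_v_given_t * p_t                               # p(v, t), shape (v, t)
p_fvt = p_f_given_v[:, :, None] * p_vt[None, :, :]     # shape (f, v, t)

# Variable elimination by marginalization
p_ft = p_fvt.sum(axis=1)           # p(f, t) = Σ_v p(f, v, t)
p_f = p_ft.sum(axis=1)             # p(f)    = Σ_t p(f, t)
p_t_given_f = p_ft / p_f[:, None]  # p(t|f)  = p(f, t)/p(f)

print(p_ft)          # [[0.224 0.042] [0.176 0.558]]
print(p_f)           # [0.266 0.734]
print(p_t_given_f)   # ≈ [[0.84 0.16] [0.24 0.76]]
```

Each marginalization is a single sum over one array axis, which is exactly the variable-elimination step described above.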

In minimalistic problems such as this one, it is trivial to calculate the joint probability using the chain rule and to eliminate variables using marginalization. However, in practical cases involving dozens of variables with as many links between them, these calculations become computationally demanding. Moreover, in practice, we seldom know the marginal and conditional probability tables. A key interest of working with directed graphs is that efficient estimation methods are available to perform all those tasks.

Bayesian networks are applicable not only to discrete random variables but also to continuous ones, or a mix of both. In this chapter, we restrict ourselves to the study of BNs for discrete random variables. Note that the state-space models presented in chapter 12 can be seen as time-dependent Bayesian networks using Gaussian random variables with linear dependence models. This chapter presents the nomenclature employed to define graphical models, the methods for performing inference, and the methods allowing us to learn the conditional probabilities defining the dependencies between random variables. In addition, we present an introduction to time-dependent Bayesian networks that are referred to as dynamic Bayesian networks. For advanced topics regarding Bayesian networks, the reader is invited to consult specialized textbooks such as the one by Nielsen and Jensen[3] or Murphy's PhD thesis.[4]

[3] Nielsen, T. D. and F. V. Jensen (2007). Bayesian networks and decision graphs. Springer.
[4] Murphy, K. P. (2002). Dynamic Bayesian networks: Representation, inference and learning. PhD thesis, University of California, Berkeley.


11.1 Graphical Models Nomenclature

Bayesian networks employ a special type of graph: the directed acyclic graph (DAG). A DAG G = {U, E} is defined by a set of nodes U interconnected by a set of directed links E. In order to be acyclic, the directed links between variables cannot be defined such that there are self-loops or cycles in the graph. For a set of random variables, there are many ways to define links between variables, each one leading to the same joint probability. Note that directed links between variables are not required to describe causal relationships. Despite causality not being a requirement, it is a key to efficiency; if the directed links in a graphical model are assigned following the causal relationships, it generally produces sparse models requiring the definition of a smaller number of conditional probabilities than noncausal counterparts.
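To make the acyclicity requirement concrete, here is a minimal sketch (our own illustration, not from the text) that checks whether a set of directed links forms a DAG using Kahn's topological-sort algorithm: if some nodes can never have all their parents removed, the graph contains a cycle. The edge lists below are assumed for illustration and only loosely follow figure 11.2.

```python
from collections import deque

def is_dag(nodes, edges):
    """Check acyclicity with Kahn's algorithm.

    nodes: iterable of node labels (the set U)
    edges: iterable of (parent, child) directed links (the set E)
    """
    children = {u: [] for u in nodes}
    in_degree = {u: 0 for u in nodes}
    for parent, child in edges:
        children[parent].append(child)
        in_degree[child] += 1

    # Repeatedly remove nodes with no remaining parents
    queue = deque(u for u in nodes if in_degree[u] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for c in children[u]:
            in_degree[c] -= 1
            if in_degree[c] == 0:
                queue.append(c)
    return removed == len(in_degree)  # all nodes removed => no cycle

# An acyclic graph (hypothetical edges in the spirit of figure 11.2a) ...
print(is_dag(["X1", "X2", "X3", "X4"],
             [("X1", "X2"), ("X1", "X3"), ("X2", "X4"), ("X3", "X4")]))  # True
# ... and a graph with a cycle, as in figure 11.2b
print(is_dag(["X1", "X2", "X3"],
             [("X1", "X2"), ("X2", "X3"), ("X3", "X1")]))                # False
```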

[Figure 11.2: Example of dependencies represented by directed links between hidden (white) and observed (shaded) nodes describing random variables. (a) A directed acyclic graph (DAG) over X1, X2, X3, X4; (b) a directed graph over X1, X2, X3 containing a cycle.]

Figure 11.2a presents a directed acyclic graph, and (b) presents a directed graph containing a cycle, so that it cannot be modeled as a Bayesian network. In figure 11.2a, random variables are represented by nodes, where the observed variable X4 depends on the hidden variables X2 and X3. The directions of the links indicate that X2 and X3 are the parents of X4, that is, parents(X4) = {X2, X3}, and consequently, X4 is the child of X2 and X3. Each child is associated with a conditional probability table (CPT) whose size depends on the number of parents, p(x_i|parents(X_i)). Nodes without parents are described by their marginal prior probabilities p(x_i). The joint PDF p(U) for the entire Bayesian network is formulated using the chain rule,

p(U) = p(x_1, x_2, ···, x_X) = ∏_{i=1}^{X} p(x_i|parents(X_i)).

This application of the chain rule requires that, given its parents, each node is independent of its other ancestors.
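This factorization can be sketched in code. The following illustrative snippet (the function and variable names are our own, not from the text) evaluates p(U) for one full assignment of a discrete BN by multiplying each node's CPT entry conditioned on its parents, using the flu network of figure 11.1:

```python
def joint_prob(assignment, parents, cpt):
    """p(U) = Π_i p(x_i | parents(X_i)) for one full assignment.

    assignment: dict mapping node -> value
    parents:    dict mapping node -> tuple of parent nodes
    cpt:        dict mapping node -> {(value, parent values...): probability}
    """
    prob = 1.0
    for node, value in assignment.items():
        parent_values = tuple(assignment[p] for p in parents[node])
        prob *= cpt[node][(value,) + parent_values]
    return prob

# Flu network of figure 11.1: T -> V -> F
parents = {"T": (), "V": ("T",), "F": ("V",)}
cpt = {
    "T": {("cold",): 0.4, ("hot",): 0.6},
    "V": {("yes", "cold"): 0.8, ("yes", "hot"): 0.1,
          ("no", "cold"): 0.2, ("no", "hot"): 0.9},
    "F": {("sick", "yes"): 0.7, ("sick", "no"): 0.0,
          ("¬sick", "yes"): 0.3, ("¬sick", "no"): 1.0},
}

# p(f = sick, v = yes, t = cold) = 0.7 · 0.8 · 0.4 = 0.224
print(joint_prob({"T": "cold", "V": "yes", "F": "sick"}, parents, cpt))
```

Because each factor only looks at a node and its parents, a sparse graph keeps every CPT small, which is the efficiency argument made above.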

[Figure 11.3: Example of a cyanobacteria bloom: (a) cyanobacteria seen under a microscope; (b) a cyanobacteria bloom in a rural environment. (Photos: NASA and USGS)]

[Figure 11.4: Bayesian network for the cyanobacteria example: (a) semantic representation with nodes Temperature, Fertilizer, Cyanobacteria, Fish mortality, and Water color; (b) the same network represented with random variables T, F, C, M, and W.]

Cyanobacteria example. We explore the example illustrated in figure 11.3 of cyanobacteria blooms that can occur in lakes, rivers, and estuaries. Cyanobacteria blooms are typically caused by the use of fertilizers that wash into a water body, combined with warm temperatures that allow for bacteria reproduction. Cyanobacteria blooms can cause a change in the water color and can cause fish or marine-life mortality. We employ a Bayesian network to describe the joint probability for a set of random variables consisting of the temperature T, the use of fertilizer F, the presence of cyanobacteria in water C, fish mortality M, and the water color W. In figure 11.4,