Chapman–Kolmogorov equation
Equation from probability theory

In mathematics, specifically in the theory of Markovian stochastic processes in probability theory, the Chapman–Kolmogorov equation (CKE) is an identity relating the joint probability distributions of different sets of coordinates on a stochastic process. The equation was derived independently by the British mathematician Sydney Chapman and the Russian mathematician Andrey Kolmogorov. The CKE is prominently used in recent variational Bayesian methods.

Mathematical description

Suppose that { fi } is an indexed collection of random variables, that is, a stochastic process. Let

$p_{i_1,\ldots,i_n}(f_1,\ldots,f_n)$

be the joint probability density function of the values of the random variables f1 to fn. Then, the Chapman–Kolmogorov equation is

$p_{i_1,\ldots,i_{n-1}}(f_1,\ldots,f_{n-1}) = \int_{-\infty}^{\infty} p_{i_1,\ldots,i_n}(f_1,\ldots,f_n)\,df_n$

i.e. a straightforward marginalization over the nuisance variable.

(Note that nothing yet has been assumed about the temporal (or any other) ordering of the random variables—the above equation applies equally to the marginalization of any of them.)
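The marginalization above can be made concrete for a discrete joint distribution. The following is a minimal sketch, not from the article; the joint pmf values are invented for illustration:

```python
import numpy as np

# Hypothetical joint pmf over two discrete variables f1 and f2,
# stored so that joint[a, b] = p(f1 = a, f2 = b).
joint = np.array([
    [0.10, 0.20],
    [0.30, 0.40],
])

# The Chapman–Kolmogorov identity here is plain marginalization:
# p(f1) is obtained by summing (integrating) out the nuisance variable f2.
p_f1 = joint.sum(axis=1)

# Nothing depends on ordering: f2's marginal is obtained by summing out f1.
p_f2 = joint.sum(axis=0)
```

In the continuous case the sums become the integrals shown above.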

In terms of Markov kernels

If we consider the Markov kernels induced by the transitions of a Markov process, the Chapman–Kolmogorov equation can be seen as giving a way of composing kernels, generalizing the way stochastic matrices compose. Given a measurable space $(X,\mathcal{A})$ and a Markov kernel $k\colon (X,\mathcal{A})\to (X,\mathcal{A})$, the two-step transition kernel $k^2\colon (X,\mathcal{A})\to (X,\mathcal{A})$ is given by

$k^2(A \mid x) = \int_X k(A \mid x')\,k(dx' \mid x)$

for all $x \in X$ and $A \in \mathcal{A}$. One can interpret this as a sum, over all intermediate states, of pairs of independent probabilistic transitions.
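When the state space $X$ is finite, the kernel is a row-stochastic matrix and the integral over the intermediate state becomes a finite sum, i.e. matrix multiplication. A minimal sketch with invented transition probabilities:

```python
import numpy as np

# Hypothetical finite state space X = {0, 1, 2}; the kernel k is a
# row-stochastic matrix with K[x, x'] = k({x'} | x).
K = np.array([
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
    [0.0, 0.4, 0.6],
])

# k^2(A | x) = sum over x' of k(A | x') k(x' | x): the sum over the
# intermediate state x' is exactly a matrix product.
K2 = K @ K
```

Each entry of `K2` accumulates the probabilities of all two-step paths through an intermediate state.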

More generally, given measurable spaces $(X,\mathcal{A})$, $(Y,\mathcal{B})$ and $(Z,\mathcal{C})$, and Markov kernels $k\colon (X,\mathcal{A})\to (Y,\mathcal{B})$ and $h\colon (Y,\mathcal{B})\to (Z,\mathcal{C})$, we get a composite kernel $h\circ k\colon (X,\mathcal{A})\to (Z,\mathcal{C})$ by

$(h \circ k)(C \mid x) = \int_Y h(C \mid y)\,k(dy \mid x)$

for all $x \in X$ and $C \in \mathcal{C}$.

Because of this, Markov kernels, like stochastic matrices, form a category.
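In the finite case the general composition works even between state spaces of different sizes, just as non-square stochastic matrices multiply. A sketch with hypothetical spaces and kernels:

```python
import numpy as np

# Hypothetical spaces: X has 2 states, Y has 3, Z has 2.
# Row-stochastic matrices represent the kernels:
# K[x, y] = k({y} | x) and H[y, z] = h({z} | y).
K = np.array([
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
])
H = np.array([
    [0.9, 0.1],
    [0.5, 0.5],
    [0.2, 0.8],
])

# (h ∘ k)({z} | x) = sum over y of h({z} | y) k({y} | x),
# i.e. the product K @ H in the row-stochastic convention.
HK = K @ H
```

The composite is again row-stochastic, which is the closure property that makes Markov kernels the morphisms of a category.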

Application to time-dilated Markov chains

When the stochastic process under consideration is Markovian, the Chapman–Kolmogorov equation is equivalent to an identity on transition densities. In the Markov chain setting, one assumes that $i_1 < \cdots < i_n$. Then, because of the Markov property,

$p_{i_1,\ldots,i_n}(f_1,\ldots,f_n) = p_{i_1}(f_1)\,p_{i_2;i_1}(f_2 \mid f_1)\cdots p_{i_n;i_{n-1}}(f_n \mid f_{n-1}),$

where the conditional probability $p_{i;j}(f_i \mid f_j)$ is the transition probability from time $j$ to time $i$, with $i > j$. So, the Chapman–Kolmogorov equation takes the form

$p_{i_3;i_1}(f_3 \mid f_1) = \int_{-\infty}^{\infty} p_{i_3;i_2}(f_3 \mid f_2)\,p_{i_2;i_1}(f_2 \mid f_1)\,df_2.$

Informally, this says that the probability of going from state 1 to state 3 can be found from the probabilities of going from 1 to an intermediate state 2 and then from 2 to 3, by adding up over all the possible intermediate states 2.
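This identity can be checked numerically for a concrete family of transition densities. The sketch below uses Gaussian transition densities of a Brownian-motion-like process, $p_{t;s}(f' \mid f) = \mathcal{N}(f';\, f,\, t-s)$, with invented times and values; the integral over the intermediate value is approximated on a grid:

```python
import numpy as np

def gauss(x, mean, var):
    """Gaussian density with the given mean and variance."""
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Hypothetical endpoint values and times (chosen arbitrarily).
f1, f3 = 0.3, 1.1
t1, t2, t3 = 0.0, 0.5, 2.0

# One-step transition density from time t1 straight to t3.
lhs = gauss(f3, f1, t3 - t1)

# Chapman–Kolmogorov: integrate the product of the two transition
# densities over the intermediate value f2 (Riemann sum on a fine grid).
f2 = np.linspace(-20.0, 20.0, 200001)
dx = f2[1] - f2[0]
rhs = np.sum(gauss(f3, f2, t3 - t2) * gauss(f2, f1, t2 - t1)) * dx
```

For Gaussians the identity holds exactly, since the variances of the two steps add.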

When the probability distribution on the state space of a Markov chain is discrete and the Markov chain is homogeneous, the Chapman–Kolmogorov equations can be expressed in terms of (possibly infinite-dimensional) matrix multiplication, thus:

$P(t+s) = P(t)\,P(s)$

where P(t) is the transition matrix of jump t, i.e., P(t) is the matrix such that entry (i,j) contains the probability of the chain moving from state i to state j in t steps.

As a corollary, it follows that to calculate the transition matrix of jump t, it is sufficient to raise the transition matrix of jump one to the power of t, that is

$P(t) = P^t.$
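Both matrix identities are easy to verify for a small homogeneous chain. A minimal sketch with a made-up two-state transition matrix:

```python
import numpy as np

# Hypothetical 2-state homogeneous chain with one-step transition matrix P.
P = np.array([
    [0.9, 0.1],
    [0.4, 0.6],
])

# The t-step transition matrix P(t) is the t-th matrix power of P.
P2 = np.linalg.matrix_power(P, 2)
P3 = np.linalg.matrix_power(P, 3)
P5 = np.linalg.matrix_power(P, 5)
```

Chapman–Kolmogorov in matrix form then reads `P5 == P2 @ P3`, i.e. $P(2+3) = P(2)P(3)$.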

The differential form of the Chapman–Kolmogorov equation is known as a master equation.
