Itô's lemma

Identity in Itô calculus analogous to the chain rule. This article is about a result in stochastic calculus. For the result in group theory, see Itô's theorem.

In mathematics, Itô's lemma or Itô's formula is an identity used in Itô calculus to find the differential of a time-dependent function of a stochastic process. It serves as the stochastic calculus counterpart of the chain rule. It can be heuristically derived by forming the Taylor series expansion of the function up to its second derivatives and retaining terms up to first order in the time increment and second order in the Wiener process increment. The lemma is widely employed in mathematical finance, and its best known application is in the derivation of the Black–Scholes equation for option values.

This result was discovered by Japanese mathematician Kiyoshi Itô in 1951.

Motivation

Suppose we are given the stochastic differential equation {\displaystyle dX_{t}=\mu _{t}\ dt+\sigma _{t}\ dB_{t},} where B_t is a Wiener process and the functions {\displaystyle \mu _{t},\sigma _{t}} are deterministic (not stochastic) functions of time. In general, it is not possible to write a solution {\displaystyle X_{t}} directly in terms of {\displaystyle B_{t}.} However, we can formally write an integral solution {\displaystyle X_{t}=\int _{0}^{t}\mu _{s}\ ds+\int _{0}^{t}\sigma _{s}\ dB_{s}.}

This expression lets us easily read off the mean and variance of {\displaystyle X_{t}} (which, being Gaussian, is determined by these two moments). First, notice that every increment {\displaystyle \mathrm {d} B_{t}} individually has mean 0, so the expected value of {\displaystyle X_{t}} is simply the integral of the drift function: {\displaystyle \mathrm {E} [X_{t}]=\int _{0}^{t}\mu _{s}\ ds.}

Similarly, because the increments {\displaystyle dB_{s}} are independent of one another and each has variance {\displaystyle ds}, the variance of {\displaystyle X_{t}} is simply the integral of the variance of each infinitesimal step in the random walk: {\displaystyle \mathrm {Var} [X_{t}]=\int _{0}^{t}\sigma _{s}^{2}\ ds.}
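These two formulas can be checked numerically. The sketch below (the drift and volatility functions are illustrative choices, not from the article) simulates the SDE with a simple Euler scheme and compares the Monte Carlo mean and variance of X_T with the two integrals:

```python
import random

def simulate_X(mu, sigma, T=1.0, n_steps=100, n_paths=10_000, seed=0):
    """Euler scheme for dX = mu(t) dt + sigma(t) dB with X_0 = 0.
    Returns the sample mean and variance of X_T over n_paths paths."""
    rng = random.Random(seed)
    dt = T / n_steps
    finals = []
    for _ in range(n_paths):
        x = 0.0
        for k in range(n_steps):
            t = k * dt
            # dB over one step is Gaussian with mean 0 and variance dt
            x += mu(t) * dt + sigma(t) * rng.gauss(0.0, dt ** 0.5)
        finals.append(x)
    m = sum(finals) / n_paths
    v = sum((xf - m) ** 2 for xf in finals) / n_paths
    return m, v

# Illustrative choice mu_t = 2t, sigma_t = 1, so the formulas predict
# E[X_1] = ∫_0^1 2s ds = 1 and Var[X_1] = ∫_0^1 1 ds = 1.
mean, var = simulate_X(mu=lambda t: 2.0 * t, sigma=lambda t: 1.0)
```

Both sample moments should land near 1, up to Monte Carlo noise and the Euler discretization error.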

However, sometimes we are faced with a stochastic differential equation for a more complex process {\displaystyle Y_{t},} in which the process appears on both sides of the differential equation. That is, say {\displaystyle dY_{t}=a_{1}(Y_{t},t)\ dt+a_{2}(Y_{t},t)\ dB_{t},} for some functions {\displaystyle a_{1}} and {\displaystyle a_{2}.} In this case, we cannot immediately write a formal solution as we did for the simpler case above. Instead, we hope to write the process {\displaystyle Y_{t}} as a function of a simpler process {\displaystyle X_{t}} taking the form above. That is, we want to identify three functions {\displaystyle f(t,x),\mu _{t},} and {\displaystyle \sigma _{t},} such that {\displaystyle Y_{t}=f(t,X_{t})} and {\displaystyle dX_{t}=\mu _{t}\ dt+\sigma _{t}\ dB_{t}.} In practice, Itô's lemma is used to find this transformation. Finally, once we have transformed the problem into the simpler type of problem, we can determine the mean and higher moments of the process.

Derivation

We derive Itô's lemma by expanding a Taylor series and applying the rules of stochastic calculus.

Suppose X t {\displaystyle X_{t}} is an Itô drift-diffusion process that satisfies the stochastic differential equation

{\displaystyle dX_{t}=\mu _{t}\,dt+\sigma _{t}\,dB_{t},}

where Bt is a Wiener process.

If f(t,x) is a twice-differentiable scalar function, its expansion in a Taylor series is

{\displaystyle {\frac {\Delta f(t)}{dt}}\,dt=f(t+dt,x)-f(t,x)={\frac {\partial f}{\partial t}}\,dt+{\frac {1}{2}}{\frac {\partial ^{2}f}{\partial t^{2}}}\,(dt)^{2}+\cdots }
{\displaystyle {\frac {\Delta f(x)}{dx}}\,dx=f(t,x+dx)-f(t,x)={\frac {\partial f}{\partial x}}\,dx+{\frac {1}{2}}{\frac {\partial ^{2}f}{\partial x^{2}}}\,(dx)^{2}+\cdots }

Then use the total derivative and the definition of the partial derivative {\displaystyle f_{y}=\lim _{dy\to 0}{\frac {\Delta f(y)}{dy}}}:

{\displaystyle df=f_{t}dt+f_{x}dx=\lim _{dx\to 0,dt\to 0}{\frac {\partial f}{\partial t}}\,dt+{\frac {1}{2}}{\frac {\partial ^{2}f}{\partial t^{2}}}\,(dt)^{2}+\cdots +{\frac {\partial f}{\partial x}}\,dx+{\frac {1}{2}}{\frac {\partial ^{2}f}{\partial x^{2}}}\,(dx)^{2}+\cdots .}

Substituting {\displaystyle x=X_{t}} and therefore {\displaystyle dx=dX_{t}=\mu _{t}\,dt+\sigma _{t}\,dB_{t}}, we get

{\displaystyle df=\lim _{dB_{t}\to 0,dt\to 0}{\frac {\partial f}{\partial t}}\,dt+{\frac {1}{2}}{\frac {\partial ^{2}f}{\partial t^{2}}}\,(dt)^{2}+\cdots +{\frac {\partial f}{\partial x}}(\mu _{t}\,dt+\sigma _{t}\,dB_{t})+{\frac {1}{2}}{\frac {\partial ^{2}f}{\partial x^{2}}}\left(\mu _{t}^{2}\,(dt)^{2}+2\mu _{t}\sigma _{t}\,dt\,dB_{t}+\sigma _{t}^{2}\,(dB_{t})^{2}\right)+\cdots .}

In the limit {\displaystyle dt\to 0}, the terms {\displaystyle (dt)^{2}} and {\displaystyle dt\,dB_{t}} tend to zero faster than {\displaystyle dt}. {\displaystyle (dB_{t})^{2}} is {\displaystyle O(dt)} (due to the quadratic variation of a Wiener process, which says {\displaystyle B_{t}^{2}=O(t)}), so setting the {\displaystyle (dt)^{2},dt\,dB_{t}} and {\displaystyle (dx)^{3}} terms to zero, substituting {\displaystyle dt} for {\displaystyle (dB_{t})^{2}}, and then collecting the {\displaystyle dt} terms, we obtain

{\displaystyle df=\lim _{dt\to 0}\left({\frac {\partial f}{\partial t}}+\mu _{t}{\frac {\partial f}{\partial x}}+{\frac {\sigma _{t}^{2}}{2}}{\frac {\partial ^{2}f}{\partial x^{2}}}\right)dt+\sigma _{t}{\frac {\partial f}{\partial x}}\,dB_{t}}

as required.

Alternatively,

{\displaystyle df=\lim _{dt\to 0}\left({\frac {\partial f}{\partial t}}+{\frac {\sigma _{t}^{2}}{2}}{\frac {\partial ^{2}f}{\partial x^{2}}}\right)dt+{\frac {\partial f}{\partial x}}\,dX_{t}}
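The derivation above can be tested pathwise. A minimal sketch (the choice f(x) = x² applied to a plain Brownian motion, so μ = 0 and σ = 1, is an illustrative example, not from the article): accumulating the Itô differential df = dt + 2B dB along one simulated path should reproduce f(B_T) = B_T² up to discretization noise.

```python
import random

rng = random.Random(42)
T, n = 1.0, 100_000
dt = T / n

b = 0.0        # the Brownian path B_t
ito_sum = 0.0  # accumulates df = 1·dt + 2 B dB (f' = 2x, f'' = 2, mu = 0, sigma = 1)
for _ in range(n):
    db = rng.gauss(0.0, dt ** 0.5)
    ito_sum += dt + 2.0 * b * db
    b += db

# Itô's lemma says the accumulated differentials reproduce f(B_T) - f(B_0)
direct = b * b
```

The discrepancy between `direct` and `ito_sum` is exactly the sum of (ΔB² − Δt) terms, which vanishes as the step size shrinks; this is the quadratic-variation substitution (dB)² → dt in action.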

Geometric intuition

When {\displaystyle X_{t+dt}} is a Gaussian random variable, {\displaystyle f(X_{t+dt})} is also approximately a Gaussian random variable, but its mean {\displaystyle E[f(X_{t+dt})]} differs from {\displaystyle f(E[X_{t+dt}])} by a term proportional to {\displaystyle f''(E[X_{t+dt}])} and to the variance of {\displaystyle X_{t+dt}}.

Suppose we know that {\displaystyle X_{t},X_{t+dt}} are two jointly Gaussian random variables, and {\displaystyle f} is nonlinear but has a continuous second derivative. Then in general neither of {\displaystyle f(X_{t}),f(X_{t+dt})} is Gaussian, and their joint distribution is also not Gaussian. However, since {\displaystyle X_{t+dt}\mid X_{t}} is Gaussian, we might still find that {\displaystyle f(X_{t+dt})\mid f(X_{t})} is Gaussian. This is not true when {\displaystyle dt} is finite, but it becomes true in the limit as {\displaystyle dt} becomes infinitesimal.

The key idea is that {\displaystyle X_{t+dt}=X_{t}+\mu _{t}\,dt+dW_{t}} has a deterministic part and a noisy part. When {\displaystyle f} is nonlinear, the noisy part has a deterministic contribution. If {\displaystyle f} is convex, then the deterministic contribution is positive (by Jensen's inequality).

To find out how large the contribution is, we write {\displaystyle X_{t+dt}=X_{t}+\mu _{t}\,dt+\sigma _{t}{\sqrt {dt}}\,z}, where {\displaystyle z} is a standard Gaussian, then perform a Taylor expansion: {\displaystyle {\begin{aligned}f(X_{t+dt})&=f(X_{t})+f'(X_{t})\mu _{t}\,dt+f'(X_{t})\sigma _{t}{\sqrt {dt}}\,z+{\frac {1}{2}}f''(X_{t})(\sigma _{t}^{2}z^{2}\,dt+2\mu _{t}\sigma _{t}z\,dt^{3/2}+\mu _{t}^{2}dt^{2})+o(dt)\\&=\left(f(X_{t})+f'(X_{t})\mu _{t}\,dt+{\frac {1}{2}}f''(X_{t})\sigma _{t}^{2}\,dt+o(dt)\right)+\left(f'(X_{t})\sigma _{t}{\sqrt {dt}}\,z+{\frac {1}{2}}f''(X_{t})\sigma _{t}^{2}(z^{2}-1)\,dt+o(dt)\right)\end{aligned}}} We have split it into two parts: a deterministic part and a random part with mean zero. The random part is non-Gaussian, but the non-Gaussian parts decay faster than the Gaussian part, and in the {\displaystyle dt\to 0} limit only the Gaussian part remains. The deterministic part contains the expected {\displaystyle f(X_{t})+f'(X_{t})\mu _{t}\,dt}, but also a part contributed by the convexity: {\displaystyle {\frac {1}{2}}f''(X_{t})\sigma _{t}^{2}\,dt}.

To understand why there should be a contribution due to convexity, consider the simplest case of a geometric Brownian walk (of the stock market): {\displaystyle S_{t+dt}=S_{t}(1+dB_{t})}. In other words, {\displaystyle d(\ln S_{t})=dB_{t}}. Let {\displaystyle X_{t}=\ln S_{t}}; then {\displaystyle S_{t}=e^{X_{t}}}, and {\displaystyle X_{t}} is a Brownian walk. However, although the expectation of {\displaystyle X_{t}} remains constant, the expectation of {\displaystyle S_{t}} grows. Intuitively, this is because the downside is limited at zero, but the upside is unlimited. That is, while {\displaystyle X_{t}} is normally distributed, {\displaystyle S_{t}} is log-normally distributed.
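The convexity gain is easy to see numerically. In the example above, at t = 1 we have S_1 = e^{B_1} with B_1 standard normal, so E[S_1] = e^{1/2} ≈ 1.6487 even though e^{E[B_1]} = 1. A small Monte Carlo sketch:

```python
import math
import random

rng = random.Random(0)
n = 200_000
# S_1 = e^{B_1}, with B_1 ~ N(0, 1)
samples = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(n)]

mc_mean = sum(samples) / n   # should approach e^{1/2}: the convexity (Jensen) gain
naive = math.exp(0.0)        # plugging in E[B_1] = 0 gives only 1
```

The gap between `mc_mean` and `naive` is exactly the e^{σ²t/2} factor that Itô's correction term accounts for.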

Mathematical formulation of Itô's lemma

In the following subsections we discuss versions of Itô's lemma for different types of stochastic processes.

Itô drift-diffusion processes (due to: Kunita–Watanabe)

In its simplest form, Itô's lemma states the following: for an Itô drift-diffusion process

{\displaystyle dX_{t}=\mu _{t}\,dt+\sigma _{t}\,dB_{t}}

and any twice differentiable scalar function f(t,x) of two real variables t and x, one has

{\displaystyle df(t,X_{t})=\left({\frac {\partial f}{\partial t}}+\mu _{t}{\frac {\partial f}{\partial x}}+{\frac {\sigma _{t}^{2}}{2}}{\frac {\partial ^{2}f}{\partial x^{2}}}\right)dt+\sigma _{t}{\frac {\partial f}{\partial x}}\,dB_{t}.}

This immediately implies that f(t,Xt) is itself an Itô drift-diffusion process.

In higher dimensions, if {\displaystyle \mathbf {X} _{t}=(X_{t}^{1},X_{t}^{2},\ldots ,X_{t}^{n})^{T}} is a vector of Itô processes such that

{\displaystyle d\mathbf {X} _{t}={\boldsymbol {\mu }}_{t}\,dt+\mathbf {G} _{t}\,d\mathbf {B} _{t}}

for a vector {\displaystyle {\boldsymbol {\mu }}_{t}} and matrix {\displaystyle \mathbf {G} _{t}}, Itô's lemma then states that

{\displaystyle {\begin{aligned}df(t,\mathbf {X} _{t})&={\frac {\partial f}{\partial t}}\,dt+\left(\nabla _{\mathbf {X} }f\right)^{T}\,d\mathbf {X} _{t}+{\frac {1}{2}}\left(d\mathbf {X} _{t}\right)^{T}\left(H_{\mathbf {X} }f\right)\,d\mathbf {X} _{t}\\&=\left\{{\frac {\partial f}{\partial t}}+\left(\nabla _{\mathbf {X} }f\right)^{T}{\boldsymbol {\mu }}_{t}+{\frac {1}{2}}\operatorname {Tr} \left[\mathbf {G} _{t}^{T}\left(H_{\mathbf {X} }f\right)\mathbf {G} _{t}\right]\right\}\,dt+\left(\nabla _{\mathbf {X} }f\right)^{T}\mathbf {G} _{t}\,d\mathbf {B} _{t}\end{aligned}}}

where X f {\displaystyle \nabla _{\mathbf {X} }f} is the gradient of f w.r.t. X, HX f is the Hessian matrix of f w.r.t. X, and Tr is the trace operator.

Poisson jump processes

We may also define functions on discontinuous stochastic processes.

Let h be the jump intensity. The Poisson process model for jumps is that the probability of one jump in the interval [t, t + Δt] is hΔt plus higher order terms. h could be a constant, a deterministic function of time, or a stochastic process. The survival probability ps(t) is the probability that no jump has occurred in the interval [0, t]. The change in the survival probability is

{\displaystyle dp_{s}(t)=-p_{s}(t)h(t)\,dt.}

So

{\displaystyle p_{s}(t)=\exp \left(-\int _{0}^{t}h(u)\,du\right).}
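This survival formula can be checked by simulation. A sketch, with an arbitrary illustrative intensity h(t) = 1 + t (not from the article): in each small step a jump occurs with probability h(t)dt, and the fraction of paths that never jump by time T should match exp(-∫h).

```python
import math
import random

rng = random.Random(1)
h = lambda t: 1.0 + t            # illustrative time-dependent jump intensity
T, n_steps, n_paths = 1.0, 400, 20_000
dt = T / n_steps

survived = 0
for _ in range(n_paths):
    alive = True
    for k in range(n_steps):
        # P(one jump in [t, t + dt)) ≈ h(t) dt, higher-order terms dropped
        if rng.random() < h(k * dt) * dt:
            alive = False
            break
    survived += alive

p_mc = survived / n_paths
p_exact = math.exp(-1.5)         # exp(-∫_0^1 (1 + u) du) = e^{-3/2}
```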

Let S(t) be a discontinuous stochastic process. Write {\displaystyle S(t^{-})} for the value of S as we approach t from the left. Write {\displaystyle d_{j}S(t)} for the non-infinitesimal change in S(t) as a result of a jump. Then

{\displaystyle d_{j}S(t)=\lim _{\Delta t\to 0}(S(t+\Delta t)-S(t^{-}))}

Let z be the magnitude of the jump and let {\displaystyle \eta (S(t^{-}),z)} be the distribution of z. The expected magnitude of the jump is

{\displaystyle E[d_{j}S(t)]=h(S(t^{-}))\,dt\int _{z}z\eta (S(t^{-}),z)\,dz.}

Define {\displaystyle dJ_{S}(t)}, a compensated process and martingale, as

{\displaystyle dJ_{S}(t)=d_{j}S(t)-E[d_{j}S(t)]=S(t)-S(t^{-})-\left(h(S(t^{-}))\int _{z}z\eta \left(S(t^{-}),z\right)\,dz\right)\,dt.}

Then

{\displaystyle d_{j}S(t)=E[d_{j}S(t)]+dJ_{S}(t)=h(S(t^{-}))\left(\int _{z}z\eta (S(t^{-}),z)\,dz\right)dt+dJ_{S}(t).}

Consider a function {\displaystyle g(S(t),t)} of the jump process dS(t). If S(t) jumps by Δs then g(t) jumps by Δg. Δg is drawn from the distribution {\displaystyle \eta _{g}()}, which may depend on {\displaystyle g(t^{-})}, dg and {\displaystyle S(t^{-})}. The jump part of {\displaystyle g} is

{\displaystyle g(t)-g(t^{-})=h(t)\,dt\int _{\Delta g}\,\Delta g\eta _{g}(\cdot )\,d\Delta g+dJ_{g}(t).}

If {\displaystyle S} contains drift, diffusion and jump parts, then Itô's lemma for {\displaystyle g(S(t),t)} is

{\displaystyle dg(t)=\left({\frac {\partial g}{\partial t}}+\mu {\frac {\partial g}{\partial S}}+{\frac {\sigma ^{2}}{2}}{\frac {\partial ^{2}g}{\partial S^{2}}}+h(t)\int _{\Delta g}\left(\Delta g\eta _{g}(\cdot )\,d{\Delta }g\right)\,\right)dt+{\frac {\partial g}{\partial S}}\sigma \,dW(t)+dJ_{g}(t).}

Itô's lemma for a process which is the sum of a drift-diffusion process and a jump process is just the sum of Itô's lemma applied to the individual parts.

Non-continuous semimartingales

Itô's lemma can also be applied to general d-dimensional semimartingales, which need not be continuous. In general, a semimartingale is a càdlàg process, and an additional term needs to be added to the formula to ensure that the jumps of the process are correctly given by Itô's lemma. For any càdlàg process Yt, the left limit in t is denoted by Yt−, which is a left-continuous process. The jumps are written as ΔYt = Yt − Yt−. Then, Itô's lemma states that if X = (X^1, X^2, ..., X^d) is a d-dimensional semimartingale and f is a twice continuously differentiable real-valued function on R^d, then f(X) is a semimartingale, and

{\displaystyle {\begin{aligned}f(X_{t})&=f(X_{0})+\sum _{i=1}^{d}\int _{0}^{t}f_{i}(X_{s-})\,dX_{s}^{i}+{\frac {1}{2}}\sum _{i,j=1}^{d}\int _{0}^{t}f_{i,j}(X_{s-})\,d[X^{i},X^{j}]_{s}\\&\qquad +\sum _{s\leq t}\left(\Delta f(X_{s})-\sum _{i=1}^{d}f_{i}(X_{s-})\,\Delta X_{s}^{i}-{\frac {1}{2}}\sum _{i,j=1}^{d}f_{i,j}(X_{s-})\,\Delta X_{s}^{i}\,\Delta X_{s}^{j}\right).\end{aligned}}}

This differs from the formula for continuous semi-martingales by the additional term summing over the jumps of X, which ensures that the jump of the right hand side at time t is Δf(Xt).

Multiple non-continuous jump processes

There is also a version of this for a function f that is twice continuously differentiable in space and once in time, evaluated at (potentially different) non-continuous semimartingales, which may be written as follows:

{\displaystyle {\begin{aligned}f(t,X_{t}^{1},\ldots ,X_{t}^{d})={}&f(0,X_{0}^{1},\ldots ,X_{0}^{d})+\int _{0}^{t}{\dot {f}}({s_{-}},X_{s_{-}}^{1},\ldots ,X_{s_{-}}^{d})d{s}\\&{}+\sum _{i=1}^{d}\int _{0}^{t}f_{i}({s_{-}},X_{s_{-}}^{1},\ldots ,X_{s_{-}}^{d})\,dX_{s}^{(c,i)}\\&{}+{\frac {1}{2}}\sum _{i_{1},\ldots ,i_{d}=1}^{d}\int _{0}^{t}f_{i_{1},\ldots ,i_{d}}({s_{-}},X_{s_{-}}^{1},\ldots ,X_{s_{-}}^{d})\,dX_{s}^{(c,i_{1})}\cdots X_{s}^{(c,i_{d})}\\&{}+\sum _{0<s\leq t}\left[f(s,X_{s}^{1},\ldots ,X_{s}^{d})-f(s_{-},X_{s_{-}}^{1},\ldots ,X_{s_{-}}^{d})\right]\end{aligned}}}

where {\displaystyle X^{c,i}} denotes the continuous part of the ith semimartingale.

Examples

Geometric Brownian motion

A process S is said to follow a geometric Brownian motion with constant volatility σ and constant drift μ if it satisfies the stochastic differential equation {\displaystyle dS_{t}=\sigma S_{t}\,dB_{t}+\mu S_{t}\,dt}, for a Brownian motion B. Applying Itô's lemma with {\displaystyle f(S_{t})=\log(S_{t})} gives

{\displaystyle {\begin{aligned}df&=f^{\prime }(S_{t})\,dS_{t}+{\frac {1}{2}}f^{\prime \prime }(S_{t})(dS_{t})^{2}\\&={\frac {1}{S_{t}}}\,dS_{t}+{\frac {1}{2}}(-S_{t}^{-2})(S_{t}^{2}\sigma ^{2}\,dt)\\&={\frac {1}{S_{t}}}\left(\sigma S_{t}\,dB_{t}+\mu S_{t}\,dt\right)-{\frac {1}{2}}\sigma ^{2}\,dt\\&=\sigma \,dB_{t}+\left(\mu -{\tfrac {\sigma ^{2}}{2}}\right)\,dt.\end{aligned}}}

It follows that

{\displaystyle \log(S_{t})=\log(S_{0})+\sigma B_{t}+\left(\mu -{\tfrac {\sigma ^{2}}{2}}\right)t,}

exponentiating gives the expression for S,

{\displaystyle S_{t}=S_{0}\exp \left(\sigma B_{t}+\left(\mu -{\tfrac {\sigma ^{2}}{2}}\right)t\right).}
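The closed-form solution can be checked against a direct Euler-Maruyama discretization of dS = S(σ dB + μ dt) driven by the same Brownian increments (the parameter values below are illustrative):

```python
import math
import random

rng = random.Random(7)
mu, sigma, s0 = 0.05, 0.2, 100.0
T, n = 1.0, 100_000
dt = T / n

s_euler, b = s0, 0.0
for _ in range(n):
    db = rng.gauss(0.0, dt ** 0.5)
    s_euler += s_euler * (mu * dt + sigma * db)   # dS = S (mu dt + sigma dB)
    b += db                                       # accumulate B_T for the exact formula

# Itô's lemma's closed form, evaluated on the same Brownian path
s_exact = s0 * math.exp(sigma * b + (mu - sigma ** 2 / 2) * T)
```

The two values agree pathwise up to discretization error; note that the naive chain rule would instead predict S_0 exp(σB_T + μT), which drifts systematically high.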

The correction term of −⁠σ²/2⁠ corresponds to the difference between the median and mean of the log-normal distribution, or equivalently for this distribution, the geometric mean and arithmetic mean, with the median (geometric mean) being lower. This is due to the AM–GM inequality, and corresponds to the logarithm being concave (or convex upwards), so the correction term can accordingly be interpreted as a convexity correction. This is an infinitesimal version of the fact that the annualized return is less than the average return, with the difference proportional to the variance. See geometric moments of the log-normal distribution for further discussion.

The same factor of ⁠σ²/2⁠ appears in the d1 and d2 auxiliary variables of the Black–Scholes formula, and can be interpreted as a consequence of Itô's lemma.

Doléans-Dade exponential

The Doléans-Dade exponential (or stochastic exponential) of a continuous semimartingale X can be defined as the solution to the SDE dY = Y dX with initial condition Y0 = 1. It is sometimes denoted by Ɛ(X). Applying Itô's lemma with f(Y) = log(Y) gives

{\displaystyle {\begin{aligned}d\log(Y)&={\frac {1}{Y}}\,dY-{\frac {1}{2Y^{2}}}\,d[Y]\\&=dX-{\tfrac {1}{2}}\,d[X].\end{aligned}}}

Exponentiating gives the solution

{\displaystyle Y_{t}=\exp \left(X_{t}-X_{0}-{\tfrac {1}{2}}[X]_{t}\right).}

Black–Scholes formula

Itô's lemma can be used to derive the Black–Scholes equation for an option. Suppose a stock price follows a geometric Brownian motion given by the stochastic differential equation dS = S(σdB + μ dt). Then, if the value of an option at time t is f(t, St), Itô's lemma gives

{\displaystyle df(t,S_{t})=\left({\frac {\partial f}{\partial t}}+{\frac {1}{2}}\left(S_{t}\sigma \right)^{2}{\frac {\partial ^{2}f}{\partial S^{2}}}\right)\,dt+{\frac {\partial f}{\partial S}}\,dS_{t}.}

The term (∂f/∂S) dS represents the change in value in time dt of the trading strategy consisting of holding an amount ∂f/∂S of the stock. If this trading strategy is followed, and any cash held is assumed to grow at the risk-free rate r, then the total value V of this portfolio satisfies the SDE

{\displaystyle dV_{t}=r\left(V_{t}-{\frac {\partial f}{\partial S}}S_{t}\right)\,dt+{\frac {\partial f}{\partial S}}\,dS_{t}.}

This strategy replicates the option if V = f(t,S). Combining these equations gives the celebrated Black–Scholes equation

{\displaystyle {\frac {\partial f}{\partial t}}+{\frac {\sigma ^{2}S^{2}}{2}}{\frac {\partial ^{2}f}{\partial S^{2}}}+rS{\frac {\partial f}{\partial S}}-rf=0.}
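One can verify numerically that the standard Black–Scholes call price satisfies this PDE. The sketch below (strike, rate, volatility, and evaluation point are arbitrary illustrative choices) builds the call price as a function of spot S and time-to-expiry τ, approximates the partial derivatives by finite differences, and checks that the PDE residual is essentially zero. Since f(t, S) uses calendar time t, we have ∂f/∂t = −∂f/∂τ.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call(S, tau, K=100.0, r=0.05, sigma=0.2):
    """Black-Scholes European call price as a function of spot S and time-to-expiry tau."""
    d1 = (math.log(S / K) + (r + sigma ** 2 / 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

S, tau, r, sigma = 110.0, 0.5, 0.05, 0.2
h = 1e-3
f_t  = -(call(S, tau + h) - call(S, tau - h)) / (2 * h)   # ∂f/∂t = -∂f/∂tau
f_s  = (call(S + h, tau) - call(S - h, tau)) / (2 * h)     # delta
f_ss = (call(S + h, tau) - 2 * call(S, tau) + call(S - h, tau)) / h ** 2  # gamma

residual = f_t + 0.5 * sigma ** 2 * S ** 2 * f_ss + r * S * f_s - r * call(S, tau)
```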

Product rule for Itô processes

Let {\displaystyle \mathbf {X} _{t}} be a two-dimensional Itô process with SDE:

{\displaystyle d\mathbf {X} _{t}=d{\begin{pmatrix}X_{t}^{1}\\X_{t}^{2}\end{pmatrix}}={\begin{pmatrix}\mu _{t}^{1}\\\mu _{t}^{2}\end{pmatrix}}dt+{\begin{pmatrix}\sigma _{t}^{1}\\\sigma _{t}^{2}\end{pmatrix}}\,dB_{t}}

Then we can use the multi-dimensional form of Itô's lemma to find an expression for {\displaystyle d(X_{t}^{1}X_{t}^{2})}.

We have {\displaystyle \mu _{t}={\begin{pmatrix}\mu _{t}^{1}\\\mu _{t}^{2}\end{pmatrix}}} and {\displaystyle \mathbf {G} ={\begin{pmatrix}\sigma _{t}^{1}\\\sigma _{t}^{2}\end{pmatrix}}}.

We set {\displaystyle f(t,\mathbf {X} _{t})=X_{t}^{1}X_{t}^{2}} and observe that {\displaystyle {\frac {\partial f}{\partial t}}=0,\ (\nabla _{\mathbf {X} }f)^{T}=(X_{t}^{2}\ \ X_{t}^{1})} and {\displaystyle H_{\mathbf {X} }f={\begin{pmatrix}0&1\\1&0\end{pmatrix}}}

Substituting these values in the multi-dimensional version of the lemma gives us:

{\displaystyle {\begin{aligned}d(X_{t}^{1}X_{t}^{2})&=df(t,\mathbf {X} _{t})\\&=0\cdot dt+(X_{t}^{2}\ \ X_{t}^{1})\,d\mathbf {X} _{t}+{\frac {1}{2}}(dX_{t}^{1}\ \ dX_{t}^{2}){\begin{pmatrix}0&1\\1&0\end{pmatrix}}{\begin{pmatrix}dX_{t}^{1}\\dX_{t}^{2}\end{pmatrix}}\\&=X_{t}^{2}\,dX_{t}^{1}+X_{t}^{1}dX_{t}^{2}+dX_{t}^{1}\,dX_{t}^{2}\end{aligned}}}

This is a generalisation of Leibniz's product rule to Itô processes, which are non-differentiable.

Further, using the second form of the multidimensional version above gives us

{\displaystyle {\begin{aligned}d(X_{t}^{1}X_{t}^{2})&=\left\{0+(X_{t}^{2}\ \ X_{t}^{1}){\begin{pmatrix}\mu _{t}^{1}\\\mu _{t}^{2}\end{pmatrix}}+{\frac {1}{2}}\operatorname {Tr} \left[(\sigma _{t}^{1}\ \ \sigma _{t}^{2}){\begin{pmatrix}0&1\\1&0\end{pmatrix}}{\begin{pmatrix}\sigma _{t}^{1}\\\sigma _{t}^{2}\end{pmatrix}}\right]\right\}\,dt+(X_{t}^{2}\sigma _{t}^{1}+X_{t}^{1}\sigma _{t}^{2})\,dB_{t}\\&=\left(X_{t}^{2}\mu _{t}^{1}+X_{t}^{1}\mu _{t}^{2}+\sigma _{t}^{1}\sigma _{t}^{2}\right)\,dt+(X_{t}^{2}\sigma _{t}^{1}+X_{t}^{1}\sigma _{t}^{2})\,dB_{t}\end{aligned}}}

so we see that the product {\displaystyle X_{t}^{1}X_{t}^{2}} is itself an Itô drift-diffusion process.
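The extra σ¹σ² drift term, which the ordinary product rule would miss, is visible in a Monte Carlo check. A sketch with constant illustrative coefficients (both processes driven by the same B, starting at 0, so X^i_T = μ^i T + σ^i B_T): integrating the Itô drift gives E[X¹_T X²_T] = μ¹μ²T² + σ¹σ²T.

```python
import random

rng = random.Random(3)
mu1, mu2, s1, s2 = 0.3, -0.2, 0.5, 0.4   # illustrative constants
T, n_paths = 1.0, 200_000

acc = 0.0
for _ in range(n_paths):
    b = rng.gauss(0.0, T ** 0.5)   # B_T for this path (shared by both processes)
    x1 = mu1 * T + s1 * b          # X^1_T with X^1_0 = 0
    x2 = mu2 * T + s2 * b          # X^2_T with X^2_0 = 0
    acc += x1 * x2

mc = acc / n_paths
# Integrating d E[X1 X2] = (2 mu1 mu2 t + s1 s2) dt from 0 to T:
predicted = mu1 * mu2 * T ** 2 + s1 * s2 * T   # the s1*s2*T piece is the Itô correction
```

Dropping the σ¹σ² term (as naive calculus would) predicts μ¹μ²T² = −0.06 here, far from the simulated value near 0.14.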

Itô's formula for functions with finite quadratic variation

Hans Föllmer provided a non-probabilistic proof of the Itô formula and showed that it holds for all functions with finite quadratic variation.

Let {\displaystyle f\in C^{2}} be a real-valued function and {\displaystyle x:[0,\infty )\to \mathbb {R} } a right-continuous function with left limits and finite quadratic variation {\displaystyle [x]}. Then

{\displaystyle {\begin{aligned}f(x_{t})={}&f(x_{0})+\int _{0}^{t}f'(x_{s-})\,\mathrm {d} x_{s}+{\frac {1}{2}}\int _{]0,t]}f''(x_{s-})\,d[x]_{s}\\&+\sum _{0\leq s\leq t}\left(f(x_{s})-f(x_{s-})-f'(x_{s-})\Delta x_{s}-{\frac {1}{2}}f''(x_{s-})(\Delta x_{s})^{2}\right).\end{aligned}}}

where the quadratic variation of x is defined as a limit along a sequence of partitions {\displaystyle D_{n}} of {\displaystyle [0,t]} with step decreasing to zero:

{\displaystyle [x](t)=\lim _{n\to \infty }\sum _{t_{k}^{n}\in D_{n}}\left(x_{t_{k+1}^{n}}-x_{t_{k}^{n}}\right)^{2}.}
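This limit can be illustrated on the canonical example of a path with finite quadratic variation, a simulated Brownian path, for which [B](t) = t. The sketch below samples one path on a fine grid and computes the sum of squared increments along coarser and finer sub-partitions:

```python
import random

rng = random.Random(9)
T = 2.0

# One Brownian path sampled on a fine grid; coarser partitions are sub-grids of it.
n_fine = 2 ** 16
dt = T / n_fine
path = [0.0]
for _ in range(n_fine):
    path.append(path[-1] + rng.gauss(0.0, dt ** 0.5))

def quad_var(step):
    """Sum of squared increments along the partition using every `step`-th grid point."""
    pts = path[::step]
    return sum((b - a) ** 2 for a, b in zip(pts, pts[1:]))

qv_coarse = quad_var(256)   # 256 intervals
qv_fine = quad_var(1)       # 65536 intervals: closer to [B](T) = T
```

Both sums fluctuate around T = 2, with the finer partition fluctuating less, which is the convergence the definition describes (for Brownian motion the convergence is in probability along suitable partition sequences).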

Higher-order Itô formula

Rama Cont and Nicholas Perkowski extended the Itô formula to functions with finite p-th variation. For a continuous function with finite p-th variation

{\displaystyle [x]^{p}(t)=\lim _{n\to \infty }\sum _{t_{k}^{n}\in D_{n}}\left(x_{t_{k+1}^{n}}-x_{t_{k}^{n}}\right)^{p}}

the change of variable formula is:

{\displaystyle f(x_{t})=f(x_{0})+\int _{0}^{t}\nabla _{p-1}f(x_{s-})\,\mathrm {d} x_{s}+{\frac {1}{p!}}\int _{]0,t]}f^{(p)}(x_{s-})\,d[x]_{s}^{p}}

where the first integral is defined as a limit of compensated left Riemann sums along a sequence of partitions {\displaystyle D_{n}}:

{\displaystyle \int _{0}^{t}\nabla _{p-1}f(x_{s-})\,\mathrm {d} x_{s}:=\lim _{n\to \infty }\sum _{t_{k}^{n}\in D_{n}}\sum _{j=1}^{p-1}{\frac {f^{(j)}(x_{t_{k}^{n}})}{j!}}\left(x_{t_{k+1}^{n}}-x_{t_{k}^{n}}\right)^{j}.}

Infinite-dimensional formulas

Several extensions to infinite-dimensional spaces exist (e.g. Pardoux, Gyöngy–Krylov, Brzezniak–van Neerven–Veraar–Weis).

Notes

  1. Itô, Kiyoshi (1951). "On a formula concerning stochastic differentials". Nagoya Math. J. 3: 55–65. doi:10.1017/S0027763000012216.
  2. Malliaris, A. G. (1982). Stochastic Methods in Economics and Finance. New York: North-Holland. pp. 220–223. ISBN 0-444-86201-3.
  3. Föllmer, Hans (1981). "Calcul d'Ito sans probabilités". Séminaire de probabilités de Strasbourg. 15: 143–144.
  4. Cont, R.; Perkowski, N. (2019). "Pathwise integration and change of variable formulas for continuous paths with arbitrary regularity". Transactions of the American Mathematical Society. 6: 161–186. arXiv:1803.09269. doi:10.1090/btran/34.
  5. Pardoux, Étienne (1974). "Équations aux dérivées partielles stochastiques de type monotone". Séminaire Jean Leray (3).
  6. Gyöngy, István; Krylov, Nikolay Vladimirovich (1981). "Itô formula in Banach spaces". In M. Arató; D. Vermes; A.V. Balakrishnan (eds.). Stochastic Differential Systems. Lecture Notes in Control and Information Sciences. Vol. 36. Springer, Berlin, Heidelberg. pp. 69–73. doi:10.1007/BFb0006409. ISBN 3-540-11038-0.
  7. Brzezniak, Zdzislaw; van Neerven, Jan M. A. M.; Veraar, Mark C.; Weis, Lutz (2008). "Ito's formula in UMD Banach spaces and regularity of solutions of the Zakai equation". Journal of Differential Equations. 245 (1): 30–58. arXiv:0804.0302. doi:10.1016/j.jde.2008.03.026.

References

  • Kiyosi Itô (1944). "Stochastic Integral". Proc. Imperial Acad. Tokyo 20, 519–524. This is the paper with the Itô formula.
  • Kiyosi Itô (1951). "On stochastic differential equations". Memoirs, American Mathematical Society 4, 1–51.
  • Bernt Øksendal (2000). Stochastic Differential Equations. An Introduction with Applications, 5th edition, corrected 2nd printing. Springer. ISBN 3-540-63720-6. Sections 4.1 and 4.2.
  • Philip E Protter (2005). Stochastic Integration and Differential Equations, 2nd edition. Springer. ISBN 3-662-10061-4. Section 2.7.
