Taylor expansions for the moments of functions of random variables


In probability theory, it is possible to approximate the moments of a function f of a random variable X using Taylor expansions, provided that f is sufficiently differentiable and that the moments of X are finite.


An alternative to this approximation is Monte Carlo simulation, which estimates the moments directly from random draws of X.
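For instance, a minimal Monte Carlo sketch (the choice of f(x) = exp(x) and of a normal X is purely illustrative, and the snippet assumes NumPy) estimates the moments of f(X) by averaging over random draws:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: f(x) = exp(x), X ~ Normal(mu, sigma^2)
mu, sigma = 0.5, 0.2
x = rng.normal(mu, sigma, size=1_000_000)

# Monte Carlo estimates of the first two moments of f(X)
fx = np.exp(x)
print("E[f(X)]   ~", fx.mean())  # exact value: exp(mu + sigma^2/2) ~ 1.6820
print("var[f(X)] ~", fx.var())   # exact value: (exp(sigma^2) - 1) exp(2*mu + sigma^2) ~ 0.1155
```

The same illustrative setup is reused in the sketches below to compare the Taylor approximations against Monte Carlo estimates.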

First moment

Given $\mu_X$ and $\sigma_X^2$, the mean and the variance of $X$, respectively, a Taylor expansion of the expected value of $f(X)$ can be found via

$$\begin{aligned}
\operatorname{E}[f(X)] &= \operatorname{E}\left[f\left(\mu_X + (X - \mu_X)\right)\right] \\
&\approx \operatorname{E}\left[f(\mu_X) + f'(\mu_X)(X - \mu_X) + \tfrac{1}{2} f''(\mu_X)(X - \mu_X)^2\right] \\
&= f(\mu_X) + f'(\mu_X)\operatorname{E}[X - \mu_X] + \tfrac{1}{2} f''(\mu_X)\operatorname{E}\left[(X - \mu_X)^2\right].
\end{aligned}$$

Since $\operatorname{E}[X - \mu_X] = 0$, the second term vanishes. Also, $\operatorname{E}\left[(X - \mu_X)^2\right]$ is $\sigma_X^2$. Therefore,

$$\operatorname{E}[f(X)] \approx f(\mu_X) + \frac{f''(\mu_X)}{2}\,\sigma_X^2.$$
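As a sanity check, this right-hand side can be compared against a Monte Carlo estimate (a sketch under the same illustrative assumptions: f(x) = exp(x), normal X, NumPy). Note that for f(x) = x² the second-order expansion is exact, since E[X²] = μ² + σ².

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 0.5, 0.2

# E[f(X)] ~ f(mu) + f''(mu)/2 * sigma^2, with f = exp so f'' = exp as well
approx = np.exp(mu) + 0.5 * np.exp(mu) * sigma**2

x = rng.normal(mu, sigma, size=1_000_000)
print("approximation:", approx)            # ~ 1.6817
print("Monte Carlo  :", np.exp(x).mean())  # exact: exp(mu + sigma^2/2) ~ 1.6820
```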

It is possible to generalize this to functions of more than one variable using multivariate Taylor expansions. For example,

$$\operatorname{E}\!\left[\frac{X}{Y}\right] \approx \frac{\operatorname{E}[X]}{\operatorname{E}[Y]} - \frac{\operatorname{cov}[X,Y]}{\operatorname{E}[Y]^2} + \frac{\operatorname{E}[X]}{\operatorname{E}[Y]^3}\operatorname{var}[Y]$$
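A numeric sketch of this ratio formula (the jointly normal X and Y and their parameters are illustrative; E[Y] is kept well away from zero so the ratio is well behaved):

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative correlated pair with E[Y] far from zero
mean = [2.0, 5.0]
cov = [[1.0, 0.3],
       [0.3, 0.5]]
x, y = rng.multivariate_normal(mean, cov, size=1_000_000).T

ex, ey = mean
cov_xy, var_y = cov[0][1], cov[1][1]

# E[X/Y] ~ E[X]/E[Y] - cov(X,Y)/E[Y]^2 + E[X] var(Y)/E[Y]^3
approx = ex/ey - cov_xy/ey**2 + ex*var_y/ey**3
print("approximation:", approx)        # 0.396
print("Monte Carlo  :", (x/y).mean())
```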

Second moment

Similarly,

$$\operatorname{var}[f(X)] \approx \left(f'(\operatorname{E}[X])\right)^2 \operatorname{var}[X] - \frac{1}{4}\left(f''(\operatorname{E}[X])\right)^2 \operatorname{var}[X]^2 = \left(f'(\mu_X)\right)^2 \sigma_X^2 - \frac{1}{4}\left(f''(\mu_X)\right)^2 \sigma_X^4$$

The above is obtained using a second-order approximation, following the method used in estimating the first moment. It will be a poor approximation in cases where $f$ is highly non-linear. This is a special case of the delta method.

Indeed, we take $\operatorname{E}[f(X)] \approx f(\mu_X) + \frac{f''(\mu_X)}{2}\sigma_X^2$.

With $f(X) = g(X)^2$ and $Y = g(X)$, this gives an approximation for $\operatorname{E}[Y^2]$. The variance is then computed using the formula $\operatorname{var}[Y] = \operatorname{E}[Y^2] - \mu_Y^2$.
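A sketch of this route, with the illustrative choice g(x) = exp(x) and normal X, so that f(x) = g(x)² = exp(2x); it reproduces the closed-form approximation above exactly, since the σ⁴ terms of the two routes coincide:

```python
import numpy as np

mu, sigma = 0.5, 0.2

# Illustrative g(x) = exp(x), so Y = g(X) and f(x) = g(x)^2 = exp(2x)
# E[Y^2] ~ f(mu) + f''(mu)/2 * sigma^2, where f''(x) = 4 exp(2x)
e_y2 = np.exp(2*mu) + 0.5 * 4*np.exp(2*mu) * sigma**2

# mu_Y ~ g(mu) + g''(mu)/2 * sigma^2
mu_y = np.exp(mu) + 0.5 * np.exp(mu) * sigma**2

# Both routes give e^(2*mu) * (sigma^2 - sigma^4/4) here
print("via E[Y^2] - mu_Y^2:", e_y2 - mu_y**2)                                       # ~ 0.10764
print("direct formula     :", np.exp(2*mu)*sigma**2 - 0.25*np.exp(2*mu)*sigma**4)  # ~ 0.10764
```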

For example,

$$\operatorname{var}\!\left[\frac{X}{Y}\right] \approx \frac{\operatorname{var}[X]}{\operatorname{E}[Y]^2} - \frac{2\operatorname{E}[X]}{\operatorname{E}[Y]^3}\operatorname{cov}[X,Y] + \frac{\operatorname{E}[X]^2}{\operatorname{E}[Y]^4}\operatorname{var}[Y].$$
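The same illustrative setup used for the ratio mean can be reused to check this variance formula:

```python
import numpy as np

rng = np.random.default_rng(3)

mean = [2.0, 5.0]
cov = [[1.0, 0.3],
       [0.3, 0.5]]
x, y = rng.multivariate_normal(mean, cov, size=1_000_000).T

ex, ey = mean
var_x, cov_xy, var_y = cov[0][0], cov[0][1], cov[1][1]

# var[X/Y] ~ var(X)/E[Y]^2 - 2 E[X] cov(X,Y)/E[Y]^3 + E[X]^2 var(Y)/E[Y]^4
approx = var_x/ey**2 - 2*ex*cov_xy/ey**3 + ex**2*var_y/ey**4
print("approximation:", approx)       # 0.0336
print("Monte Carlo  :", (x/y).var())
```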

The second-order approximation, when X follows a normal distribution, is:

$$\operatorname{var}[f(X)] \approx \left(f'(\operatorname{E}[X])\right)^2 \operatorname{var}[X] + \frac{\left(f''(\operatorname{E}[X])\right)^2}{2}\operatorname{var}[X]^2 + f'(\operatorname{E}[X])\,f'''(\operatorname{E}[X])\operatorname{var}[X]^2 = \left(f'(\mu_X)\right)^2 \sigma_X^2 + \frac{1}{2}\left(f''(\mu_X)\right)^2 \sigma_X^4 + f'(\mu_X)\,f'''(\mu_X)\,\sigma_X^4$$
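Continuing the illustrative f(x) = exp(x), normal-X example, the Gaussian correction terms bring the approximation noticeably closer to the Monte Carlo value than the formula above:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma = 0.5, 0.2

# For normal X: var[f(X)] ~ f'(mu)^2 s^2 + 1/2 f''(mu)^2 s^4 + f'(mu) f'''(mu) s^4
# Every derivative of exp is exp, so each factor below is exp(mu).
d = np.exp(mu)
approx = d**2 * sigma**2 + 0.5 * d**2 * sigma**4 + d**2 * sigma**4

x = rng.normal(mu, sigma, size=1_000_000)
print("approximation:", approx)           # ~ 0.1153
print("Monte Carlo  :", np.exp(x).var())  # exact: (exp(sigma^2) - 1) exp(2*mu + sigma^2) ~ 0.1155
```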

First product moment

To find a second-order approximation for the covariance of functions of two random variables (with the same function applied to both), one can proceed as follows. First, note that $\operatorname{cov}[f(X), f(Y)] = \operatorname{E}[f(X)f(Y)] - \operatorname{E}[f(X)]\operatorname{E}[f(Y)]$. Since a second-order expansion for $\operatorname{E}[f(X)]$ has already been derived above, it only remains to find $\operatorname{E}[f(X)f(Y)]$. Treating $f(X)f(Y)$ as a two-variable function, the second-order Taylor expansion is as follows:

$$\begin{aligned}
f(X)f(Y) \approx{}& f(\mu_X)f(\mu_Y) + (X - \mu_X)\, f'(\mu_X) f(\mu_Y) + (Y - \mu_Y)\, f(\mu_X) f'(\mu_Y) \\
&+ \tfrac{1}{2}\Big[(X - \mu_X)^2 f''(\mu_X) f(\mu_Y) + 2 (X - \mu_X)(Y - \mu_Y)\, f'(\mu_X) f'(\mu_Y) + (Y - \mu_Y)^2 f(\mu_X) f''(\mu_Y)\Big]
\end{aligned}$$

Taking the expectation of the above and simplifying, using the identities $\operatorname{E}(X^2) = \operatorname{var}(X) + [\operatorname{E}(X)]^2$ and $\operatorname{E}(XY) = \operatorname{cov}(X,Y) + \operatorname{E}(X)\operatorname{E}(Y)$, leads to
$$\operatorname{E}[f(X)f(Y)] \approx f(\mu_X)f(\mu_Y) + f'(\mu_X)f'(\mu_Y)\operatorname{cov}(X,Y) + \tfrac{1}{2} f''(\mu_X)f(\mu_Y)\operatorname{var}(X) + \tfrac{1}{2} f(\mu_X)f''(\mu_Y)\operatorname{var}(Y).$$
Hence,

$$\begin{aligned}
\operatorname{cov}[f(X), f(Y)] \approx{}& f(\mu_X)f(\mu_Y) + f'(\mu_X)f'(\mu_Y)\operatorname{cov}(X,Y) + \tfrac{1}{2} f''(\mu_X)f(\mu_Y)\operatorname{var}(X) + \tfrac{1}{2} f(\mu_X)f''(\mu_Y)\operatorname{var}(Y) \\
&- \left[f(\mu_X) + \tfrac{1}{2} f''(\mu_X)\operatorname{var}(X)\right]\left[f(\mu_Y) + \tfrac{1}{2} f''(\mu_Y)\operatorname{var}(Y)\right] \\
={}& f'(\mu_X)f'(\mu_Y)\operatorname{cov}(X,Y) - \tfrac{1}{4} f''(\mu_X)f''(\mu_Y)\operatorname{var}(X)\operatorname{var}(Y)
\end{aligned}$$
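A sketch of this covariance formula (the bivariate normal parameters and the choice f(x) = exp(x) are again illustrative; the residual gap between the two printed values shows the size of the second-order error):

```python
import numpy as np

rng = np.random.default_rng(5)

mean = [1.0, 2.0]
cov = [[0.02, 0.005],
       [0.005, 0.03]]
x, y = rng.multivariate_normal(mean, cov, size=1_000_000).T

mx, my = mean
var_x, cov_xy, var_y = cov[0][0], cov[0][1], cov[1][1]

# cov[f(X), f(Y)] ~ f'(mx) f'(my) cov(X,Y) - 1/4 f''(mx) f''(my) var(X) var(Y),
# with f = exp, so f' = f'' = exp
approx = np.exp(mx)*np.exp(my)*cov_xy - 0.25*np.exp(mx)*np.exp(my)*var_x*var_y
print("approximation:", approx)                              # ~ 0.0974
print("Monte Carlo  :", np.cov(np.exp(x), np.exp(y))[0, 1])  # ~ 0.1032
```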

Random vectors

If X is a random vector, the approximations for the mean and variance of f ( X ) {\displaystyle f(X)} are given by

$$\begin{aligned}
\operatorname{E}(f(X)) &= f(\mu_X) + \frac{1}{2}\operatorname{trace}\left(H_f(\mu_X)\,\Sigma_X\right) \\
\operatorname{var}(f(X)) &= \nabla f(\mu_X)^{t}\,\Sigma_X\,\nabla f(\mu_X) + \frac{1}{2}\operatorname{trace}\left(H_f(\mu_X)\,\Sigma_X\,H_f(\mu_X)\,\Sigma_X\right).
\end{aligned}$$

Here $\nabla f$ and $H_f$ denote the gradient and the Hessian matrix of $f$, respectively, and $\Sigma_X$ is the covariance matrix of X.
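A sketch for the vector case (the quadratic f(x) = x₁x₂ and the Gaussian X are illustrative choices; for a quadratic f of a Gaussian X these formulas happen to be exact, so the printed pairs should agree up to sampling error):

```python
import numpy as np

rng = np.random.default_rng(6)

# Illustrative scalar function of a random vector: f(x) = x1 * x2
mu = np.array([2.0, 3.0])
Sigma = np.array([[0.4, 0.1],
                  [0.1, 0.5]])

grad = np.array([mu[1], mu[0]])  # gradient of f at mu
H = np.array([[0.0, 1.0],
              [1.0, 0.0]])       # Hessian of f (constant for this f)

mean_approx = mu[0]*mu[1] + 0.5*np.trace(H @ Sigma)
var_approx = grad @ Sigma @ grad + 0.5*np.trace(H @ Sigma @ H @ Sigma)

x = rng.multivariate_normal(mu, Sigma, size=1_000_000)
fx = x[:, 0] * x[:, 1]
print("mean:", mean_approx, "vs Monte Carlo:", fx.mean())  # 6.1
print("var :", var_approx, "vs Monte Carlo:", fx.var())    # 7.01
```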

Notes

1. Haym Benaroya, Seon Mi Han, and Mark Nagurka. Probability Models in Engineering and Science. CRC Press, 2005, p. 166.
2. van Kempen, G. M. P.; van Vliet, L. J. (1 April 2000). "Mean and Variance of Ratio Estimators Used in Fluorescence Ratio Imaging". Cytometry. 39 (4): 300–305. doi:10.1002/(SICI)1097-0320(20000401)39:4<300::AID-CYTO8>3.0.CO;2-O.
3. Hendeby, Gustaf; Gustafsson, Fredrik. "On Nonlinear Transformations of Gaussian Distributions" (PDF). Retrieved 5 October 2017.
4. Rego, Bruno V.; Weiss, Dar; Bersi, Matthew R.; Humphrey, Jay D. (14 December 2021). "Uncertainty quantification in subject-specific estimation of local vessel mechanical properties". International Journal for Numerical Methods in Biomedical Engineering. 37 (12): e3535. doi:10.1002/cnm.3535. PMC 9019846. PMID 34605615.
