Marchenko–Pastur distribution

Distribution of singular values of large rectangular random matrices

In the mathematical theory of random matrices, the Marchenko–Pastur distribution, or Marchenko–Pastur law, describes the asymptotic behavior of singular values of large rectangular random matrices. The theorem is named after the Soviet mathematicians Volodymyr Marchenko and Leonid Pastur, who proved this result in 1967.

If $X$ denotes an $m\times n$ random matrix whose entries are independent, identically distributed random variables with mean 0 and variance $\sigma ^{2}<\infty$, let

Y_{n}={\frac {1}{n}}XX^{T}

and let $\lambda _{1},\,\lambda _{2},\,\dots ,\,\lambda _{m}$ be the eigenvalues of $Y_{n}$ (viewed as random variables). Finally, consider the random measure

\mu _{m}(A)={\frac {1}{m}}\#\left\{\lambda _{j}\in A\right\},\quad A\subset \mathbb {R} ,

which gives the fraction of the eigenvalues that lie in the subset $A$ of $\mathbb {R}$.

Theorem. Assume that $m,\,n\,\to \,\infty$ so that the ratio $m/n\,\to \,\lambda \in (0,+\infty )$ . Then $\mu _{m}\,\to \,\mu$ (in weak* topology in distribution), where

\mu (A)={\begin{cases}(1-{\frac {1}{\lambda }})\mathbf {1} _{0\in A}+\nu (A),&{\text{if }}\lambda >1\\\nu (A),&{\text{if }}0<\lambda \leq 1,\end{cases}}

and

d\nu (x)={\frac {1}{2\pi \sigma ^{2}}}{\frac {\sqrt {(\lambda _{+}-x)(x-\lambda _{-})}}{\lambda x}}\,\mathbf {1} _{x\in [\lambda _{-},\lambda _{+}]}\,dx

with

\lambda _{\pm }=\sigma ^{2}(1\pm {\sqrt {\lambda }})^{2}.
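As a numerical sanity check on the theorem, the following Python sketch evaluates the density of $\nu$ and estimates its total mass. If the statement above is read correctly, that mass should be close to 1 for $\lambda \leq 1$ and close to $1/\lambda$ for $\lambda >1$, so that the atom at zero brings $\mu$ up to total mass 1. The function names and the substitution $x=c+r\cos \theta$ used for the quadrature are illustrative assumptions, not taken from the cited sources.

```python
import math


def mp_edges(lam, sigma2=1.0):
    """Return (lambda_minus, lambda_plus), the edges of the support of nu."""
    root = math.sqrt(lam)
    return sigma2 * (1 - root) ** 2, sigma2 * (1 + root) ** 2


def mp_density(x, lam, sigma2=1.0):
    """Density of nu at x; zero outside the open interval between the edges."""
    lo, hi = mp_edges(lam, sigma2)
    if x <= lo or x >= hi:
        return 0.0
    return math.sqrt((hi - x) * (x - lo)) / (2 * math.pi * sigma2 * lam * x)


def nu_total_mass(lam, sigma2=1.0, steps=20000):
    """Midpoint-rule estimate of nu(R) after substituting x = c + r*cos(theta)."""
    lo, hi = mp_edges(lam, sigma2)
    c, r = (hi + lo) / 2, (hi - lo) / 2
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        x = c + r * math.cos(theta)
        # dx = r * sin(theta) * dtheta (sign absorbed by the integration limits)
        total += mp_density(x, lam, sigma2) * r * math.sin(theta) * h
    return total


for lam in (0.25, 1.0, 4.0):
    atom = max(0.0, 1 - 1 / lam)  # weight of the point mass at zero
    print(lam, round(nu_total_mass(lam), 4), round(atom, 4))
```

For each $\lambda$ the two printed weights should add up to approximately 1.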

The Marchenko–Pastur law also arises as the free Poisson law in free probability theory, having rate $1/\lambda$ and jump size $\sigma ^{2}\lambda$.

Moments

For each $k\geq 1$ , its $k$ -th moment is

\sum _{r=0}^{k-1}{\frac {\sigma ^{2k}}{r+1}}{\binom {k}{r}}{\binom {k-1}{r}}\lambda ^{r}={\frac {\sigma ^{2k}}{k}}\sum _{r=0}^{k-1}{\binom {k}{r}}{\binom {k}{r+1}}\lambda ^{r}
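The two expressions for the moments can be compared numerically, and the first two moments can be checked against a simulated matrix. The sketch below is a hedged illustration: the function names, the Gaussian entries, the matrix size and the seed are all assumptions made for the example. It uses the identity $\tfrac{1}{m}\sum _{j}\lambda _{j}^{k}={\tfrac {1}{m}}\operatorname {tr} (Y_{n}^{k})$, so no eigenvalue solver is needed.

```python
import math
import random


def mp_moment_first_form(k, lam, sigma2=1.0):
    """k-th moment, left-hand expression (sigma2 stands for sigma^2)."""
    return sum(
        sigma2 ** k / (r + 1) * math.comb(k, r) * math.comb(k - 1, r) * lam ** r
        for r in range(k)
    )


def mp_moment_second_form(k, lam, sigma2=1.0):
    """k-th moment, right-hand expression."""
    return sigma2 ** k / k * sum(
        math.comb(k, r) * math.comb(k, r + 1) * lam ** r for r in range(k)
    )


def empirical_moments(m, n, sigma=1.0, seed=0):
    """First two moments of the empirical spectral measure of Y = X X^T / n."""
    rng = random.Random(seed)
    # Gaussian entries are an assumption; the theorem only needs i.i.d. entries
    # with mean 0 and finite variance.
    X = [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(m)]
    Y = [
        [sum(a * b for a, b in zip(X[i], X[j])) / n for j in range(m)]
        for i in range(m)
    ]
    first = sum(Y[i][i] for i in range(m)) / m
    # Y is symmetric, so tr(Y^2) equals the sum of its squared entries.
    second = sum(y * y for row in Y for y in row) / m
    return first, second


m, n = 80, 160
lam = m / n
emp1, emp2 = empirical_moments(m, n)
print("simulated:", emp1, emp2)
print("formula:  ", mp_moment_first_form(1, lam), mp_moment_first_form(2, lam))
```

At this matrix size the simulated values should only roughly match the limiting ones; the agreement is expected to improve as $m$ and $n$ grow with $m/n$ fixed.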

Some transforms of this law

The Stieltjes transform $s(z)=\int {\frac {d\mu (x)}{x-z}}$ is given by

s(z)={\frac {\sigma ^{2}(1-\lambda )-z+{\sqrt {(z-\sigma ^{2}(\lambda +1))^{2}-4\lambda \sigma ^{4}}}}{2\lambda z\sigma ^{2}}}

for complex numbers $z$ of positive imaginary part, where the complex square root is also taken to have positive imaginary part. The Stieltjes transform can be repackaged in the form of the R-transform, which is given by

R(z)={\frac {\sigma ^{2}}{1-\sigma ^{2}\lambda z}}

The S-transform is given by

S(z)={\frac {1}{\sigma ^{2}(1+\lambda z)}}.

For the case of $\sigma =1$, the $\eta$-transform, defined as $\eta (\gamma )=\mathbb {E} \left[{\frac {1}{1+\gamma X}}\right]$ where $X$ follows the Marchenko–Pastur law, is given by

\eta (\gamma )=1-{\frac {{\mathcal {F}}(\gamma ,\lambda )}{4\gamma \lambda }}

where ${\mathcal {F}}(x,z)=\left({\sqrt {x(1+{\sqrt {z}})^{2}+1}}-{\sqrt {x(1-{\sqrt {z}})^{2}+1}}\right)^{2}$
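The closed form of the $\eta$-transform can be compared with a direct numerical evaluation of $\mathbb {E} [1/(1+\gamma X)]$ under the Marchenko–Pastur law with $\sigma =1$. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the expectation is computed with a midpoint rule after the substitution $x=1+\lambda +2{\sqrt {\lambda }}\cos \theta$, adding the atom at zero when $\lambda >1$.

```python
import math


def eta_closed_form(gamma, lam):
    """eta(gamma) = 1 - F(gamma, lam) / (4 * gamma * lam), for sigma = 1."""
    root = math.sqrt(lam)
    a = math.sqrt(gamma * (1 + root) ** 2 + 1)
    b = math.sqrt(gamma * (1 - root) ** 2 + 1)
    return 1 - (a - b) ** 2 / (4 * gamma * lam)


def eta_numeric(gamma, lam, steps=20000):
    """E[1 / (1 + gamma * X)] by quadrature against the law mu (sigma = 1)."""
    c, r = 1 + lam, 2 * math.sqrt(lam)
    h = math.pi / steps
    # Atom at zero (present only when lam > 1) contributes 1 / (1 + 0) = 1.
    total = max(0.0, 1 - 1 / lam)
    for k in range(steps):
        theta = (k + 0.5) * h
        x = c + r * math.cos(theta)
        # density(x) * dx written in terms of theta
        weight = r ** 2 * math.sin(theta) ** 2 / (2 * math.pi * lam * x) * h
        total += weight / (1 + gamma * x)
    return total


for gamma, lam in ((0.5, 0.5), (3.0, 2.0)):
    print(gamma, lam, eta_closed_form(gamma, lam), eta_numeric(gamma, lam))
```

The two values printed on each line should agree to many decimal places if the formula and the density are consistent.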

For exact analysis of high-dimensional regression in the proportional asymptotic regime, a convenient form is often $T(u):=\eta \left({\tfrac {1}{u}}\right)$, which simplifies to

T(u)={\frac {-1+\lambda -u+{\sqrt {(1+u-\lambda )^{2}+4u\lambda }}}{2\lambda }}

The functions $B(u):=\mathbb {E} \left[\left({\frac {u}{X+u}}\right)^{2}\right]$ and $V(u):=\mathbb {E} \left[{\frac {X}{(X+u)^{2}}}\right]$, where $X$ follows the Marchenko–Pastur law, appear in the limiting bias and variance, respectively, of ridge regression and other regularized linear regression problems. One can show that $B(u)=T(u)-u\cdot T'(u)$ and $V(u)=T'(u)$.
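The identities for $B(u)$ and $V(u)$ can be checked numerically, reading both as expectations over $X$ with $\sigma =1$. In the sketch below, the derivative $T'(u)=\left(-1+{\tfrac {1+u+\lambda }{\sqrt {(1+u-\lambda )^{2}+4u\lambda }}}\right)/(2\lambda )$ is obtained by differentiating the closed form of $T$ by hand; this derivative, the function names and the quadrature helper `mp_expect` are assumptions of the example rather than formulas from the cited sources.

```python
import math


def T(u, lam):
    """Closed form of T(u) = eta(1/u) for sigma = 1."""
    disc = (1 + u - lam) ** 2 + 4 * u * lam
    return (-1 + lam - u + math.sqrt(disc)) / (2 * lam)


def T_prime(u, lam):
    """Derivative of T with respect to u (differentiated by hand)."""
    disc = (1 + u - lam) ** 2 + 4 * u * lam
    return (-1 + (1 + u + lam) / math.sqrt(disc)) / (2 * lam)


def bias_term(u, lam):
    return T(u, lam) - u * T_prime(u, lam)


def variance_term(u, lam):
    return T_prime(u, lam)


def mp_expect(f, lam, steps=20000):
    """E[f(X)] under mu with sigma = 1, including the atom at 0 if lam > 1."""
    c, r = 1 + lam, 2 * math.sqrt(lam)
    h = math.pi / steps
    total = max(0.0, 1 - 1 / lam) * f(0.0)
    for k in range(steps):
        theta = (k + 0.5) * h
        x = c + r * math.cos(theta)  # substitution x = c + r*cos(theta)
        total += f(x) * r ** 2 * math.sin(theta) ** 2 / (2 * math.pi * lam * x) * h
    return total


u = 0.3
for lam in (0.5, 2.0):
    b_direct = mp_expect(lambda x: (u / (x + u)) ** 2, lam)
    v_direct = mp_expect(lambda x: x / (x + u) ** 2, lam)
    print(lam, bias_term(u, lam), b_direct, variance_term(u, lam), v_direct)
```

Each closed-form value should match the corresponding quadrature value; a mismatch would point to an error in the hand-derived derivative or in the helper.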

Application to correlation matrices

For the special case of correlation matrices, $\sigma ^{2}=1$ and $\lambda =m/n$, so the continuous part of the distribution is confined to the interval with endpoints

\lambda _{\pm }=\left(1\pm {\sqrt {\frac {m}{n}}}\right)^{2}.

Since this distribution describes the spectrum of random matrices whose entries have mean 0, the eigenvalues of a correlation matrix that fall inside this interval can be considered spurious, that is, attributable to noise. For instance, a correlation matrix of 10 stock returns calculated over a period of 252 trading days gives $\lambda _{+}=\left(1+{\sqrt {\frac {10}{252}}}\right)^{2}\approx 1.44$. Thus, of the 10 eigenvalues of that correlation matrix, only those larger than 1.44 would be considered significantly different from random.
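The filtering rule described above can be sketched in a few lines of Python. The function names and the list of eigenvalues are hypothetical: the list is made-up illustrative data for a 10 × 10 correlation matrix (it sums to 10, the trace of such a matrix), not a result from any real data set.

```python
import math


def mp_bounds(m, n):
    """Edges (lambda_minus, lambda_plus) for sigma^2 = 1 and lambda = m / n."""
    root = math.sqrt(m / n)
    return (1 - root) ** 2, (1 + root) ** 2


def signal_eigenvalues(eigenvalues, m, n):
    """Keep only the eigenvalues above the upper Marchenko-Pastur edge."""
    _, upper = mp_bounds(m, n)
    return [ev for ev in eigenvalues if ev > upper]


lower, upper = mp_bounds(10, 252)
print(round(lower, 3), round(upper, 3))  # 0.641 1.438

# Hypothetical eigenvalues, for illustration only.
example = [3.9, 1.6, 1.1, 0.9, 0.8, 0.6, 0.45, 0.3, 0.2, 0.15]
print(signal_eigenvalues(example, 10, 252))  # [3.9, 1.6]
```

The threshold is an asymptotic one, so for matrices as small as 10 × 10 it is best treated as a rough guide rather than a sharp test.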

References

Bai & Silverstein 2010, Section 3.1.1.
Bai & Silverstein 2010, Section 3.3.1.
Tulino & Verdú 2004, Section 2.2.

Bai, Zhidong; Silverstein, Jack W. (2010). Spectral analysis of large dimensional random matrices. Springer Series in Statistics (Second edition of 2006 original ed.). New York: Springer. doi:10.1007/978-1-4419-0661-8. ISBN 978-1-4419-0660-1. MR 2567175. Zbl 1301.60002.
Epps, Brenden; Krivitzky, Eric M. (2019). "Singular value decomposition of noisy data: mode corruption". Experiments in Fluids. 60 (8): 1–30. Bibcode:2019ExFl...60..121E. doi:10.1007/s00348-019-2761-y. S2CID 198436243.
Götze, F.; Tikhomirov, A. (2004). "Rate of convergence in probability to the Marchenko–Pastur law". Bernoulli. 10 (3): 503–548. doi:10.3150/bj/1089206408.
Marchenko, V. A.; Pastur, L. A. (1967). "Распределение собственных значений в некоторых ансамблях случайных матриц" [Distribution of eigenvalues for some sets of random matrices]. Mat. Sb. N.S. (in Russian). 72 (114:4): 507–536. Bibcode:1967SbMat...1..457M. doi:10.1070/SM1967v001n04ABEH001994.
Nica, A.; Speicher, R. (2006). Lectures on the Combinatorics of Free probability theory. Cambridge Univ. Press. pp. 204, 368. ISBN 0-521-85852-6.
Tulino, Antonia M.; Verdú, Sergio (2004). "Random matrix theory and wireless communications". Foundations and Trends in Communications and Information Theory. 1 (1): 1–182. doi:10.1561/0100000001. Zbl 1143.94303.
Zhang, W.; Abreu, G.; Inamori, M.; Sanada, Y. (2011). "Spectrum sensing algorithms via finite random matrices". IEEE Transactions on Communications. 60 (1): 164–175. doi:10.1109/TCOMM.2011.112311.100721. S2CID 206642535.
