Inequality in probability theory
In probability theory, Hoeffding's lemma is an inequality that bounds the moment-generating function of any bounded random variable, implying that such variables are subgaussian. It is named after the Finnish–American mathematical statistician Wassily Hoeffding.
The proof of Hoeffding's lemma uses Taylor's theorem and Jensen's inequality. Hoeffding's lemma is itself used in the proof of Hoeffding's inequality as well as its generalization, McDiarmid's inequality.
Statement of the lemma
Let $X$ be any real-valued random variable such that $a \leq X \leq b$ almost surely, i.e. with probability one. Then, for all $\lambda \in \mathbb{R}$,
$$\mathbb{E}\left[e^{\lambda X}\right] \leq \exp\left(\lambda \mathbb{E}[X] + \frac{\lambda^{2}(b-a)^{2}}{8}\right),$$
or equivalently,
$$\mathbb{E}\left[e^{\lambda (X - \mathbb{E}[X])}\right] \leq \exp\left(\frac{\lambda^{2}(b-a)^{2}}{8}\right).$$
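As a quick numerical sanity check of the statement, the following minimal sketch compares both sides of the first inequality for a small discrete distribution. The distribution, the grid of $\lambda$ values, and the helper names `mgf`, `hoeffding_bound`, and `lemma_holds` are all illustrative assumptions, not part of the lemma itself.

```python
import math

def mgf(values, probs, lam):
    """E[exp(lam * X)] for a discrete random variable X."""
    return sum(p * math.exp(lam * x) for x, p in zip(values, probs))

def hoeffding_bound(mean, a, b, lam):
    """Right-hand side of Hoeffding's lemma: exp(lam*E[X] + lam^2 (b-a)^2 / 8)."""
    return math.exp(lam * mean + lam ** 2 * (b - a) ** 2 / 8)

# Hypothetical example distribution, supported on [a, b] = [-1, 2].
values = [-1.0, 0.5, 2.0]
probs = [0.5, 0.3, 0.2]
a, b = min(values), max(values)
mean = sum(p * x for x, p in zip(values, probs))

def lemma_holds(lam):
    # Small relative tolerance to absorb floating-point rounding near lam = 0.
    return mgf(values, probs, lam) <= hoeffding_bound(mean, a, b, lam) * (1 + 1e-12)

# Check the inequality on a grid of lambda values in [-5, 5].
results = [lemma_holds(k / 10) for k in range(-50, 51)]
print(all(results))
```

A check like this cannot prove the lemma, but it is a cheap way to catch a misplaced constant, such as writing 2 or 4 in place of 8.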
Proof
The following proof is direct but somewhat ad hoc. Another proof uses exponential tilting; proofs with a slightly worse constant are also available using symmetrization.
Without loss of generality, by replacing $X$ by $X - \mathbb{E}[X]$ (and $a$, $b$ by $a - \mathbb{E}[X]$, $b - \mathbb{E}[X]$, which leaves $b - a$ unchanged), we can assume $\mathbb{E}[X] = 0$, so that $a \leq 0 \leq b$.
Since $e^{\lambda x}$ is a convex function of $x$, we have that for all $x \in [a, b]$,
$$e^{\lambda x} \leq \frac{b-x}{b-a}e^{\lambda a} + \frac{x-a}{b-a}e^{\lambda b}.$$
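The chord bound above is the definition of convexity applied to a particular convex combination. As a short worked step, assuming $a < b$ and writing $t$ for a weight introduced here only for illustration:

```latex
\begin{align*}
t &= \frac{x-a}{b-a} \in [0,1], \qquad x = (1-t)\,a + t\,b,\\
e^{\lambda x} &= e^{\lambda((1-t)a + tb)}
  \leq (1-t)\,e^{\lambda a} + t\,e^{\lambda b}
  = \frac{b-x}{b-a}\,e^{\lambda a} + \frac{x-a}{b-a}\,e^{\lambda b}.
\end{align*}
```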
So,
$$\begin{aligned}\mathbb{E}\left[e^{\lambda X}\right] &\leq \frac{b - \mathbb{E}[X]}{b-a}e^{\lambda a} + \frac{\mathbb{E}[X] - a}{b-a}e^{\lambda b}\\ &= \frac{b}{b-a}e^{\lambda a} + \frac{-a}{b-a}e^{\lambda b}\\ &= e^{L(\lambda(b-a))},\end{aligned}$$
where $L(h) = \frac{ha}{b-a} + \ln\left(1 + \frac{a - e^{h}a}{b-a}\right)$. By computing derivatives, we find $L(0) = L'(0) = 0$ and $L''(h) = -\frac{abe^{h}}{(b - ae^{h})^{2}}$.
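For readers who want the intermediate steps, here is a sketch of where $L$ comes from and how its derivatives are computed. It assumes $a < b$ and uses the substitution $h = \lambda(b-a)$, so that $\lambda a = \frac{ha}{b-a}$:

```latex
\begin{align*}
\frac{b}{b-a}e^{\lambda a} - \frac{a}{b-a}e^{\lambda b}
  &= e^{\lambda a}\,\frac{b - a e^{\lambda(b-a)}}{b-a}
   = e^{\frac{ha}{b-a}}\left(1 + \frac{a - a e^{h}}{b-a}\right),\\
L'(h) &= \frac{a}{b-a} - \frac{a e^{h}}{b - a e^{h}},
  \qquad L'(0) = \frac{a}{b-a} - \frac{a}{b-a} = 0,\\
L''(h) &= -\frac{a e^{h}\,(b - a e^{h}) + a^{2} e^{2h}}{(b - a e^{h})^{2}}
   = -\frac{a b e^{h}}{(b - a e^{h})^{2}}.
\end{align*}
```

Taking logarithms of the first line gives the stated formula for $L(h)$, and $L(0) = 0 + \ln 1 = 0$.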
From the AM–GM inequality applied to the nonnegative numbers $b$ and $-ae^{h}$, we have $(b - ae^{h})^{2} \geq -4abe^{h}$, so $L''(h) \leq \frac{1}{4}$ for all $h$, and thus, from Taylor's theorem, there is some $0 \leq \theta \leq 1$ such that
$$L(h) = L(0) + hL'(0) + \frac{1}{2}h^{2}L''(h\theta) \leq \frac{1}{8}h^{2}.$$
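The two bounds in this step can also be checked numerically. The sketch below evaluates $L$ and $L''$ on a grid for one illustrative choice of endpoints ($a = -1$, $b = 3$, chosen here as an assumption) and confirms that $L''(h) \leq \frac{1}{4}$ and $L(h) \leq \frac{h^{2}}{8}$. The function names `L` and `L_second` are hypothetical helpers mirroring the formulas in the proof.

```python
import math

def L(h, a, b):
    """L(h) from the proof, assuming a < 0 < b."""
    return h * a / (b - a) + math.log(1 + (a - math.exp(h) * a) / (b - a))

def L_second(h, a, b):
    """Second derivative L''(h) = -a b e^h / (b - a e^h)^2."""
    return -a * b * math.exp(h) / (b - a * math.exp(h)) ** 2

# Illustrative endpoints (an assumption for this example), with a < 0 < b.
a, b = -1.0, 3.0
grid = [k / 10 for k in range(-100, 101)]  # h in [-10, 10]
tol = 1e-12  # absorbs floating-point rounding

second_ok = all(L_second(h, a, b) <= 0.25 + tol for h in grid)
taylor_ok = all(L(h, a, b) <= h * h / 8 + tol for h in grid)

# AM-GM is tight when b = -a e^h, i.e. at h = ln(-b / a).
h_star = math.log(-b / a)
print(second_ok, taylor_ok, L_second(h_star, a, b))
```

The last printed value should be approximately 0.25, which illustrates that the constant $\frac{1}{4}$ in the bound on $L''$ cannot be improved for this choice of endpoints.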
Thus, $\mathbb{E}\left[e^{\lambda X}\right] \leq e^{\frac{1}{8}\lambda^{2}(b-a)^{2}}$.
Notes
Massart, Pascal (26 April 2007). Concentration Inequalities and Model Selection: Ecole d'Eté de Probabilités de Saint-Flour XXXIII - 2003. Springer. p. 21. ISBN 978-3-540-48503-2.
Boucheron, Stéphane; Lugosi, Gábor; Massart, Pascal (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press.
Romaní, Marc (1 May 2021). "A short proof of Hoeffding's lemma". Retrieved 7 September 2024.