A Simple Proof for the Chebyshev Inequality: Clearly Explained!

In Appendix A of the book Statistics: Principles and Methods by Giuseppe Cicchitelli, Pierpaolo D’Urso and Marco Minozzo (Pearson, 2021), I found the simplest proof of the Chebyshev inequality I have ever seen. The idea of splitting the observations into two different sets is very smart.

I reproduce the proof here:

Let x_1, x_2, …, x_N be a series of observations with mean µ and variance σ^2, and let k be a positive constant. Let


I_k=\left\{i,\ 1 \leq i \leq N:\left|x_i-\mu\right|< k\sigma \right\}


be the set of subscripts i identifying the observations whose deviation from the mean is (in absolute value) less than kσ. Let N(I_k) be the number of elements in I_k. We can write

\begin{aligned}
\sigma^2 &=\dfrac{\sum_{i=1}^N\left(x_i-\mu\right)^2}{N } \\
N \sigma^2 &=\sum_{i=1}^N\left(x_i-\mu\right)^2 \\
&=\sum_{i \in I_k}\left(x_i-\mu\right)^2+\sum_{i \notin I_k}\left(x_i-\mu\right)^2 \\
& \geq \sum_{i \notin I_k}\left(x_i-\mu\right)^2 \\
& \geq \sum_{i \notin I_k} k^2 \sigma^2
\end{aligned}


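where the first inequality holds because the sum over i ∈ I_k is nonnegative and can be dropped, so that only the squared deviations of the x_i not belonging to I_k remain, while the second holds since (x_i-µ)^2 ≥ k^2 σ^2 for every i ∉ I_k (by definition of I_k, these observations satisfy |x_i-µ| ≥ kσ).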
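The chain above can be checked numerically. The following is a minimal Python sketch, not part of the book's proof: the sample `xs` and the value `k = 1.5` are made up for illustration, and the indices are 0-based rather than 1-based as in the text.

```python
# Minimal sketch: check each step of the chain on a small, made-up sample.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations
k = 1.5                                          # hypothetical choice of k

N = len(xs)
mu = sum(xs) / N
var = sum((x - mu) ** 2 for x in xs) / N  # population variance (divide by N)
sigma = var ** 0.5

# I_k: indices whose deviation from the mean is strictly less than k*sigma
inside = [i for i, x in enumerate(xs) if abs(x - mu) < k * sigma]
outside = [i for i in range(N) if i not in inside]

sum_inside = sum((xs[i] - mu) ** 2 for i in inside)
sum_outside = sum((xs[i] - mu) ** 2 for i in outside)
lower_bound = len(outside) * k ** 2 * var  # sum of k^2 sigma^2 over i not in I_k

print("N * sigma^2           :", N * var)
print("inside + outside sums :", sum_inside + sum_outside)
print("outside sum           :", sum_outside)
print("N(not I_k) k^2 sigma^2:", lower_bound)

# The two inequalities of the chain
assert N * var >= sum_outside >= lower_bound
```

With these values µ = 5, σ = 2 and kσ = 3, so the observations 2 and 9 fall outside I_k. The observation 2 sits exactly at distance kσ from the mean, where (x_i-µ)^2 equals k^2 σ^2.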

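This last point can be illustrated with numerical values. Take µ=0, σ=1, and k=2, so that kσ=2 and k^2 σ^2=4.

An observation with |x_i-µ| < kσ, for example x_i=1.5, belongs to I_k and has (x_i-µ)^2 = 2.25 < 4.

An observation with |x_i-µ| ≥ kσ, for example x_i=3, does not belong to I_k and has (x_i-µ)^2 = 9 ≥ 4.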

Hence,

\sum_{i \notin I_k} k^2 \sigma^2 \leq N \sigma^2,


from which, by dividing both sides of the inequality by N k^2 σ^2 (assuming σ > 0), we obtain

\frac{1}{N} \sum_{i \notin I_k}(1) \leq \frac{1}{k^2} \Leftrightarrow \frac{N-N\left(I_k\right)}{N} \leq \frac{1}{k^2} .

Recall that the total number of elements N is equal to the number of elements in I_k, N(I_k), plus the number of elements that are not in I_k:

\begin{aligned}
N &=N\left(I_k\right)+N\left(\text{not }I_k\right) \\
\sum_{i \notin I_k}(1) &=N\left(\text{not }I_k\right)=N-N\left(I_k\right)
\end{aligned}

Finally,

\frac{N\left(I_k\right)}{N} \geq 1-\frac{1}{k^2}

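and the result is proved: the proportion of observations lying within k standard deviations of the mean is at least 1 - 1/k^2.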
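As a quick sanity check of the final bound, here is a hedged Python sketch. The helper name `fraction_within`, the random seed, and the exponential sample are all assumptions chosen for illustration; any data set with σ > 0 should satisfy the same inequality.

```python
import random

def fraction_within(xs, k):
    """Return N(I_k)/N: the share of observations with |x_i - mu| < k*sigma."""
    n = len(xs)
    mu = sum(xs) / n
    sigma = (sum((x - mu) ** 2 for x in xs) / n) ** 0.5  # population std dev
    return sum(1 for x in xs if abs(x - mu) < k * sigma) / n

random.seed(0)  # arbitrary seed, only for reproducibility
sample = [random.expovariate(1.0) for _ in range(1000)]  # hypothetical skewed data

for k in (1.5, 2, 3):
    frac = fraction_within(sample, k)
    bound = 1 - 1 / k ** 2
    print(f"k={k}: fraction={frac:.3f}, Chebyshev bound={bound:.3f}")
    assert frac >= bound  # the inequality proved above
```

The observed fraction is typically well above the bound, since the inequality makes no assumption about the shape of the distribution.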
