In Appendix A of the book Statistics: Principles and Methods by Giuseppe Cicchitelli, Pierpaolo D’Urso and Marco Minozzo (Pearson, 2021), I found the simplest proof of Chebyshev's theorem I have ever seen. The idea of using two different sets of elements is very smart.
I reproduce the proof here:
Let x_1, x_2, …, x_N be a series of observations with mean µ and variance σ². Let
I_k=\left\{i,\ 1 \leq i \leq N:\left|x_i-\mu\right|< k\sigma \right\}
be the set of subscripts i identifying the observations whose deviation from the mean is (in absolute value) less than kσ. Let N(I_k) be the number of elements in I_k. We can write
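As a minimal sketch (not part of the book's proof), the index set I_k can be built in Python for a small data set. The function name `index_set`, the observations and the value k = 1.25 are assumptions chosen only for illustration; the standard deviation is the population one (dividing by N), as in the definition of σ² used here.

```python
from statistics import fmean, pstdev


def index_set(xs, k):
    """Return the 1-based subscripts i with |x_i - mu| < k * sigma."""
    mu = fmean(xs)
    sigma = pstdev(xs)  # population standard deviation (divides by N)
    return [i for i, x in enumerate(xs, start=1) if abs(x - mu) < k * sigma]


# Made-up observations: mean 5, population variance 4, so sigma = 2.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

I_k = index_set(data, k=1.25)  # keeps observations with |x_i - 5| < 2.5
print(I_k, len(I_k))           # the subscripts in I_k and N(I_k)
```

With these values, the first and last observations (2 and 9) deviate from the mean by more than 2.5, so they fall outside I_k and N(I_k) = 6.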
\begin{aligned} \sigma^2 &=\dfrac{\sum_{i=1}^N\left(x_i-\mu\right)^2}{N } \\ N \sigma^2 &=\sum_{i=1}^N\left(x_i-\mu\right)^2 \\ &=\sum_{i \in I_k}\left(x_i-\mu\right)^2+\sum_{i \notin I_k}\left(x_i-\mu\right)^2 \\ & \geq \sum_{i \notin I_k}\left(x_i-\mu\right)^2 \\ & \geq \sum_{i \notin I_k} k^2 \sigma^2 \end{aligned}
where the first inequality holds because the nonnegative sum over i in I_k is dropped, so that the sum of squared deviations from the mean extends only over the subset of x_i not belonging to I_k, while the second holds since (x_i - µ)² ≥ k²σ² for every i not in I_k.
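This last point can be illustrated by numerical values. With µ = 0, σ = 1, and k = 2, the condition |x_i - µ| < kσ becomes |x_i| < 2, that is, -2 < x_i < 2; these are the observations in I_k.
For the observations not in I_k, the condition (x_i - µ)² ≥ k²σ² becomes x_i² ≥ 4, that is, x_i ≤ -2 or x_i ≥ 2. For instance, x_i = 3 gives 3² = 9 ≥ 4.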
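The chain of inequalities above can also be checked numerically. The following is only a sketch under stated assumptions: the helper `chebyshev_chain`, the data and k = 1.25 are hypothetical and do not come from the book. It returns the four quantities of the chain in order, so each one should be at least as large as the next.

```python
from statistics import fmean, pvariance


def chebyshev_chain(xs, k):
    """Return (N*sigma^2, full sum, sum over i not in I_k, count * k^2*sigma^2)."""
    n = len(xs)
    mu = fmean(xs)
    var = pvariance(xs)            # population variance (divides by N)
    bound = k * k * var            # k^2 * sigma^2
    sq = [(x - mu) ** 2 for x in xs]
    # i is in I_k exactly when (x_i - mu)^2 < k^2 * sigma^2
    inside = sum(d for d in sq if d < bound)
    outside = sum(d for d in sq if d >= bound)
    n_outside = sum(1 for d in sq if d >= bound)
    return n * var, inside + outside, outside, n_outside * bound


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up observations
total, split, outside, lower = chebyshev_chain(data, k=1.25)
print(total, split, outside, lower)  # 32.0 32.0 25.0 12.5
assert abs(total - split) < 1e-9 and split >= outside >= lower
```

Here two observations lie outside I_k, their squared deviations sum to 25, and each of them is at least k²σ² = 6.25, which gives the last term 12.5.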
Hence,
\sum_{i \notin I_k} k^2 \sigma^2 \leq N \sigma^2,
from which, by dividing both sides of the inequality by N k²σ² (assuming σ > 0), we obtain
\frac{1}{N} \sum_{i \notin I_k}(1) \leq \frac{1}{k^2} \Leftrightarrow \frac{N-N\left(I_k\right)}{N} \leq \frac{1}{k^2} .
The equivalence holds because the total number of elements N is equal to the number of elements in I_k, N(I_k), plus the number of elements that are not in I_k:
\begin{aligned} N &=N\left(I_k\right)+N\left(\text{not }I_k\right) \\ \sum_{i \notin I_k}(1) &=N\left(\text{not }I_k\right)=N-N\left(I_k\right) \end{aligned}
Finally,
\frac{N\left(I_k\right)}{N} \geq 1-\frac{1}{k^2}
and the result is proved.
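As an informal check of the final inequality, the fraction N(I_k)/N can be compared with 1 - 1/k² on simulated data. This is a sketch only: the function `fraction_within`, the exponential sample, the seed and the values of k are assumptions picked for illustration, and a numerical check is of course not a substitute for the proof.

```python
import random
from statistics import fmean, pstdev


def fraction_within(xs, k):
    """Return N(I_k) / N: the share of observations with |x_i - mu| < k * sigma."""
    mu = fmean(xs)
    sigma = pstdev(xs)  # population standard deviation (divides by N)
    return sum(1 for x in xs if abs(x - mu) < k * sigma) / len(xs)


random.seed(0)  # fixed seed so the run is reproducible
sample = [random.expovariate(1.0) for _ in range(1000)]  # a skewed sample

for k in (1.5, 2.0, 3.0):
    frac = fraction_within(sample, k)
    bound = 1 - 1 / k**2
    print(f"k={k}: fraction within = {frac:.3f}, Chebyshev bound = {bound:.3f}")
    assert frac >= bound  # guaranteed by the theorem for any data set
```

The observed fractions are typically well above the bound, since the inequality has to hold for every possible data set and is therefore conservative.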