Difference between revisions of "Central limit theorem by example"

From MathTank
(20 intermediate revisions by the same user not shown)
Line 1: Line 1:
The Central Limit Theorem (CLT for short) is one of the most fundamental results in Probability and Statistics, that provides numerous applications and, to some extent, "explains" ubiquity of normal distribution. Below is one of the versions of this theorem:
+
The Central Limit Theorem is one of the most fundamental results in Probability and Statistics, that provides numerous applications and, to some extent, "explains" ubiquity of normal distribution. Below is one of the versions of this theorem:
  
 
Let <math>Y_1, Y_2,\dots ,Y_n, \dots </math>  be a sequence of independent identically distributed random variables with mean <math>\mu </math> and variance <math>\sigma ^2 </math>. Let <math>\overline{Y}_n = (Y_1+Y_2+\dots +Y_n)/n </math> and
 
Let <math>Y_1, Y_2,\dots ,Y_n, \dots </math>  be a sequence of independent identically distributed random variables with mean <math>\mu </math> and variance <math>\sigma ^2 </math>. Let <math>\overline{Y}_n = (Y_1+Y_2+\dots +Y_n)/n </math> and
Line 61: Line 61:
 
\end{cases}
 
\end{cases}
 
</math>
 
</math>
 
+
[[File:Exp1.png|alt=Exponential density|thumb|Exponential density|none]]
 
<math> f_2(y) =  
 
<math> f_2(y) =  
 
\begin{cases}  
 
\begin{cases}  
Line 68: Line 68:
 
\end{cases}
 
\end{cases}
 
</math>
 
</math>
 
+
[[File:Exp2.png|alt=Density f2|thumb|Density of Y1+Y2|none]]
 
<math> f_3(y) =  
 
<math> f_3(y) =  
 
\begin{cases}  
 
\begin{cases}  
Line 75: Line 75:
 
\end{cases}
 
\end{cases}
 
</math>
 
</math>
 
+
[[File:Exp3.png|alt=Density f3|thumb|Density of Y1+Y2+Y3|none]]
 
<math> f_4(y) =  
 
<math> f_4(y) =  
 
\begin{cases}  
 
\begin{cases}  
Line 82: Line 82:
 
\end{cases}
 
\end{cases}
 
</math>
 
</math>
 
+
[[File:Exp4.png|alt=Density f4|thumb|Density of Y1+Y2+Y3+Y4|none]]
 
<math> f_5(y) =  
 
<math> f_5(y) =  
 
\begin{cases}  
 
\begin{cases}  
Line 89: Line 89:
 
\end{cases}
 
\end{cases}
 
</math>
 
</math>
 +
[[File:Exp5.png|alt=Density f5|thumb|Density of Y1+Y2+Y3+Y4+Y5|none]]
  
Note: it can be shown by induction that
+
Note: it can be shown that
  
 
<math> f_n(y) =  
 
<math> f_n(y) =  
Line 98: Line 99:
 
\end{cases}
 
\end{cases}
 
</math>
 
</math>
 +
 +
(use induction or the fact that sum of independent identically distributed exponential random variables has Gamma distribution; the latter can be shown using moment-generating functions).
 +
  
 
'''Example: sums of uniformly distributed random variables'''
 
'''Example: sums of uniformly distributed random variables'''
Line 109: Line 113:
 
Computing these convolutions (either directly or using software) we get:
 
Computing these convolutions (either directly or using software) we get:
  
<math> f_2(y) =  
+
<math> f_1(y) =  
 
\begin{cases}  
 
\begin{cases}  
 
y,  & 0\le y \le 1 \\
 
y,  & 0\le y \le 1 \\
2-y, & 1\le y \le 2\\
 
 
0, &\mbox{ otherwise}
 
0, &\mbox{ otherwise}
 
\end{cases}
 
\end{cases}
 
</math>
 
</math>
  
<math> f_3(y) =  
+
[[File:uni1.png|alt=Uniform density|thumb|Uniform density|none]]
 +
 
 +
<math> f_2(y) =  
 
\begin{cases}  
 
\begin{cases}  
 
y,  & 0\le y \le 1 \\
 
y,  & 0\le y \le 1 \\
Line 125: Line 130:
 
</math>
 
</math>
  
<math> f_4(y) =  
+
[[File:uni2.png|alt=Density f2|thumb|Density of Y1+Y2|none]]
 +
 
 +
<math> f_3(y) =  
 
\begin{cases}  
 
\begin{cases}  
y& 0\le y \le 1 \\
+
\frac{y^{2}}{2} & 0\le y \le 1 \\
2-y, & 1\le y \le 2\\
+
- y^{2} + 3 x - \frac{3}{2} & 1\le y \le 2 \\
0, &\mbox{ otherwise}
+
\frac{y^{2}}{2} - 3 y + \frac{9}{2} & 2\le y \le 3\\
 +
0 &\mbox{ otherwise}  
 
\end{cases}
 
\end{cases}
 +
 
</math>
 
</math>
 +
 +
[[File:uni3.png|alt=Density f3|thumb|Density of Y1+Y2+Y3|none]]
 +
 +
<math> f_4(y)
 +
</math>
 +
 +
[[File:uni4.png|alt=Density f4|thumb|Density of Y1+Y2+Y3+Y4|none]]

Latest revision as of 21:11, 16 December 2021

The Central Limit Theorem is one of the most fundamental results in Probability and Statistics. It has numerous applications and, to some extent, "explains" the ubiquity of the normal distribution. Below is one version of this theorem:

Let [math]\displaystyle{ Y_1, Y_2,\dots ,Y_n, \dots }[/math] be a sequence of independent identically distributed random variables with mean [math]\displaystyle{ \mu }[/math] and finite, nonzero variance [math]\displaystyle{ \sigma ^2 }[/math]. Let [math]\displaystyle{ \overline{Y}_n = (Y_1+Y_2+\dots +Y_n)/n }[/math] and

[math]\displaystyle{ X_n = \frac{\overline{Y}_n-\mu}{\sigma/\sqrt{n}}. }[/math]

Then [math]\displaystyle{ \{ X_n\}_{n=1}^\infty }[/math] converges in distribution to the standard normal random variable, i.e.

[math]\displaystyle{ \lim _{n\to\infty} P(X_n\le x) = \int_{-\infty}^x \frac{1}{\sqrt{2\pi}}e^{-t^2/2}\,dt }[/math]

for all [math]\displaystyle{ x }[/math].
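The statement can be checked empirically. The following is a minimal simulation sketch, not part of the original article: it assumes the [math]\displaystyle{ Y_i }[/math] are exponential with mean 1 (so [math]\displaystyle{ \mu = \sigma = 1 }[/math]), and the function names `standardized_mean` and `empirical_cdf_of_x_n` are hypothetical helpers introduced here for illustration. It estimates [math]\displaystyle{ P(X_n\le x) }[/math] by repeated sampling so that the estimate can be compared with the standard normal value.

```python
import math
import random
from statistics import NormalDist


def standardized_mean(sample, mu, sigma):
    """Return X_n = (mean(sample) - mu) / (sigma / sqrt(n))."""
    n = len(sample)
    return (sum(sample) / n - mu) / (sigma / math.sqrt(n))


def empirical_cdf_of_x_n(n, x, trials=2000, seed=0):
    """Estimate P(X_n <= x) by simulation.

    Assumption for this sketch: Y_i ~ Exponential with mean 1,
    hence mu = 1 and sigma = 1.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if standardized_mean(sample, mu=1.0, sigma=1.0) <= x:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    target = NormalDist().cdf(0.5)  # standard normal CDF at x = 0.5
    for n in (1, 5, 30, 200):
        print(n, empirical_cdf_of_x_n(n, x=0.5), target)
```

As [math]\displaystyle{ n }[/math] grows, the printed estimates should move toward the normal value, up to sampling noise of order [math]\displaystyle{ 1/\sqrt{\text{trials}} }[/math].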

While the proof of this theorem is often beyond the scope of introductory undergraduate probability and statistics courses, there are several "convincing" examples that make the statement of the theorem very plausible. Below we provide several such examples.

Bernoulli trials and Binomial distribution

Let [math]\displaystyle{ Y_1,Y_2,\dots , Y_n,\dots }[/math] be independent random variables representing Bernoulli trials, i.e. [math]\displaystyle{ P(Y_n=1)=p }[/math] and [math]\displaystyle{ P(Y_n=0)=1-p }[/math] for all [math]\displaystyle{ n }[/math]. Then [math]\displaystyle{ X_n= Y_1+Y_2+\dots +Y_n }[/math] has the Binomial distribution with parameters [math]\displaystyle{ n }[/math] and [math]\displaystyle{ p }[/math]. (Here [math]\displaystyle{ X_n }[/math] denotes the sum itself, not the standardized variable from the theorem.) A concrete example is rolling a fair die repeatedly, with success being, say, rolling a 1, so that [math]\displaystyle{ p = 1/6 }[/math]. For smaller [math]\displaystyle{ n }[/math] (e.g. [math]\displaystyle{ n= 10 }[/math]) the Binomial histogram is not symmetric. However, for larger [math]\displaystyle{ n }[/math] the histogram of the distribution of [math]\displaystyle{ X_n }[/math] resembles the normal density curve.
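The resemblance can be quantified with a short computation. The sketch below is an illustration added here, not the article's own method: it compares the Binomial probabilities with the normal density that has the same mean [math]\displaystyle{ np }[/math] and standard deviation [math]\displaystyle{ \sqrt{np(1-p)} }[/math]. The helper names `binomial_pmf`, `normal_pdf` and `max_gap` are hypothetical.

```python
import math


def binomial_pmf(n, p, k):
    """P(X_n = k) for X_n ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x, mean, sd):
    """Density of the normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def max_gap(n, p):
    """Largest |Binomial pmf - matching normal density| over k = 0..n."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(n, p, k) - normal_pdf(k, mean, sd))
        for k in range(n + 1)
    )


if __name__ == "__main__":
    # Die-rolling example: success = rolling a 1, so p = 1/6.
    for n in (10, 60, 600):
        print(n, max_gap(n, 1 / 6))
```

For the die example the gap for [math]\displaystyle{ n = 10 }[/math] is visibly larger than for [math]\displaystyle{ n = 600 }[/math], which matches the asymmetry of the small-[math]\displaystyle{ n }[/math] histogram described above.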

Convolution

Recall that the convolution of two functions [math]\displaystyle{ f \text{ and } g }[/math] is defined by [math]\displaystyle{ (f*g)(x) = \int_{-\infty}^\infty f(t) g(x-t)\, dt }[/math] and that the convolution has the following properties:

Commutativity: [math]\displaystyle{ f*g = g*f }[/math]

Associativity: [math]\displaystyle{ (f*g)*h = f*(g*h) }[/math]

Distributivity: [math]\displaystyle{ f*(ag+bh)= a(f*g)+b(f*h) }[/math]

Differentiation: [math]\displaystyle{ (f*g)' = (f')*g=f*(g') }[/math]

We will prove first that if [math]\displaystyle{ Y_1 }[/math] and [math]\displaystyle{ Y_2 }[/math] are independent random variables with densities [math]\displaystyle{ f_1 }[/math] and [math]\displaystyle{ f_2 }[/math] then the density of their sum [math]\displaystyle{ Y_1+Y_2 }[/math] is the convolution [math]\displaystyle{ f_1*f_2 }[/math].

Let [math]\displaystyle{ F, F_1, \text{ and }F_2 }[/math] denote the cumulative distribution functions of [math]\displaystyle{ Y_1+Y_2, Y_1, \text{ and }Y_2, }[/math] respectively. Let [math]\displaystyle{ f }[/math] denote the density of [math]\displaystyle{ Y_1+Y_2 }[/math]. Note that [math]\displaystyle{ f_1(y_1)f_2(y_2) }[/math] is the joint density of [math]\displaystyle{ (Y_1,Y_2). }[/math] For all [math]\displaystyle{ y }[/math] we have:

[math]\displaystyle{ \int_{-\infty} ^y f(t)\, dt = F(y) = P(Y_1+Y_2\le y) = \int_{-\infty}^{\infty} \int_{-\infty}^{y-y_1} f_1(y_1)f_2(y_2) \,dy_2dy_1 = \int_{-\infty}^{\infty} f_1(y_1) \int_{-\infty}^{y-y_1} f_2(y_2) \,dy_2dy_1 = \int_{-\infty}^{\infty} f_1(y_1) F_2(y-y_1) dy_1 . }[/math]

Summarizing, and replacing [math]\displaystyle{ y_1 }[/math] with [math]\displaystyle{ t }[/math], for all [math]\displaystyle{ y }[/math] we get:

[math]\displaystyle{ F(y) = \int_{-\infty}^{\infty} f_1(t) F_2(y-t) dt }[/math]

Taking the derivative with respect to [math]\displaystyle{ y }[/math] (and differentiating under the integral sign) we get:

[math]\displaystyle{ f(y) = \frac{dF(y)}{dy} = \frac{d}{dy} \int_{-\infty}^{\infty} f_1(t) F_2(y-t) dt = \int_{-\infty}^{\infty} f_1(t) \frac{dF_2(y-t)}{dy} dt = \int_{-\infty}^{\infty} f_1(t) f_2(y-t)dt =(f_1*f_2)(y), }[/math]

as required.
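The formula just proved can be checked numerically. The sketch below is an added illustration under stated assumptions: it takes [math]\displaystyle{ Y_1 }[/math] exponential with rate 1 and [math]\displaystyle{ Y_2 }[/math] exponential with rate 2 (a choice made here, not in the article), for which direct integration gives [math]\displaystyle{ (f_1*f_2)(y) = 2(e^{-y}-e^{-2y}) }[/math] for [math]\displaystyle{ y\ge 0 }[/math]. The helper `convolve_on_grid` is hypothetical and uses the trapezoid rule on an evenly spaced grid.

```python
import math


def convolve_on_grid(f_vals, g_vals, h):
    """Trapezoid-rule approximation of (f*g)(k*h) for k = 0, 1, 2, ...

    f_vals[i] and g_vals[i] hold f(i*h) and g(i*h). Both functions are
    assumed to vanish for negative arguments, so the convolution
    integral only runs over the interval [0, y].
    """
    n = min(len(f_vals), len(g_vals))
    result = []
    for k in range(n):
        total = sum(f_vals[i] * g_vals[k - i] for i in range(k + 1))
        # Trapezoid rule: the two endpoint terms get weight 1/2.
        total -= 0.5 * (f_vals[0] * g_vals[k] + f_vals[k] * g_vals[0])
        result.append(h * total)
    return result


h = 0.01
grid = [i * h for i in range(501)]               # grid points in [0, 5]
f1 = [math.exp(-y) for y in grid]                # density of Y_1 (rate 1)
f2 = [2.0 * math.exp(-2.0 * y) for y in grid]    # density of Y_2 (rate 2)
density_of_sum = convolve_on_grid(f1, f2, h)


def exact_density(y):
    """Closed form of (f1*f2)(y) for the two densities assumed above."""
    return 2.0 * (math.exp(-y) - math.exp(-2.0 * y))
```

Comparing `density_of_sum[k]` with `exact_density(grid[k])` shows agreement up to the discretization error of the trapezoid rule.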

Example: sums of exponentially distributed random variables

Let [math]\displaystyle{ Y_1, Y_2, \dots , Y_n,\dots }[/math] be independent random variables having exponential distribution with mean 1. Let [math]\displaystyle{ X_n = Y_1+Y_2+\dots +Y_n }[/math] for all [math]\displaystyle{ n=1, 2,\dots }[/math]. We will find the densities of [math]\displaystyle{ X_2, X_3, X_4, X_5 }[/math] and graph them.

Note that each [math]\displaystyle{ Y_n }[/math] has the density [math]\displaystyle{ f(y) =f_1(y) = e^{-y} }[/math] for [math]\displaystyle{ y \ge 0 }[/math] (and [math]\displaystyle{ 0 }[/math] for [math]\displaystyle{ y \lt 0 }[/math]). Further, the density [math]\displaystyle{ f_n }[/math] of [math]\displaystyle{ X_n }[/math] is

[math]\displaystyle{ f_n (y) = (f*f*\dots *f) (y) }[/math] ([math]\displaystyle{ n }[/math] -fold convolution).

Computing these convolutions (either directly or using software) we get:

[math]\displaystyle{ f_1(y) = \begin{cases} e^{-y}, & y\gt 0 \\ 0, &\mbox{ otherwise} \end{cases} }[/math]

Figure: Exponential density

[math]\displaystyle{ f_2(y) = \begin{cases} y e^{- y}, & y\gt 0 \\ 0, &\mbox{ otherwise} \end{cases} }[/math]

Figure: Density of Y1+Y2

[math]\displaystyle{ f_3(y) = \begin{cases} \frac{1}{2}y^{2} e^{- y}, & y\gt 0 \\ 0, &\mbox{ otherwise} \end{cases} }[/math]

Figure: Density of Y1+Y2+Y3

[math]\displaystyle{ f_4(y) = \begin{cases} \frac{1}{6}y^{3} e^{- y}, & y\gt 0 \\ 0, &\mbox{ otherwise} \end{cases} }[/math]

Figure: Density of Y1+Y2+Y3+Y4

[math]\displaystyle{ f_5(y) = \begin{cases} \frac{1}{24}y^{4} e^{- y}, & y\gt 0 \\ 0, &\mbox{ otherwise} \end{cases} }[/math]

Figure: Density of Y1+Y2+Y3+Y4+Y5

Note: it can be shown that

[math]\displaystyle{ f_n(y) = \begin{cases} \frac{1}{(n-1)!}y^{n-1} e^{- y}, & y\gt 0 \\ 0, &\mbox{ otherwise} \end{cases} }[/math]

(use induction, or the fact that the sum of [math]\displaystyle{ n }[/math] independent identically distributed exponential random variables has a Gamma distribution with shape parameter [math]\displaystyle{ n }[/math]; the latter can be shown using moment-generating functions).
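The induction step amounts to the identity [math]\displaystyle{ \int_0^y \frac{t^{n-1}}{(n-1)!}e^{-t}\, e^{-(y-t)}\,dt = \frac{y^{n}}{n!}e^{-y} }[/math], i.e. convolving the density of [math]\displaystyle{ X_n }[/math] with the exponential density gives the density of [math]\displaystyle{ X_{n+1} }[/math]. The following sketch, added here for illustration, checks this numerically with a midpoint rule; the names `erlang_density` and `convolve_with_exponential` are hypothetical helpers, not part of the article.

```python
import math


def erlang_density(n, y):
    """Density y^(n-1) e^(-y) / (n-1)! of the sum of n independent
    exponential random variables with mean 1 (zero for y <= 0)."""
    if y <= 0:
        return 0.0
    return y ** (n - 1) * math.exp(-y) / math.factorial(n - 1)


def convolve_with_exponential(n, y, steps=2000):
    """Midpoint-rule value of the integral over 0 <= t <= y of
    erlang_density(n, t) * exp(-(y - t)), i.e. (f_n * f)(y)."""
    if y <= 0:
        return 0.0
    h = y / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += erlang_density(n, t) * math.exp(-(y - t))
    return total * h


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        y = 2.0
        print(n, convolve_with_exponential(n, y), erlang_density(n + 1, y))
```

The two printed columns should agree to several decimal places for each [math]\displaystyle{ n }[/math], consistent with the closed form for the density of the sum.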


Example: sums of uniformly distributed random variables

Let [math]\displaystyle{ Y_1, Y_2, \dots , Y_n,\dots }[/math] be independent random variables uniformly distributed on [math]\displaystyle{ [0,1] }[/math] . Let [math]\displaystyle{ X_n = Y_1+Y_2+\dots +Y_n }[/math]. We will find the densities of [math]\displaystyle{ X_2, X_3, X_4 }[/math] and graph them.

Note that each [math]\displaystyle{ Y_n }[/math] has the density [math]\displaystyle{ f(y) = 1 }[/math] for [math]\displaystyle{ 0\le y \le 1 }[/math] (and [math]\displaystyle{ 0 }[/math] outside of [math]\displaystyle{ [0,1] }[/math] ). Further, the density [math]\displaystyle{ f_n }[/math] of [math]\displaystyle{ X_n }[/math] is

[math]\displaystyle{ f_n (y) = (f*f*\dots *f) (y) }[/math] ([math]\displaystyle{ n }[/math] -fold convolution).

Computing these convolutions (either directly or using software) we get:

[math]\displaystyle{ f_1(y) = \begin{cases} 1, & 0\le y \le 1 \\ 0, &\mbox{ otherwise} \end{cases} }[/math]

Figure: Uniform density

[math]\displaystyle{ f_2(y) = \begin{cases} y, & 0\le y \le 1 \\ 2-y, & 1\le y \le 2\\ 0, &\mbox{ otherwise} \end{cases} }[/math]

Figure: Density of Y1+Y2

[math]\displaystyle{ f_3(y) = \begin{cases} \frac{y^{2}}{2}, & 0\le y \le 1 \\ - y^{2} + 3 y - \frac{3}{2}, & 1\le y \le 2 \\ \frac{y^{2}}{2} - 3 y + \frac{9}{2}, & 2\le y \le 3\\ 0, &\mbox{ otherwise} \end{cases} }[/math]

Figure: Density of Y1+Y2+Y3

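[math]\displaystyle{ f_4(y) = \begin{cases} \frac{y^{3}}{6}, & 0\le y \le 1 \\ -\frac{y^{3}}{2} + 2 y^{2} - 2 y + \frac{2}{3}, & 1\le y \le 2 \\ \frac{y^{3}}{2} - 4 y^{2} + 10 y - \frac{22}{3}, & 2\le y \le 3 \\ \frac{(4-y)^{3}}{6}, & 3\le y \le 4 \\ 0, &\mbox{ otherwise} \end{cases} }[/math]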

Figure: Density of Y1+Y2+Y3+Y4
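The piecewise polynomials for sums of uniform random variables can be cross-checked against a general closed form. The sketch below is an added illustration: it uses the Irwin–Hall formula [math]\displaystyle{ f_n(y) = \frac{1}{(n-1)!}\sum_{k=0}^{\lfloor y\rfloor}(-1)^k\binom{n}{k}(y-k)^{n-1} }[/math] for [math]\displaystyle{ 0\le y\le n }[/math], which is a standard result not stated in the article, and compares it with the three-piece expression for the density of [math]\displaystyle{ Y_1+Y_2+Y_3 }[/math]. The function names are hypothetical.

```python
import math


def uniform_sum_density(n, y):
    """Irwin-Hall density of Y_1 + ... + Y_n, Y_i uniform on [0, 1].

    For n = 1 the value at the single point y = 1 comes out as 0;
    this does not matter for a density.
    """
    if y < 0 or y > n:
        return 0.0
    total = sum(
        (-1) ** k * math.comb(n, k) * (y - k) ** (n - 1)
        for k in range(int(math.floor(y)) + 1)
    )
    return total / math.factorial(n - 1)


def f3_piecewise(y):
    """Piecewise-quadratic density of Y_1 + Y_2 + Y_3."""
    if 0 <= y <= 1:
        return y * y / 2
    if 1 < y <= 2:
        return -y * y + 3 * y - 1.5
    if 2 < y <= 3:
        return y * y / 2 - 3 * y + 4.5
    return 0.0


if __name__ == "__main__":
    for y in (0.3, 1.5, 2.7):
        print(y, uniform_sum_density(3, y), f3_piecewise(y))
```

The same function evaluated with `n = 4` gives values of the density of [math]\displaystyle{ Y_1+Y_2+Y_3+Y_4 }[/math] that can be plotted or compared with a direct convolution.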