THE STORY OF THE CENTRAL LIMIT THEOREM
Loh Wei Yin
The central limit theorem (CLT) occupies a place of
honour in the theory of probability, due to its age, its
invaluable contribution to the theory of probability and
its applications. Like all other limit theorems, it
essentially says that all large-scale random phenomena in
their collective action produce strict regularity. The
limit law in the CLT is the well-known Normal distribution
from which are derived many of the techniques in statistics,
particularly the so-called "large sample theory".
Because the CLT is so very basic, it has attracted the
attention of numerous workers. The earliest work on the
subject is perhaps the theorem of Bernoulli (1713) which is
really a special case of the Law of Large Numbers. De
Moivre (1730) and Laplace (1812) later proved the first
version of the CLT. This was generalized by Poisson to
constitute the last of the main achievements before the
time of Chebyshev.
The theorems mentioned above deal with a sequence of
independent events $\xi_1, \xi_2, \xi_3, \ldots$, with their respective
probabilities denoted by $p_n = P(\xi_n)$. The number of actually
occurring events among the first n events $\xi_1, \ldots, \xi_n$ is
denoted by the random variable $Z_n$. The above-mentioned
results can now be stated as follows. (The first two theorems
have $p_n = p$ for all n, and $0 < p < 1$.)
1. Bernoulli's Theorem. For every $\varepsilon > 0$,
$$P\left(\left|\frac{Z_n}{n} - p\right| > \varepsilon\right) \to 0 \quad \text{as } n \to \infty.$$
2. Laplace's Theorem.
$$P\left(z_1 < \frac{Z_n - np}{\sqrt{np(1-p)}} < z_2\right) \to \Phi(z_2) - \Phi(z_1)$$
as $n \to \infty$, uniformly with respect to $z_1$ and $z_2$.
We have used the notation
$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt,$$
which is the standard Normal distribution function.
3. CLT in Poisson's Form.
Let $A_n = p_1 + \cdots + p_n$ and $D_n^2 = p_1(1-p_1) + \cdots + p_n(1-p_n)$.
Then
$$P\left(z_1 < \frac{Z_n - A_n}{D_n} < z_2\right) \to \Phi(z_2) - \Phi(z_1)$$
as $n \to \infty$, uniformly with respect to $z_1$ and $z_2$.
If we introduce the indicator random variable
$$I_{\xi_n} = \begin{cases} 1 & \text{if } \xi_n \text{ occurs,} \\ 0 & \text{if } \xi_n \text{ does not occur,} \end{cases}$$
$Z_n$ can be written as
$$Z_n = I_{\xi_1} + I_{\xi_2} + \cdots + I_{\xi_n}.$$
Thus the above three theorems are in fact special cases of
limit theorems concerning sums of independent random variables.
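The three classical statements can be checked numerically. What follows is a minimal Python sketch and not part of the original essay: the function names are illustrative assumptions, and the exact distribution of the number of occurring events is obtained by plain convolution of the independent indicators, so that Bernoulli's and Laplace's theorems (equal probabilities) and Poisson's form (unequal probabilities) are all covered by the same routine.

```python
import math


def normal_cdf(x):
    """Standard Normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def count_distribution(probs):
    """Exact law of Z_n, the number of occurring events.

    probs[k] is the probability of the k-th independent event;
    the returned list satisfies dist[j] = P(Z_n = j).
    """
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1.0 - p)   # event does not occur
            nxt[j + 1] += mass * p       # event occurs
        dist = nxt
    return dist


def tail_probability(n, p, eps):
    """P(|Z_n/n - p| > eps) when all n events have probability p (Bernoulli)."""
    dist = count_distribution([p] * n)
    return sum(mass for j, mass in enumerate(dist) if abs(j / n - p) > eps)


def standardized_probability(probs, z1, z2):
    """P(z1 < (Z_n - A_n)/D_n < z2), computed exactly (Laplace, Poisson)."""
    a_n = sum(probs)
    d_n = math.sqrt(sum(p * (1.0 - p) for p in probs))
    dist = count_distribution(probs)
    return sum(mass for j, mass in enumerate(dist) if z1 < (j - a_n) / d_n < z2)


if __name__ == "__main__":
    for n in (25, 100, 400):
        print(n, tail_probability(n, 0.3, 0.05))
    target = normal_cdf(1.0) - normal_cdf(-1.0)
    print(standardized_probability([0.5] * 400, -1.0, 1.0), target)
    print(standardized_probability([0.2, 0.8] * 200, -1.0, 1.0), target)
```

The tail probabilities in the first loop shrink as n grows, as Bernoulli's theorem asserts, and the two standardized probabilities agree with the Normal value to within a few hundredths, the gap being due to the discreteness of the count.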
The rigorous proof of the more general CLT for sums
of arbitrarily distributed independent random variables was
made possible by the creation in the second half of the
nineteenth century of powerful methods due to Chebyshev,
whose work signalled the dawn of a new development in the
entire theory of probability.
Chebyshev considered a sequence of independent random
variables $X_1, X_2, \ldots, X_n, \ldots$ with finite means and variances,
denoted respectively by $a_n = EX_n$ and $b_n^2 = E(X_n - a_n)^2$. Let
$S_n = X_1 + \cdots + X_n$, $A_n = a_1 + \cdots + a_n$, and $B_n^2 = b_1^2 + \cdots + b_n^2$.
Chebyshev studied and solved the following problem.
Problem. What additional conditions ensure the
validity of the CLT:
$$P\left(\frac{S_n - A_n}{B_n} < z\right) \to \Phi(z)$$
for every real z as $n \to \infty$?
To solve this problem, Chebyshev created the method
of moments. His proof, in a paper in 1890, was based on a
lemma which was proved only later by Markov (1899). Soon
afterwards, Lyapunov (1900, 1901) solved the same problem
under considerably more general conditions using another
method, although Markov later showed that the method of
moments is also capable of obtaining Lyapunov's theorem.
However, it turned out that Lyapunov's method was simpler
and more powerful in its application to the whole class of
limit theorems concerning sums of independent variables.
This is the method of characteristic functions using
Fourier analytic techniques. It is so powerful that to
date no other method can yield better results for the case
of independent random variables.
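The condition Lyapunov used to solve Chebyshev's
problem was that, for some $\delta > 0$,
$$\lim_{n \to \infty} \frac{C_n}{B_n^{2+\delta}} = 0, \quad \text{where } C_n = \sum_{k=1}^{n} E|X_k - a_k|^{2+\delta}.$$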
An even weaker condition is the famous Lindeberg
condition that for every $\varepsilon > 0$,
$$\lim_{n \to \infty} \frac{1}{B_n^2} \sum_{k=1}^{n} \int_{|x - a_k| > \varepsilon B_n} (x - a_k)^2 \, dF_k(x) = 0,$$
where $F_k$ is the distribution of $X_k$. Subsequently Feller
(1937) showed that the Lindeberg condition is not only
sufficient but also necessary for the limit law to be
normal, provided an appropriate uniform asymptotic
negligibility of the $X_i/B_n$ is assumed.
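To make Lyapunov's condition concrete, here is a small Python sketch (an illustration added here, not taken from the original essay). It assumes the condition in its standard textbook form, namely that the sum of the absolute central moments of order $2+\delta$ divided by $B_n^{2+\delta}$ tends to zero, and it evaluates that ratio for independent indicator variables, for which the moments have a closed form. The function name `lyapunov_ratio` is hypothetical.

```python
def lyapunov_ratio(probs, delta=1.0):
    """Lyapunov ratio for independent indicators I_k with P(I_k = 1) = probs[k].

    Numerator:   sum of E|I_k - p_k|^(2 + delta)
    Denominator: (sum of variances p_k (1 - p_k)) ^ (1 + delta / 2)
    """
    order = 2.0 + delta
    numerator = sum(
        p * (1.0 - p) ** order + (1.0 - p) * p ** order for p in probs
    )
    variance_sum = sum(p * (1.0 - p) for p in probs)
    return numerator / variance_sum ** (1.0 + delta / 2.0)


if __name__ == "__main__":
    # Equal probabilities: the ratio decays like n^(-1/2), so the CLT applies.
    for n in (100, 10_000):
        print(n, lyapunov_ratio([0.5] * n))
    # Rapidly vanishing probabilities: the variances are summable and
    # the ratio does not tend to zero.
    for n in (100, 2_000):
        print(n, lyapunov_ratio([1.0 / (k + 1) ** 2 for k in range(1, n + 1)]))
```

With a common probability the ratio tends to zero, which recovers Laplace's theorem from Lyapunov's. When the sum of the variances stays bounded, as in the second loop, the ratio stays away from zero and the condition fails.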
In practical applications the CLT is used essentially
as an approximate formula for "sufficiently large" values of
n. In order that this use be justified, the formula must
contain an estimate of the error involved. One way of
doing this is to consider the various asymptotic expansions
for the distribution
$$F_n(x) = P\left(\frac{S_n - A_n}{B_n} < x\right).$$
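In his 1890 paper Chebyshev indicated without proof the
following expansion for the difference $F_n(x) - \Phi(x)$, when
the random variables are identically distributed:
$$F_n(x) - \Phi(x) \sim \frac{e^{-x^2/2}}{\sqrt{2\pi}} \left( \frac{Q_1(x)}{n^{1/2}} + \frac{Q_2(x)}{n} + \cdots + \frac{Q_j(x)}{n^{j/2}} + \cdots \right),$$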
where the $Q_i(x)$ are polynomials. The most definitive results
in this direction are due to Cramér. Edgeworth (1905) studied
in detail the expansion in a slightly different form.
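The effect of the leading term of such an expansion can be seen in a case where the exact distribution is available. The Python sketch below is an added illustration, not part of the original essay. It assumes the standard first-order correction $\frac{\gamma}{6\sqrt{n}}(1 - x^2)\,\frac{e^{-x^2/2}}{\sqrt{2\pi}}$, where $\gamma$ is the skewness of a summand, and it uses sums of exponential variables with mean 1 (skewness 2), whose distribution function is a finite Poisson sum. All function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard Normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Standard Normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def exponential_sum_cdf(n, t):
    """P(S_n < t) for S_n the sum of n independent Exp(1) variables.

    Uses P(S_n < t) = 1 - sum_{k < n} e^{-t} t^k / k!.
    """
    if t <= 0.0:
        return 0.0
    term = math.exp(-t)
    total = term
    for k in range(1, n):
        term *= t / k
        total += term
    return 1.0 - total


def exact_cdf(n, x):
    """F_n(x) = P((S_n - n)/sqrt(n) < x); Exp(1) has mean 1 and variance 1."""
    return exponential_sum_cdf(n, n + x * math.sqrt(n))


def one_term_expansion(n, x, skewness=2.0):
    """Phi(x) plus the assumed first-order correction term."""
    correction = skewness * (1.0 - x * x) * normal_pdf(x) / (6.0 * math.sqrt(n))
    return normal_cdf(x) + correction


if __name__ == "__main__":
    for n in (20, 50):
        exact = exact_cdf(n, 0.0)
        print(n, exact, normal_cdf(0.0), one_term_expansion(n, 0.0))
```

At x = 0 the plain Normal value 0.5 misses the exact probability by an amount of order $n^{-1/2}$, while the corrected value is much closer.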
When the random variables are identically distributed
and possess finite third moments, Berry (1941) and Esseen
(1945) independently proved the celebrated result
$$\sup_x |F_n(x) - \Phi(x)| \le \frac{K \beta}{\sqrt{n}\, \sigma^3},$$
where $\beta = E|X_1 - EX_1|^3$, $\sigma^2 = EX_1^2 - (EX_1)^2$ and K is a constant.
Later results have generalized this to the case of
non-identically distributed summands with the best bound
achieved by Esseen (1969) in terms of truncated third
moments.
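The order $n^{-1/2}$ in the Berry-Esseen bound can be observed directly for fair coin-tossing. The following Python sketch is an added illustration with a hypothetical function name. For a fair coin the third absolute central moment and the cube of the standard deviation are both 1/8, so the bound reduces to a constant divided by $\sqrt{n}$, and the code computes the exact supremum distance between the standardized binomial distribution function and the Normal one.

```python
import math


def normal_cdf(x):
    """Standard Normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sup_distance_fair_coin(n):
    """sup over x of |F_n(x) - Phi(x)| for the number of heads in n fair tosses.

    F_n is a step function, so the supremum is attained at a jump point,
    either just before or just after the jump.
    """
    sd = math.sqrt(n) / 2.0
    cdf = 0.0
    worst = 0.0
    for k in range(n + 1):
        phi = normal_cdf((k - n / 2.0) / sd)
        worst = max(worst, abs(cdf - phi))      # just before the jump at k
        cdf += math.comb(n, k) / 2 ** n         # exact binomial mass
        worst = max(worst, abs(cdf - phi))      # just after the jump at k
    return worst


if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, math.sqrt(n) * sup_distance_fair_coin(n))
```

The printed products stay close to 0.4 for these even values of n, so the distance itself decreases like $n^{-1/2}$ and no faster. Any admissible constant K therefore cannot be smaller than roughly that value.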
A natural question generated by Lyapunov's CLT is
whether the condition that the random variables be independent
can be relaxed. It was forty-seven years before
Hoeffding and Robbins (1948) proved a CLT for an m-dependent
sequence of random variables. (A sequence $X_1, X_2, \ldots, X_n, \ldots$
is m-dependent if $(X_1, X_2, \ldots, X_r)$ is independent of
$(X_s, X_{s+1}, \ldots)$ whenever $s - r > m$. In this terminology an independent
sequence is 0-dependent.) Later Diananda (1955) and Orey
(1958) improved on this result by assuming only Lindeberg's
condition and the boundedness of the sum of the individual
variances.
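A simple way to produce an m-dependent sequence is to take moving sums of independent variables. The Python sketch below is an added illustration and does not reproduce any construction from the papers cited. It assumes each variable is the sum of m+1 consecutive terms of an independent uniform(-1, 1) sequence, so that variables more than m apart share no terms, and it checks by simulation with a fixed seed that the normalized sums are close to Normal. The function names are hypothetical.

```python
import math
import random


def normal_cdf(x):
    """Standard Normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def m_dependent_sample(n, m, rng):
    """X_i = e_i + ... + e_(i+m), with e_k independent uniform(-1, 1)."""
    e = [rng.uniform(-1.0, 1.0) for _ in range(n + m)]
    return [sum(e[i:i + m + 1]) for i in range(n)]


def sum_variance(n, m):
    """Exact variance of X_1 + ... + X_n.

    Each e_k has variance 1/3 and appears in the total as many times
    as there are windows covering index k.
    """
    counts = [0] * (n + m)
    for i in range(n):
        for j in range(i, i + m + 1):
            counts[j] += 1
    return sum(c * c for c in counts) / 3.0


def empirical_cdf_at(z, n, m, reps, seed=0):
    """Fraction of simulated normalized sums that fall below z."""
    rng = random.Random(seed)
    scale = math.sqrt(sum_variance(n, m))
    hits = 0
    for _ in range(reps):
        if sum(m_dependent_sample(n, m, rng)) / scale < z:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print(empirical_cdf_at(1.0, n=200, m=2, reps=2000), normal_cdf(1.0))
```

Note that the sum is normalized by its true standard deviation, which differs from the independent case because neighbouring terms are positively correlated.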
Almost at the same time, Rosenblatt (1956) proved a
CLT for a "strong mixing" sequence. This condition requires
only that the dependence between $X_n$ and $X_{n+k}$ diminishes as
k increases. Thus m-dependence is included as a special
case. Rosenblatt's results were subsequently improved by
Philipp (1969a, 1969b) who not only relaxed the former's
conditions but also obtained bounds for the error in the
normal approximation. Soon after, in the Sixth Berkeley
Symposium, Dvoretzky (1972) presented very general results
for dependent random variables. For the particular case
of strong mixing, he went beyond Philipp's (1969a) theorem
by dropping the condition that the variables be uniformly
bounded. In this connection, the author and Chen [22] have added a refinement to one of Dvoretzky's theorems.
Recently too, McLeish (1974) has made improvements on
Dvoretzky's paper.
At the same symposium in which Dvoretzky presented
his results, an equally interesting paper was given by
Stein (1972). This paper is concerned with bounding the
error in the normal approximation for dependent random
variables. Its significance lies not so much in its improvement
of known results, which it did manage handsomely, but
rather in its introduction of a new method vastly different
from the established Fourier techniques. The method, which
makes no use of characteristic functions, essentially
depends upon an identity and a perturbation technique.
The interest created by Stein's paper was almost
immediate. Chen (1972) used it to give an elementary proof
of the CLT for independent random variables while Erickson
(1974) obtained an $L_1$ bound for the error for m-dependent
sums. The latter has since been generalized by the author
and Chen [21] to $\phi$-mixing sequences. Chen [9] has
meanwhile employed a variation of Stein's method to obtain
necessary and sufficient conditions for the dependent
central limit problem where the limit law need not be
normal. In the case that the limit is normal, the author
and Chen [22] have improved on the existing results for
strong mixing sequences. Although Stein's method appears
to be more easily applicable to dependent random variables,
the classical Fourier method is still superior for independent
variables. This is because to date no one has been able to
apply Stein's method to yield the classical Berry-Esseen
theorem.
In yet another direction of generalization, Markov
was among the first to prove a multidimensional CLT, where
the sequence of random variables is now a sequence of
independent random vectors. The limit law then becomes the
multidimensional Gaussian distribution. Apart from the
extra work of dealing with matrices, the proof of the
multidimensional CLT appears to be a simple extension of
the one-dimensional case.
The corresponding problem of bounding the error in the
multi-dimensional CLT is more interesting. Among the first
to look for estimates was Rao (1961). He was closely
followed by a host of others, mainly Russians, like Bikjalis
(1966), von Bahr (1967), Bhattacharya (1968), Sadikova (1968),
Sazanov (1968), Bergström (1969), Paulauskas (1970) and
Rotar (1970). With the exception of the last two, all the
authors mentioned above considered only independent and
identically distributed random vectors. The last two
dropped the assumption of identical distributions. When
third moments exist, an order of $n^{-1/2}$ is obtained, which is
equivalent to the Berry-Esseen rate. However, this is
only possible for the class of convex Borel sets. In fact,
Bikjalis (1966) has shown that for arbitrary Borel sets,
additional conditions have to be assumed.
This is therefore the present situation regarding
developments in the study of the CLT. There are still
many nagging questions left to be answered, particularly
in bounding the error in the normal approximation. By
considering coin-tossing, it is seen that the rate given
in the Berry-Esseen theorem is achieved and hence further
work on this may only be found in reducing the absolute
constant. A more challenging problem is to obtain a proper
generalization to dependent variables. So far, all
estimates, with the exception of that of Stein (1972), do
not reduce to the Berry-Esseen rate. Stein (1972) obtained
the correct order of $n^{-1/2}$ for a sequence of stationary
m-dependent random variables with eighth moments. The
others manage at best an order of $n^{-1/4}$ (see e.g. Philipp
(1969b), Erickson (1974), Loh and Chen [21]) for more
general types of dependence. Another problem awaiting
future research is to get bounds for the corresponding
multidimensional case for dependent random vectors. There
does not appear to have been any work on this problem yet.
Work on the CLT has generated much interest in
related problems like the Poisson approximation and the
Central Limit Problem. With the latter are associated some
of the great pioneers in probability like Lévy, Khintchine
and more recently, Kolmogorov. To retrace their work would
require another essay as long as the present.
It is perhaps justified to add that no other topic in
the theory of probability has attracted so much attention
for so long as the CLT. For two hundred and fifty years
since its birth, the CLT has held man's fascination and will
continue to do so for many years to come.
References
[1] von Bahr, B. (1967). On the central limit theorem in
$R^k$. Ark. Mat. 7, 5, 61-69.
[2] Bergström, H. (1969). On the central limit theorem
in $R^k$. The remainder term for special Borel sets.
Z. Wahrscheinlichkeitstheorie 14, 113-126.
[3] Bernoulli, J. (1713). Ars Conjectandi. Basle.
[4] Berry, A.C. (1941). The accuracy of the Gaussian
approximation to the sum of independent variates.
Trans. Amer. Math. Soc. 49, 122-136.
[5] Bhattacharya, R.N. (1968). Berry-Esseen bounds for
the multi-dimensional central limit theorem. Bull.
Amer. Math. Soc. 74, 285-287.
[6] Bikjalis, A. (1966). The remainder terms in
multi-dimensional limit theorems. Soviet Math. Dokl. 7,
705-707.
[7] Chebyshev, P.L. (1890). Sur deux théorèmes relatifs
aux probabilités. Acta Math. 14, 305-315.
[8] Chen, Louis H.Y. (1970). An elementary proof of the
central limit theorem. Bull. Singapore Math. Soc.
1-12.
[9] Chen, Louis H.Y. A new approach to dependent central
limit problems. To appear.
[10] Diananda, P.H. (1955). The central limit theorem for
m-dependent variables. Proc. Camb. Phil. Soc. 51, 92-95.
[11] Dvoretzky, A. (1972). Asymptotic normality for sums
of dependent random variables. Proc. Sixth Berkeley
Symp. Math. Statist. Prob. 2, 513-535.
[12] Edgeworth, F.Y. (1905). The law of error. Proc. Camb.
Phil. Soc. 20, 36-65.
[13] Erickson, R.V. (1974). $L_1$ bounds for asymptotic
normality of m-dependent sums using Stein's technique.
Ann. Probability 2, 522-529.
[14] Esseen, C.G. (1945). A mathematical study of the
Laplace-Gaussian law. Acta Math. 77, 1-125.
[15] Esseen, C.G. (1969). On the remainder term in the
central limit theorem. Ark. Mat. 8, 7-15.
[16] Feller, W. (1937). Ueber den zentralen Grenzwertsatz
der Wahrscheinlichkeitsrechnung. Math. Z. 42, 301-312.
[17] Gnedenko, B.V. and Kolmogorov, A.N. (1954). Limit distributions
for sums of independent random variables.
Addison-Wesley.
[18] Hoeffding, W. and Robbins, H. (1948). The central
limit theorem for dependent random variables. Duke
Math. J. 15, 773-780.
[19] Laplace, P. (1812). Théorie analytique des probabilités. Paris.
[20] Lindeberg, J.W. (1922). Eine neue Herleitung des
Exponentialgesetzes in der Wahrscheinlichkeitsrechnung.
Math. Z. 15, 211-225.
[21] Loh, Wei-Yin and Chen, Louis H.Y. The $L_1$ normal
approximation for $\phi$-mixing random variables. To
appear.
[22] Loh, Wei-Yin and Chen, Louis H.Y. On the asymptotic
normality for sums of strong mixing random variables.
To appear.
[23] Lyapunov, A.M. (1900). Sur une proposition de la
théorie des probabilités. Bull. de l'Acad. Imp. des
Sci. de St. Pétersbourg 13, 359-386.
[24] Lyapunov, A.M. (1901). Nouvelle forme du théorème sur
la limite de probabilités. Mém. Acad. Sc. St.
Pétersbourg 12, 1-24.
[25] Markov, A.A. (1899). The law of large numbers and the
method of least squares. Izv. Fiz.-Mat. Obshchestva
Kazan. Univ. 8, 110-128.
[26] McLeish, D.L. (1974). Dependent central limit theorems
and invariance principles. Ann. Probability 2, 620-628.
[27] de Moivre, A. (1730). Miscellanea Analytica
Supplementum. London.
[28] Orey, S.A. (1958). Central limit theorem for
m-dependent random variables. Duke Math. J. 25, 543-546.
[29] Paulauskas, V. (1970). The multi-dimensional central
limit theorem. Litovsk. Mat. Sb. 10, 783-789.
[30] Philipp, W. (1969a). The central limit problem for
mixing sequences of random variables.
Z. Wahrscheinlichkeitstheorie 12, 155-171.
[31] Philipp, W. (1969b). The remainder in the central
limit theorem for mixing stochastic processes. Ann.
Math. Statist. 40, 601-609.
[32] Rao, R. Ranga (1961). On the central limit theorem
in $R^k$. Bull. Amer. Math. Soc. 67, 359-361.
[33] Rosenblatt, M. (1956). A central limit theorem and
a mixing condition. Proc. Nat. Acad. Sci. USA 42,
412-413.
[34] Rotar, V.I. (1970). The rate of convergence in the
multi-dimensional central limit theorem. Theor. Prob.
Appl. 15, 354-356.
[35] Sadikova, S.M. (1968). On the multi-dimensional
central limit theorem. Theor. Prob. Appl. 13, 164-170.
[36] Sazanov, V.V. (1968). On the multi-dimensional central
limit theorem. Sankhya, Ser. A, 30, pt. 2.
[37] Stein, C. (1972). A bound for the error in the normal
approximation to the distribution of a sum of
dependent random variables. Proc. Sixth Berkeley Symp.
Math. Statist. Prob. 2, 583-602.