Loading [MathJax]/jax/output/CommonHTML/jax.js

2012年9月15日星期六

Normal approximation of multinomial distribution

How to simulate relative frequency outcomes of a multinomial experiment using normally distributed random numbers? The answer is surprisingly simple, but for some curious reason, it is seldom mentioned on the internet.

Let X=(X1,,Xm)i.i.d.Multinomial(n,p), where p=(p1,,pm) is a probability vector whose entries sum to 1. In other words, we are talking about n independent trials of a multinomial experiement, in which the probability of getting outcome i{1,2,,m} in each trial is pi, and Xi is the frequency count for outcome i after all n trials are completed. We have the following:

Theorem. Let u=(p1,,pm) and Q be a real orthogonal matrix whose last column is u. Suppose Z1,,Zm1 are m1 i.i.d. standard normal random variables. Then
X=(Xinpinpi)1imdQ[Z1Zm10]as n.

Proof. See sec. 11.1 of Hans-Otto Georgii (2007), Stochastics: Introduction to Probability and Statistics, Walter de Gruyter, Berlin.

*****************************
Note that the denominator of the i-th entry of X in the above theorem is npi, not the usual npi(1pi) we see in the normal approximation formula to binomial distribution. We will immediately see why the binomial case is just a special case of the above general formula. Let m=2 and write p=(p,q). Take Q as (qppq). So, the above theorem says that
[X1npnpX2nqnq]d(qppq)[z0]=[qZpZ].Hence, by Continuous Mapping Theorem, we can divide both sides entrywise by (q,p) and get
[X1npnpqnqX2npq]d[ZZ].
Since X1+X2=n, we have nqX2=X1np. Hence the above convergence reduces to the familiar normal approximation formula to binomial distribution.

The theorem requires the use of a real orthogonal matrix Q whose last column is u. How to construct such a matrix? See my previous blog entry.

By the theorem, the covariance matrix of the approximating distribution is given by Q diag(1,,1,0) Q=Iuu. This matrix is, of course, degenerate because the Xi's are not independent of each other (as mi=1Xi=n).

沒有留言: