2012年9月15日星期六

Normal approximation of multinomial distribution

How to simulate relative frequency outcomes of a multinomial experiment using normally distributed random numbers? The answer is surprisingly simple, but for some curious reason, the answer is seldom mentioned on the internet.

Let $\mathbf{X} = (X_1, \ldots, X_m) \stackrel{\textrm{i.i.d.}}{\sim} \textrm{Multinomial}(n, \mathbf{p})$, where $\mathbf{p}=(p_1,\ldots,p_m)$ is a probability vector whose entries sum to 1. In other words, we are talking about $n$ independent trials of a multinomial experiement, in which the probability of getting outcome $i\in\{1, 2, \ldots, m\}$ in each trial is $p_i$, and $X_i$ is the frequency count for outcome $i$ after all $n$ trials are completed. We have the following:

Theorem. Let $u = (\sqrt{p_1},\ldots,\sqrt{p_m})$ and $Q$ be a real orthogonal matrix whose last column is $u$. Suppose $Z_1, \ldots, Z_{m-1}$ be i.i.d. standard normal random variables. Then
$\mathbf{X}^\ast = \left( \frac{X_i-np_i}{\sqrt{np_i}} \right)_{1\le i\le m} \stackrel{d}{\longrightarrow}\quad Q \begin{bmatrix} Z_1\\ \vdots\\ Z_{m-1}\\ 0\end{bmatrix}$as $n\rightarrow\infty$.

Proof. See sec. 11.1 of Hans-Otto Georgii (2007), Stochastics: Introduction to Probability and Statistics, Walter de Gruyter, Berlin.

*****************************
Note that the denominator of the $i$-th entry of $\mathbf{X}^\ast$ in the above theorem is $\sqrt{np_i}$, not the usual $\sqrt{np_i(1-p_i)}$ we see in the normal approximation formula to binomial distribution. We will immediately see that the binomial case is just a special case of the above general formula. Let $m=2$ and write $\mathbf{p}=(p,q)^\top$. Take $Q$ as $\begin{pmatrix}\sqrt{q} & \sqrt{p}\\ -\sqrt{p} & \sqrt{q}\end{pmatrix}$. So, the above theorem says that
$\begin{bmatrix} \frac{X_1-np}{\sqrt{np}}\\ \frac{X_2-nq}{\sqrt{nq}}\end{bmatrix}\stackrel{d}{\longrightarrow} \begin{pmatrix}\sqrt{q} & \sqrt{p}\\ -\sqrt{p} & \sqrt{q}\end{pmatrix} \begin{bmatrix} z\\ 0\end{bmatrix}= \begin{bmatrix} \sqrt{q}\,Z\\ -\sqrt{p}\,Z\end{bmatrix}.$Hence, by Continuous Mapping Theorem, we can divide both sides entrywise by $(\sqrt{q}, -\sqrt{p})^\top$ and get
$\begin{bmatrix} \frac{X_1-np}{\sqrt{npq}}\\ \frac{nq-X_2}{\sqrt{npq}}\end{bmatrix} \stackrel{d}{\longrightarrow} \begin{bmatrix} Z\\ Z\end{bmatrix}.$
Since $X_1+X_2= n$, we have $nq-X_2=X_1-np$. Hence the above convergence reduces to the familiar normal approximation formula to binomial distribution.

The theorem requires the use of a real orthogonal matrix $Q$ whose last column is $u$. How to construct such a matrix? See my previous blog entry.

By the theorem, the covariance matrix of the approximating distribution is given by $Q\ \mathrm{diag}(1,\ldots,1,0)\ Q^\top = I-uu^\top$. This matrix is, of course, degenerate because the $X_i$'s are not independent of each other (as $\sum_{i=1}^m X_i=n$).