\documentclass[12pt]{article}
\usepackage{fullpage}
\begin{document}

\begin{center}\bf Example of E--M: Genetic Linkage Problem \end{center}

Rao (1973) considers 197 animals randomly divided into four categories as follows:
\[
Y = (y_1, y_2, y_3, y_4)^T = (125, 18, 20, 34)^T
\]
with cell probabilities
\[
( 1/2 + \theta/4,\ (1-\theta)/4,\ (1-\theta)/4,\ \theta/4 )^T.
\]
The observed-data likelihood is
\[
g(Y|\theta) \propto (2 + \theta)^{y_1}(1-\theta)^{y_2+y_3} \theta^{y_4}.
\]
If we ``augment'' the data by splitting the first category into two,
\begin{eqnarray*}
X & = & (x_1, x_2, x_3, x_4, x_5)^T, \qquad x_1 + x_2 = y_1, \\
\mbox{with probabilities} & & (1/2,\ \theta/4,\ (1-\theta)/4,\ (1-\theta)/4,\ \theta/4 )^T,
\end{eqnarray*}
the complete-data likelihood is now simpler:
\[
f(X|\theta) \propto \theta^{x_2+x_5} (1-\theta)^{x_3+x_4}.
\]
The ``missing data'' $Z = x_2$ is like a coin toss deciding which of the two
subcategories each of the $y_1$ observations in the first category falls into:
\[
(Z|Y,\theta) \sim \mbox{Bin}\!\left(y_1,\ \frac{\theta}{2+\theta}\right).
\]
Note that $(x_3, x_4, x_5) = (y_2, y_3, y_4)$ are fully observed, so only
$x_2$ requires an expectation. The E-step computes the $Q$-function:
\begin{eqnarray*}
Q(\theta,\theta^k) & = & E[(x_2+x_5)\log(\theta) + (x_3 + x_4)\log(1-\theta) ~|~ Y,\theta^k] \\
& = & \left(\frac{y_1\theta^k}{2+\theta^k} + y_4\right)\log(\theta) + (y_2+y_3)\log(1-\theta).
\end{eqnarray*}
The M-step maximizes $Q(\theta,\theta^k)$ in $\theta$; setting
$\partial Q/\partial\theta = 0$ and writing
$E[x_2 ~|~ Y,\theta^k] = y_1\theta^k/(2+\theta^k)$ gives the update
\[
\theta^{k+1} = \frac{y_1\theta^k/(2+\theta^k) + y_4}
                    {y_1\theta^k/(2+\theta^k) + y_2 + y_3 + y_4}.
\]
Starting from, e.g., $\theta^0 = 1/2$, the iterates converge to the MLE
$\hat\theta \approx 0.6268$ (Dempster, Laird, and Rubin, 1977).

\end{document}