688
Stephen E. Fienberg, Matthew S. Johnson, and Brian W. Junker
One of the major objections to the standard multiple-recapture
approach to population estimation is the assumption of homogeneity of
individual ``capture'' probabilities. Modeling individual capture
heterogeneity is complicated by the fact that it shows up as a
restricted form of interaction between lists in the contingency table
cross-classifying list memberships for all individuals. Traditional
log-linear modeling approaches to capture-recapture problems are
well suited to modeling interactions among lists, but ignore the
special dependence structure that individual heterogeneity induces. A
random-effects approach, based on the Rasch (1960) model from
educational testing and introduced in this context by Darroch et
al. (1993) and Agresti (1994), provides one way to introduce the
dependence resulting from heterogeneity into the log-linear model;
however, previous efforts to combine the Rasch-like heterogeneity
terms additively with the usual log-linear interaction terms suggest
that a more flexible approach is required. In this paper we consider
both classical multi-level approaches and fully Bayesian hierarchical
approaches to modeling individual heterogeneity and list
interactions. Our framework encompasses both the traditional
log-linear approach and various elements from the full Rasch model. We
compare these approaches on two examples, the first arising out of an
epidemiological study of a population of diabetics in Italy, and the
second a study intended to assess the ``size'' of the World Wide
Web. We also explore extensions allowing for interactions between the
Rasch and log-linear portions of the models in both the classical and
Bayesian contexts.
Keywords: Log-linear models; Markov chain Monte Carlo methods; Multiple-recapture census; Quasi-symmetry; Rasch model.
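
The claim that individual heterogeneity shows up as interaction between lists can be illustrated with a small numerical sketch. This is not the estimation method of the paper; it only assumes a Rasch-type capture probability, logit P(individual i is on list j) = theta_i + beta_j, with lists independent given theta_i, and a hypothetical discrete distribution for theta_i chosen purely for illustration. The function names, the sign convention for beta_j, and the numerical values are all assumptions.

```python
import math


def sigmoid(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def two_list_cells(thetas, weights, betas):
    """Marginal 2x2 cell probabilities for two lists.

    Assumes a Rasch-type model: logit P(on list j | theta) = theta + betas[j],
    with the two lists independent given theta. `thetas` and `weights`
    describe a hypothetical discrete distribution of individual effects.
    """
    cells = {(a, b): 0.0 for a in (0, 1) for b in (0, 1)}
    for theta, w in zip(thetas, weights):
        p1 = sigmoid(theta + betas[0])
        p2 = sigmoid(theta + betas[1])
        for a in (0, 1):
            for b in (0, 1):
                pa = p1 if a else 1.0 - p1
                pb = p2 if b else 1.0 - p2
                cells[(a, b)] += w * pa * pb
    return cells


def odds_ratio(cells):
    """Cross-product ratio of the 2x2 table; 1 means no list interaction."""
    return cells[(1, 1)] * cells[(0, 0)] / (cells[(1, 0)] * cells[(0, 1)])


def petersen_ratio(cells):
    """Large-sample ratio of the two-list (Lincoln-Petersen) estimate to N."""
    p_list1 = cells[(1, 1)] + cells[(1, 0)]
    p_list2 = cells[(1, 1)] + cells[(0, 1)]
    return p_list1 * p_list2 / cells[(1, 1)]


# Homogeneous population: every individual has the same theta.
homogeneous = two_list_cells([0.0], [1.0], (0.5, -0.5))
# Heterogeneous population: two equally likely values of theta (illustrative).
heterogeneous = two_list_cells([-1.5, 1.5], [0.5, 0.5], (0.5, -0.5))

print("odds ratio, homogeneous:  ", odds_ratio(homogeneous))
print("odds ratio, heterogeneous:", odds_ratio(heterogeneous))
print("Petersen ratio, heterogeneous:", petersen_ratio(heterogeneous))
```

Under these assumptions the homogeneous table has a cross-product ratio of 1 (no interaction), while the heterogeneous table has a ratio above 1 even though the lists are independent for each individual, and the independence-based two-list estimate falls below the true population size.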