Classical Multi-level and Bayesian Approaches to Population Size Estimation Using Multiple Lists

Stephen E. Fienberg, Matthew S. Johnson, and Brian W. Junker

Abstract:

One of the major objections to the standard multiple-recapture approach to population estimation is the assumption of homogeneity of individual "capture" probabilities. Modeling individual capture heterogeneity is complicated by the fact that it shows up as a restricted form of interaction between lists in the contingency table cross-classifying list memberships for all individuals. Traditional log-linear modeling approaches to capture-recapture problems are well-suited to modeling interactions among lists, but ignore the special dependence structure that individual heterogeneity induces. A random-effects approach, based on the Rasch (1960) model from educational testing and introduced in this context by Darroch et al. (1993) and Agresti (1994), provides one way to introduce the dependence resulting from heterogeneity into the log-linear model; however, previous efforts to combine the Rasch-like heterogeneity terms additively with the usual log-linear interaction terms suggest that a more flexible approach is required. In this paper we consider both classical multi-level approaches and fully Bayesian hierarchical approaches to modeling individual heterogeneity and list interactions. Our framework encompasses both the traditional log-linear approach and various elements from the full Rasch model. We compare these approaches on two examples, the first arising from an epidemiological study of a population of diabetics in Italy, and the second from a study intended to assess the "size" of the World Wide Web. We also explore extensions allowing for interactions between the Rasch and log-linear portions of the models in both the classical and Bayesian contexts.

Keywords: Log-linear models; Markov chain Monte Carlo methods; Multiple-recapture census; Quasi-symmetry; Rasch model.
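
The abstract's point that individual heterogeneity shows up as dependence between lists can be illustrated with a small simulation. The sketch below is not taken from the report: it assumes a Rasch-type capture probability expit(theta_i + beta_j) with normally distributed individual effects theta_i, uses only two lists, and applies the simple two-list (Lincoln-Petersen) estimator, which assumes independent lists, as a stand-in for a homogeneity-based analysis. The function names, the population size, and the values of sigma and beta are all illustrative assumptions.

```python
import math
import random


def simulate_two_lists(n_pop, sigma, beta=(-1.0, -1.0), seed=0):
    """Simulate membership on two lists under a Rasch-type capture model.

    Individual i appears on list j with probability expit(theta_i + beta_j),
    where theta_i ~ Normal(0, sigma^2). Setting sigma = 0 gives homogeneous
    capture probabilities. Returns the observable counts (n11, n10, n01);
    individuals on neither list are not counted, as in a real study.
    """
    rng = random.Random(seed)
    n11 = n10 = n01 = 0
    for _ in range(n_pop):
        theta = rng.gauss(0.0, sigma)  # individual "catchability" effect
        on_a = rng.random() < 1.0 / (1.0 + math.exp(-(theta + beta[0])))
        on_b = rng.random() < 1.0 / (1.0 + math.exp(-(theta + beta[1])))
        if on_a and on_b:
            n11 += 1
        elif on_a:
            n10 += 1
        elif on_b:
            n01 += 1
    return n11, n10, n01


def lincoln_petersen(n11, n10, n01):
    """Two-list population estimate that assumes the lists are independent."""
    return (n11 + n10) * (n11 + n01) / n11


if __name__ == "__main__":
    n_pop = 20000  # hypothetical true population size
    for sigma in (0.0, 2.0):
        counts = simulate_two_lists(n_pop, sigma, seed=1)
        print(f"sigma={sigma}: counts={counts}, "
              f"estimate={lincoln_petersen(*counts):.0f}")
```

With sigma = 0 the estimate should land close to the true size, while with a large sigma the same individuals tend to appear on both lists, the lists become positively dependent, and the independence-based estimate falls well short of the true size. This is the kind of dependence that the Rasch-like terms discussed in the abstract are meant to capture.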
