\documentclass{article}
\usepackage[urw-garamond]{mathdesign}
\usepackage{amsmath, amsfonts}
\usepackage{natbib}
\usepackage{hyperref}
\begin{document}
\newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]}
\newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]}
\newcommand{\Binom}{\mathrm{Binom}}
<<>>=
# General set-up options
library(knitr)
opts_chunk$set(size="small", background="white", cache=TRUE, autodep=TRUE,
               tidy=FALSE, warning=FALSE, message=FALSE, echo=TRUE)
@
\title{Lecture 21: Compartment Models}
\author{36-467/36-667}
\date{15 November 2018}
\maketitle

\begin{abstract}
  ``Compartment models'', which track the transitions of large numbers of individuals across a series of qualitatively-distinct states (the ``compartments'') are common in fields which study large populations, such as epidemiology and sociology. These notes cover such models as a particular kind of Markov chain, including some examples from epidemiology, evolutionary economics, and demography. I made some mistakes in lecture; these notes correct them.
\end{abstract}

Epidemiology, ecology, sociology, demography, chemical kinetics, and some other fields are sometimes grouped together as ``population sciences'', which study the behavior and evolution of large populations of individuals. All of these fields make extensive use of what are called (among other things) ``compartment models'', where the members of the population are divided into a discrete set of qualitatively-distinct states (or ``compartments''), and transition between them at rates which vary as functions of the current distribution over compartments\footnote{Whether the ubiquity of these models reflects some underlying commonality across these fields, or just a tendency for scientists with similar mathematical training to borrow and/or re-invent similar models, is beyond the scope of this note.}.
They can be used whenever a population is divided into discrete, qualitatively-distinct types or classes, and transitions between classes are stochastic.

\section{Compartment Models in General}

\subsection{Basic Definitions and Assumptions}

Let's begin with the case of a ``closed'' population of constant size $n$. (\S \ref{sec:demography} will consider the modifications needed to make the population size vary over time.) This population is divided into $r \geq 2$ qualitatively distinct classes, types or compartments\footnote{These words are all synonyms in the present context.}, and we are primarily interested in how many individuals are located in each compartment. We thus define the state of the population at time $t$, $X(t)$, as a vector,
$$X(t) \equiv [X_1(t), X_2(t), \ldots X_r(t)]$$
where $X_i(t)$ counts how many individuals are in compartment $i$ at time $t$.

The state $X(t)$ changes because individuals can move from one compartment to another. We assume that these moves or transitions follow a stochastic process, and we make two important assumptions about that process.
\begin{enumerate}
\item The $i$-to-$j$ {\bf transition rate} $p_{ij}$ is the probability that an individual in compartment $i$ will move to compartment $j$ at the next time step. ($p_{ii}$ is thus the probability of staying in compartment $i$.) We assume that the transition rates are functions of the {\em current} population state, and not the past (or the future), so $p_{ij}(X)$ is a well-defined function. Note that this assumption allows some $p_{ij}$ to be zero, either always or for certain states of the population. (These are sometimes called {\bf forbidden} or {\bf suppressed} transitions.) It also allows some $p_{ij}$ to be constant, regardless of $X$.
\item All individuals in the population make {\em independent} transitions, conditional on $X(t)$. The moves of all the individuals in a given compartment are thus {\em conditionally} independent and identically distributed.
\end{enumerate}
These assumptions are {\em not} implied by the mere fact that we've divided the population into compartments and are tracking the number of individuals in each compartment. They could be false, or true only for a different choice of compartments, etc.

You may feel some scruples about applying these assumptions to people, at least to social processes which reflect human actions (as opposed to, say, models of disease spread). But modeling aggregated social processes as stochastic in this way does not commit us to saying that people act randomly, or don't have motives for their actions, or that all individuals in the same compartment are identical. What it does commit us to is the claim that the differences between people in the same compartment which determine their motives and their actions are so various, so complicated, and so independent of the circumstances which landed them in that compartment that they can't be predicted, and can be effectively treated as random\footnote{There are precise technical senses in which any sufficiently complicated function is indistinguishable from randomness. For a brilliant and accessible introduction to these ideas, and how they underlie probability modeling, I strongly recommend \citet{Ruelle-chance}. The standard textbook on this subject is \citet{Li-and-Vitanyi-1997}. \citet[ch.\ 3]{Salmon-1984} applies these ideas to the problem of defining ``the'' probability of an outcome for members of some class.}.

\subsection{Consequences of the Assumptions}

These assumptions are enough to tell us that the sequence of random variables $X(t)$ is a Markov chain. If it's a Markov process, it has to be a Markov {\em chain}, because the state space is discrete, though large (Exercise \ref{exercise:size-of-state-space}), so what we really need to do is show that the Markov property holds. The easiest way to get there is through defining some additional variables.
\subsubsection{Flow or Flux Variables}

For each pair of compartments $i$ and $j$, define $Y_{ij}(t+1)$ as the number of individuals who moved from compartment $i$ to compartment $j$ at time $t+1$. This is sometimes called the {\bf flow} or {\bf flux} from $i$ to $j$. $Y_{ij}(t+1)$ is a random quantity. By the assumptions above, in fact, it has a binomial distribution:
$$Y_{ij}(t+1) \sim \Binom(X_i(t), p_{ij}(X(t)))$$
This implies that
\begin{eqnarray}
\Expect{Y_{ij}(t+1)|X(t)} & = & X_i(t) p_{ij}(X(t))\\
\Var{Y_{ij}(t+1)|X(t)} & = & X_i(t) p_{ij}(X(t))(1-p_{ij}(X(t)))
\end{eqnarray}
There are two implications of these worth noting.
\begin{enumerate}
\item Both the expected flux and the variance in the flux are proportional to $X_i(t)$. This means that the standard deviation in the flux is only proportional to $\sqrt{X_i}$. Thus, if $X_i(t)$ is large, the {\em relative} deviations in the flux are shrinking. This suggests that there should be nearly-deterministic behavior in large populations.
\item The assumption that each individual in compartment $i$ has the {\em same} probability of moving to compartment $j$ is crucial to calculating the variance. If there were randomly-distributed transition rates for all individuals in the compartment, we'd see a higher variance for the flux (Exercise \ref{exercise:heterogeneous-transition-rates}).
\end{enumerate}

If $r > 2$, then the set of fluxes out of $i$, $[Y_{i1}(t+1), Y_{i2}(t+1), \ldots Y_{ir}(t+1)]$, has as its joint distribution a multinomial, with the number of ``trials'' being $X_i(t)$, and the probabilities being $(p_{i1}, p_{i2}, \ldots p_{ir})$. (Remember that the marginal distributions of a multinomial distribution are binomial.)

\subsubsection{Stock-Flow Consistency}

The variables $X_i(t)$ count the total number of individuals in each compartment, regardless of how or when they arrived there. Such variables are sometimes called {\bf stock} or {\bf level} variables.
The $Y_{ij}(t)$ variables, on the other hand, are (as already said) {\bf flow} or {\bf flux} variables. A basic principle is that the new level of the stock is the old level, plus flows in, minus flows out:
\begin{eqnarray}
\label{eqn:stock-flow}
X_i(t+1) & = & X_i(t) + \sum_{j\neq i}{Y_{ji}(t+1) - Y_{ij}(t+1)}\\
\text{(new level)} & = & \text{(old level)} + \sum{\text{(in flow)} - \text{(out flow)}}
\end{eqnarray}
This is sometimes called {\bf stock-flow consistency}.

\subsubsection{Markov Property}

Because Eq.\ \ref{eqn:stock-flow} says how the level in each compartment changes, it tells us how the state vector $X(t)$ changes as well. Notice that, from our assumptions, $Y_{ij}(t+1)$ is independent of the past given $X(t)$. It is independent of earlier values of $X$, and also independent of earlier values of the $Y$'s. Since Eq.\ \ref{eqn:stock-flow} tells us that $X(t+1)$ is a deterministic function of $X(t)$ and $Y(t+1)$, it follows that $X(t+1)$ is independent of earlier $X$'s (and $Y$'s) given $X(t)$. Hence the state of the population follows a Markov chain.

Because the state space, while large, is finite, the Markov chain will eventually enter some closed, irreducible set of states, and stay there forever. In the long run, the distribution of $X(t)$ will follow the invariant distribution for that irreducible set. These are properties of any finite Markov chain, and so apply to this one, even though it has special structure.

\section{Epidemic Models}

In epidemiology, compartment models are often known as ``epidemic models'', and in that field they descend from pioneering work in mathematical epidemiology done in the early 20th century\footnote{It is conventional these days to attribute this to the work of Sir Ronald Ross on malaria, but early reviews (e.g., \citealt[ch.\ VIII, pp.\ 79--82]{Lotka}) make it clear that this was much more of a collective effort.}, but they are applied much more broadly than just to epidemics.
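Before looking at specific examples, it's worth noting that the general scheme of the previous section translates directly into a simulation recipe: draw each compartment's out-fluxes from a multinomial, then apply stock-flow consistency (Eq.\ \ref{eqn:stock-flow}). A minimal sketch in R (the function name and interface here are mine, purely for illustration):

```r
# One step of a generic compartment model.
# X: vector of compartment counts.
# trans.rates: function taking the state vector X and returning an r x r
#   matrix of transition probabilities p_{ij}, with rows summing to 1
#   (the staying-put probability p_{ii} is on the diagonal).
step.compartments <- function(X, trans.rates) {
  r <- length(X)
  P <- trans.rates(X)
  # Conditional on X, the fluxes out of compartment i are multinomial
  Y <- matrix(0, nrow=r, ncol=r)
  for (i in 1:r) {
    Y[i, ] <- rmultinom(n=1, size=X[i], prob=P[i, ])
  }
  # Stock-flow consistency: new level = old level + in-flows - out-flows
  # (the Y[i,i] terms appear in both sums, so they cancel)
  X + colSums(Y) - rowSums(Y)
}
```

Because every individual ends up in exactly one compartment, the total \texttt{sum(X)} is automatically conserved from step to step; models with birth, death, or migration (\S \ref{sec:demography}) need extra terms.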
It's worth thinking through some of the most basic cases, to get a feel for how they work.

\subsection{SI}
\label{sec:SI}

Let's begin with a model with just two compartments, dividing the entire population into those who are infected with a certain disease --- call this compartment $I$ --- and those who, not having caught the disease yet, are susceptible, in compartment $S$. The state of the population is $X(t) = (X_I(t), X_S(t))$, but since $X_S(t) = n - X_I(t)$, it is enough to keep track of $X_I(t)$.

With two compartments, we need to specify two rates, $p_{SI}$ and $p_{IS}$. The former is the rate at which susceptible individuals become infected; the latter is the rate at which infected individuals are cured and become susceptible again. A simple assumption is that $p_{SI}(X_S, X_I) = \alpha X_I$, because the more members of the population are infected, the more likely a susceptible individual is to encounter one and become infected. We might, however, simply set $p_{IS} = 0$, indicating that no one recovers from the disease, but rather the infected remain infected, and contagious, forever.

There are a couple of ways to justify this latter assumption.
\begin{enumerate}
\item We're in a zombie movie. (This {\em is} Pittsburgh, after all.)
\item We're only interested in what happens on a comparatively short time scale. More precisely, whatever processes make infected individuals non-infectious (recovery, death, quarantine, etc.) happen much more slowly than the process of contagion itself, and we're only trying to approximate the beginning of the epidemic, before the former processes can kick in.
\item The infected really do remain around, and contagious, forever (\S \ref{sec:diffusion-of-innovations}).
\end{enumerate}

Leaving aside how to justify the assumption that $p_{IS} = 0$, what are the consequences of these assumptions about the transition rates? An obvious consequence is that if $X_I(t) = n$ or $X_I(t) = 0$, then $X_I$ will remain $=n$ or $=0$ forever.
That is, these are {\bf absorbing} states of the population. You can convince yourself that these are the only absorbing states. You can also convince yourself that an absorbing state forms a closed, irreducible set of states, and that these are the only closed, irreducible sets for this chain (Exercise \ref{exercise:absorbing-states}). There are thus two possible invariant distributions, one which gives probability 1 to $X_I=0$ and the other which gives probability 1 to $X_I = n$. The disease thus either dies out, or takes over the whole population. Finally, you can convince yourself that $X_I(t)$ never decreases, so the disease must take over the population if it's ever present at all.

To see what happens on the way there, we'll need to use the flow variables. There is only one flux that we need to worry about, $Y_{SI}$:
\begin{eqnarray}
X_I(t+1) & = & X_I(t) + Y_{SI}(t+1)\\
X_S(t+1) & = & X_S(t) - Y_{SI}(t+1)
\end{eqnarray}
Since $Y_{SI}$ is binomially distributed, we can work out the expected change in the number of infected individuals.
\begin{eqnarray}
\Expect{X_I(t+1)|X_I(t)} - X_I(t) & = & X_S(t)\alpha X_I(t)\\
\label{eqn:logistic-growth}
& = & \alpha X_I(t)(n-X_I(t))
\end{eqnarray}
Notice that this growth is very small when $X_I \approx 0$, and again when $X_I \approx n$. There is an intermediate range where the growth is very rapid. Figure \ref{fig:logistic-growth} plots a typical expected course of $X_I(t)$. This is what's known as {\bf logistic} or {\bf sigmoidal}\footnote{Because somebody thought that curve looked like an $S$, or perhaps the outline of $\sigma$.} growth. This can be caricatured as ``nothing seems to happen for a long time, then suddenly it explodes and grows exponentially, but eventually it levels off and saturates''.
\begin{figure}
<<>>=
alpha <- 1e-6
n <- 1e5
xi <- vector(length=200)
xi[1] <- 1
for (t in 2:length(xi)) {
  xi[t] <- xi[t-1] + alpha*xi[t-1]*(n-xi[t-1])
}
plot(1:length(xi), xi, xlab="t", ylab=expression(X[I](t)), type="l")
@
\caption{Illustration of the kind of behavior that results from Eq.\ \ref{eqn:logistic-growth}. This is shown with $n=10^5$, $\alpha=10^{-6}$, and $X_I(1) = 1$.}
\label{fig:logistic-growth}
\end{figure}

Of course, Figure \ref{fig:logistic-growth} comes from iterating the {\em deterministic} equation \ref{eqn:logistic-growth}. What happens if we actually follow the stochastic process? Figure \ref{fig:SI-trajectories} plots 100 realizations of the SI model, along with the deterministic approximation, and shows that while the {\em shape} is the same, the timing of the exponential take-off and of the eventual saturation becomes variable.

\begin{figure}
<<>>=
sim.SI <- function(n, alpha, xi.1=1, T) {
  xi <- vector(length=T)
  xi[1] <- xi.1
  for (t in 2:length(xi)) {
    xi[t] <- xi[t-1] + rbinom(n=1, size=n-xi[t-1], prob=alpha*xi[t-1])
  }
  return(xi)
}
SI.sims <- replicate(100, sim.SI(1e5, 1e-6, 1, 200))
plot(1:length(xi), xi, xlab="t", ylab=expression(X[I](t)), type="l")
for (sim in 1:ncol(SI.sims)) {
  lines(1:nrow(SI.sims), SI.sims[,sim], col="grey")
}
lines(1:length(xi), xi)
legend("topleft", legend=c("simulations", "deterministic approximation"),
       col=c("grey", "black"), lty="solid")
@
\caption{Comparison of stochastic simulations of the SI model (grey lines) to the deterministic approximation (black line). Parameters are as in Figure \ref{fig:logistic-growth}.}
\label{fig:SI-trajectories}
\end{figure}

\subsubsection{Diffusion of Innovations}
\label{sec:diffusion-of-innovations}

A disease which, once caught, never goes away and remains infectious forever is a bit weird, biologically speaking.
But the same ideas are also used to model the spread of new technologies and ideas, what is sometimes called ``the diffusion of innovations'', and then it makes sense to say that those who have ``caught'' an idea remain infected with it, and potentially contagious, for the rest of their lives. This week's homework explores this idea in the context of one of the classic data-sets on the diffusion of innovations, about the spread of the use of a new antibiotic among doctors in four towns in Illinois in the 1950s. The same notions, however, have been used to model the adoption of a wide range of technologies, practices, art forms, and ideologies \citep{Rogers-diffusion-of-innovations}.

It's worth noting that the idea of analogizing the spread of ideas or behavior to the spread of disease is a very old one, but it's usually been an analogy deployed when people {\em disapprove} of what is spreading. (It is very rare to talk about ``infectious virtue''.) The oldest example I have found is the Roman writer and politician Pliny the Younger calling Christianity a ``contagious superstition'' ({\em superstitionis contagio}) in a letter to the Emperor Trajan in +110 ({\em Epistles} X 96.9), but the analogy has spontaneously recurred many, many times since then. It was also the basis for more systematic and scholarly studies, such as \citet{Siegfried-germs-and-ideas}, long before \citet{Dawkins-Selfish} coined the word ``meme''.

\subsection{SIS}

A logical next step is to keep two compartments, but to allow for movement back and forth between them. That is, some infected individuals can recover and become susceptible again. Again, a simple assumption is that infected individuals recover at a constant rate, $p_{IS} = \rho$, while we keep $p_{SI} = \alpha X_I$.
Because this allows for movement from each compartment to the other, we need to keep track of {\em two} fluxes:
\begin{eqnarray}
Y_{SI}(t+1) & \sim & \Binom(X_S(t), \alpha X_I(t))\\
Y_{IS}(t+1) & \sim & \Binom(X_I(t), \rho)
\end{eqnarray}
Now
\begin{eqnarray}
X_I(t+1) & = & X_I(t) + Y_{SI}(t+1) - Y_{IS}(t+1)\\
X_S(t+1) & = & X_S(t) - Y_{SI}(t+1) + Y_{IS}(t+1)
\end{eqnarray}
To understand the dynamics here, start by considering what would need to happen for the expected change in $X_I$ to be zero.
\begin{eqnarray}
\Expect{X_I(t+1)|X_I(t)} - X_I(t) = \alpha X_I(t) (n-X_I(t)) - \rho X_I(t)
\end{eqnarray}
Let's set this equal to 0 at $X_I(t) = X_I^*$. There are two solutions:
\begin{eqnarray}
X_I^*(\alpha (n - X_I^*) - \rho) & = & 0\\
X_I^* & = & 0 ~ \text{or}\\
\frac{\rho}{\alpha} + X_I^* & = & n\\
X_I^* & = & n - \frac{\rho}{\alpha}
\end{eqnarray}
The first solution is boring --- if nobody's infected, nobody gets infected. The other solution is more interesting: it gives the level of infection where new infections just balance recoveries (on average). Whether $X_I(t)$ will tend to converge to this level is, however, another question. Obviously this is only possible if $n > \rho/\alpha + 1$, so let's assume that going forward. Suppose that $X_I(t) = X_I^* + \epsilon$ for some small amount $\epsilon$. You can show (Exercise \ref{exercise:restoring-force}) that the expected change in $X_I$ has the opposite sign from $\epsilon$, so that if the process is above $X_I^*$ it will tend to be pulled down towards it, and vice versa. This suggests that the process should bounce around this level.
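We can check this balance numerically. With illustrative parameter values (chosen for convenience, not fitted to anything), the expected one-step change vanishes at $X_I^* = n - \rho/\alpha$, and pushes back toward $X_I^*$ on either side of it:

```r
# Expected one-step change in X_I for the SIS model (illustrative values)
alpha <- 1e-6; rho <- 1e-2; n <- 1e5
expected.change <- function(xi) { alpha*xi*(n-xi) - rho*xi }
xi.star <- n - rho/alpha        # = 9e4 with these values
expected.change(xi.star)        # zero (up to rounding): infections balance recoveries
expected.change(xi.star + 100)  # negative: pushed back down
expected.change(xi.star - 100)  # positive: pushed back up
```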
\begin{figure}
<<>>=
sim.SIS <- function(n, alpha, rho, xi.1=1, T) {
  xi <- vector(length=T)
  xi[1] <- xi.1
  for (t in 2:length(xi)) {
    y.si <- rbinom(n=1, size=n-xi[t-1], prob=alpha*xi[t-1])
    y.is <- rbinom(n=1, size=xi[t-1], prob=rho)
    xi[t] <- xi[t-1] + y.si - y.is
  }
  return(xi)
}
SIS.sims <- replicate(100, sim.SIS(1e5, 1e-6, 1e-2, 1, 400))
plot(0, xlab="t", ylab=expression(X[I](t)), type="n", ylim=c(0,1e5),
     xlim=c(1, nrow(SIS.sims)))
for (sim in 1:ncol(SIS.sims)) {
  lines(1:nrow(SIS.sims), SIS.sims[,sim], col="grey")
}
@
\caption{100 trajectories of the SIS model, with $n=10^5$, $\alpha = 10^{-6}$, $\rho = 10^{-2}$ (so $X_I^* = 9\times 10^{4}$) and $X_I(1) = 1$. Notice that there are {\em two} horizontal lines at large $t$.}
\label{fig:sis-synopsis}
\end{figure}

\begin{figure}
<<>>=
plot(0, xlab="t", ylab=expression(X[I](t)), type="n", ylim=c(8.5e4,9.5e4),
     xlim=c(200, nrow(SIS.sims)))
for (sim in 1:ncol(SIS.sims)) {
  lines(1:nrow(SIS.sims), SIS.sims[,sim], col="grey")
}
@
\caption{Close-up of Figure \ref{fig:sis-synopsis}, showing fluctuations around $X_I^*$ at large times.}
\end{figure}

\begin{figure}
<<>>=
plot(0, xlab="t", ylab=expression(X[I](t)), type="n", ylim=c(0,10),
     xlim=c(1, 200))
for (sim in 1:ncol(SIS.sims)) {
  lines(1:nrow(SIS.sims), SIS.sims[,sim], col="grey")
}
@
\caption{Close-up of Figure \ref{fig:sis-synopsis}, showing that some simulations converge to zero.}
\end{figure}

In fact, the situation is a little more subtle. $X_I = 0$ is still an absorbing state, but now it is one which {\em can} be reached from other states of the population. If, after all, $X_I(t) = 1$, there is some probability that $X_I(t+1) = 0$. (What is that probability?) And no matter how big $X_I$ gets, there is always some probability that it will eventually reach 1, and then in turn 0. So $X_I=0$ is the only closed, irreducible set of states. Consequently, {\em the disease will always die out} eventually in an SIS model.
It will, however, generally take a very long time to do so, and in the meanwhile it will spend most of its time oscillating around $X_I^*$.

\subsection{SIR, SEIR, etc.}

It is easy, and often convenient, to add more compartments to epidemic models of this sort. In an SIR model, for example, the third compartment, $R$, stands for ``removed'' or ``recovered'', depending on whether we think of individuals in that compartment as having died off, or as having acquired immunity. The basic dynamics are
\begin{eqnarray}
p_{SI} & = & \alpha X_I\\
p_{IR} & = & \rho\\
p_{SR} & = & 0\\
p_{IS} & = & 0\\
p_{RI} & = & 0\\
p_{RS} & = & 0
\end{eqnarray}
I will let you check that the absorbing states are exactly those with $X_I = 0$, and also let you work out both the deterministic approximation and simulate some trajectories on the way there.

A related wrinkle is to add an intermediate compartment between $S$ and $I$, say $E$ for ``exposed'', in which someone is infected but not yet showing symptoms. Individuals in compartment $E$ might or might not be infectious themselves (and, if so, might be more or less infectious than those in compartment $I$). If one wants to have characteristic lengths of time for each phase of the disease, one can add even more compartments, say $I_1, I_2, \ldots I_{\tau}$, with automatic progression from one to the next.

\section{Space}

The easiest way to handle space in a compartment model is to say that you have $r_1$ types or classes of individuals across $r_2$ locations, so that the $r=r_1 r_2$ compartments track combinations of types and locations. Migration from one location to another is then just another kind of transition. The transition rates can be made functions of the counts of types only in the same location (e.g., $S$'s in location 1 can only be infected by $I$'s in location 1), or of global aggregates calculated across all locations, or of some combination of both.
This approach will work poorly when the number of locations $r_2$ becomes comparable to the population size $n$, since then most locations will have a count of 0 for most types, and you would be better off using a different, more individual-based model.

\section{Asymptotics}

We haven't discussed what happens as $n\rightarrow\infty$. Basically, a compartment model converges on a system of ordinary differential equations. Said a bit less baldly, suppose that the transition rates $p_{ij}(X)$ are all of the form $p_{ij}(X) = \rho_{ij}(X/n) h$, where $h$ represents the amount of time that passes from one value of $X$ to the next. Notice that we are assuming that the transition rates are really functions of the {\em distribution} of individuals across the compartments, and not of the {\em numbers} in compartments --- that we get the same rate of infection when $n=20$ and $X_I=10$ as when $n=2000$ and $X_I=1000$. Finally, we fix a finite, but perhaps large, interval of time $T$. Then what we find is that the trajectory of the compartment model between time $0$ and time $T$ approaches the solution of a differential equation. If we say that $x(t)$ is the solution to
$$\frac{dx_i}{dt} = \sum_{j}{\rho_{ji}(x(t)) x_j(t) - \rho_{ij}(x(t)) x_i(t)}$$
then, as $n\rightarrow\infty$ and $h\rightarrow 0$, the random sequence $X(h)/n, X(2h)/n, X(3h)/n, \ldots X(T)/n$ comes closer and closer to the deterministic function $x$ on the interval $[0,T]$ (with high probability). This is the trick that I used above, when I invoked ``deterministic approximations'' to the epidemic models. For instance, in the SI model, the limiting trajectory for $(X_S(t)/n, X_I(t)/n)$ is given by
\begin{eqnarray}
\frac{dx_S}{dt} & = & - ax_S(t)(1-x_S(t))\\
\frac{dx_I}{dt} & = & ax_I(t)(1-x_I(t))\\
x_S(t) + x_I(t) & = & 1
\end{eqnarray}
and solving this differential equation does indeed give us sigmoidal growth curves. (Can you work out how the $a$ here relates to the $\alpha$ in the stochastic, finite-$n$ version of the SI model?)
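To see this limit concretely, we can integrate the limiting equation for $x_I$ by small Euler steps and compare against the closed-form solution of the logistic equation (a standard result; the step size and parameter values below are arbitrary choices for illustration):

```r
# Euler integration of dx_I/dt = a x_I (1 - x_I), versus the closed-form
# logistic solution x_I(t) = 1/(1 + ((1-x0)/x0) exp(-a t))
a <- 0.1; x0 <- 0.01         # illustrative values
h <- 0.01; T <- 100          # small step size, finite time horizon
x <- x0
for (i in 1:round(T/h)) { x <- x + h * a * x * (1 - x) }
x.exact <- 1 / (1 + ((1 - x0)/x0) * exp(-a * T))
c(euler=x, exact=x.exact)    # both close to the saturation level 1
```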
It is important that the convergence only holds over finite times $[0,T]$. Basically, this is because infinite time gives a stochastic process infinitely many opportunities to do arbitrarily improbable things, and eventually it will. In particular, in the SIS model, the limiting behavior is
\begin{eqnarray}
\frac{dx_I}{dt} & = & ax_I(t) (1-x_I(t)) - g x_I(t)
\end{eqnarray}
and not extinction.

\section{Demography: Birth, Aging, and Death}
\label{sec:demography}

So far, we have considered models where the population size $n$ is constant. Most real populations change in size, and an important application of compartment models is to demography, where what we really care about is how $n$ changes over time. The three processes which change $n$ are birth, death, and migration.

The SIR model shows how to handle death: we can treat being ``dead'' as just another compartment, say $R$\footnote{Saying that $R$ stands for ``removed'' is euphemistic.}. Individuals in compartment $i$ transition to this compartment at some rate $\delta_i$, and there are no transitions out of this compartment. Demographers typically define the {\bf mortality rate} as the number of deaths per 1000 individuals per year, which you can translate into a probability per person per unit time (depending on the time-scale of your model). $Y_{iR}(t)$ would be the random variable which counted the actual number of deaths among members of compartment $i$ in time-period $t$. If this sort of formal simplicity isn't needed, however, we can simply introduce a new variable $D_i(t)$ for deaths in compartment $i$ in period $t$.

Migration can be handled similarly, by introducing a new compartment for ``leaving for the rest of the world''. If both in- and out-migration are possible, then we need to model the flow of immigrants.

Birth, however, requires a slightly special treatment in these models. The difficulty is that parents are (typically) still around.
The easiest way to handle this is to introduce a new set of variables, $B_{ij}(t)$, which count the number of new births into compartment $j$ from parents in compartment $i$\footnote{I am writing as though individuals have only one parent, which of course isn't true for (most) vertebrates, including human beings, but the reasons for this will become apparent shortly.}. If individuals can only be born one at a time, then we can make $B_{ij}$ binomial, reflecting a certain rate of parentage for individuals in compartment $i$. If there are a non-trivial number of twins, triplets, etc., it might be better to use something like a Poisson distribution. The model thus looks like this:
\begin{eqnarray}
Y_{ij}(t+1) & \sim & \Binom(X_i(t), p_{ij}(X(t)))\\
D_i(t+1) & \sim & \Binom(X_i(t), \delta_i(X(t)))\\
B_{ij}(t+1) & \sim & \Binom(X_i(t), \beta_{ij}(X(t)))\\
X_i(t+1) & = & X_i(t) - D_i(t+1) + B_{ii}(t+1) + \sum_{j\neq i}{B_{ji}(t+1) + Y_{ji}(t+1) - Y_{ij}(t+1)}
\end{eqnarray}

In human demography, and to a lesser extent in population ecology, we need a fairly large number of compartments, which track combinations of age and sex. Demographers conventionally work with 5-year age brackets, so that there is a compartment for females age 0--4 and for males age 0--4, for females age 5--9 and for males age 5--9, and so on all down the line. The transition probabilities are very simple: every individual who does not die moves into the next age group (and keeps the same sex), with probability 1. It is the age-specific death rates $\delta_i$ which let us model realistic distributions of life-span; without them, we'd have geometric distributions of longevity. Births can only happen into the 0--4 age brackets, so $B_{ij}=0$ unless $j$ is one of those two brackets. Because only females give birth, $B_{ij}=0$ unless $i$ is a female age bracket of child-bearing age.
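The bookkeeping displayed above can be sketched in code. Here the rates $\delta$ and $\beta$ are taken as constants for simplicity (in general they may depend on $X$), and the function name and interface are mine, purely for illustration:

```r
# One step of a compartment model with deaths and births, following the
# displayed equations: moves Y, deaths D, and births B are each drawn
# conditionally on the current state X.
# P: r x r matrix of transition probabilities (rows summing to 1)
# delta: length-r vector of death rates; beta: r x r matrix of birth rates
step.demography <- function(X, P, delta, beta) {
  r <- length(X)
  Y <- matrix(0, r, r)
  B <- matrix(0, r, r)
  D <- rbinom(r, size=X, prob=delta)                  # deaths D_i
  for (i in 1:r) {
    Y[i, ] <- rmultinom(1, size=X[i], prob=P[i, ])    # moves Y_{ij}
    B[i, ] <- rbinom(r, size=X[i], prob=beta[i, ])    # births B_{ij}
  }
  # new level = old level - deaths + births into i + net in-flow
  X - D + colSums(B) + colSums(Y) - rowSums(Y)
}
```

With $\delta = 0$ and $\beta = 0$ this reduces to the constant-$n$ model of \S 1, and the total population is conserved.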
In many applications, the age-specific fertility rates $\beta_{ij}$ will be taken to be constant, but they can be allowed to vary with the number of males, or even with the ratio of males to females, depending on the customs of the population concerned. Note that these rates $\beta_{ij}$ should not be confused with the {\bf total fertility rate}, which is the {\em sum} of the age-specific rates across child-bearing years.

\section{Chemistry}

Chemistry provides an important application for compartment models. Here the ``compartments'' are different types of molecules, or {\bf chemical species}, and transitions are the consequences of reactions. We can write the SI model in this form, with one reaction, $S+I \rightarrow 2I$\footnote{A reaction like this, where the species $I$ is involved in a process which makes more of the same species, is called {\bf autocatalytic}.}. We can also write the SIS model as something with two reactions, $S+I\rightarrow 2I$ and $I \rightarrow S$.

The constant-$n$ case corresponds to all reactions having the same number of molecules on both the left- and the right-hand sides (as the SI and SIS cases do). In actual chemistry, though, it's common for the number of molecules to change, as in $2 H_2 + O_2 \rightarrow 2 H_2O$ (three molecules go in, two molecules come out). The way to handle this is to use $Y$ variables to count the number of times each reaction happens, and then subtract the substrates (on the left) and add the products (on the right), so, e.g., if the hydrogen-and-oxygen-to-water reaction were the only one, we'd have
\begin{eqnarray}
X_{H_2}(t+1) & = & X_{H_2}(t) - 2 Y_{2 H_2 + O_2 \rightarrow 2 H_2O}(t+1)\\
X_{O_2}(t+1) & = & X_{O_2}(t) - Y_{2 H_2 + O_2 \rightarrow 2 H_2O}(t+1)\\
X_{H_2 O}(t+1) & = & X_{H_2 O}(t) + 2 Y_{2 H_2 + O_2 \rightarrow 2 H_2O}(t+1)
\end{eqnarray}
(We would also typically assume that the transition rate for this reaction is $\propto X_{H_2}^2 X_{O_2}$.)
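A stochastic, discrete-time version of the water-forming reaction above might look like the following sketch. The rate constant \texttt{k}, and the use of a capped Poisson draw for the number of reaction events per time step, are my own illustrative assumptions (a binomial draw, or an exact Gillespie-style scheme, would be alternatives):

```r
# One time step of a stochastic 2 H2 + O2 -> 2 H2O update.
# X is assumed to be ordered and named c(H2=..., O2=..., H2O=...).
step.reaction <- function(X, k) {
  # Number of reaction events: Poisson with mean proportional to
  # X_H2^2 * X_O2, capped so we never consume more than is present
  y <- rpois(1, k * X["H2"]^2 * X["O2"])
  y <- min(y, floor(X["H2"]/2), X["O2"])
  X + c(H2=-2, O2=-1, H2O=2) * y
}
```

Whatever the distribution of the number of events, the update conserves atoms: the hydrogen count $2 X_{H_2} + 2 X_{H_2O}$ and the oxygen count $2 X_{O_2} + X_{H_2O}$ are unchanged from step to step.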
\section{Further reading}

For a fine textbook treatment of compartment models in biology, see \citet{Ellner-Guckenheimer}, which includes a good chapter on epidemic models, and references to the further literature.

% Memetics, epidemics, ecological models, social learning, Keizer on chemistry...

\section{Exercises}

To think through, as complements to the reading, not to hand in.

\begin{enumerate}
\item \label{exercise:size-of-state-space} {\em Compartment-model state spaces are very large}

Fix $n$. A possible state $X$ is a vector of non-negative integers $[X_1, X_2, \ldots X_r]$, with the constraint that $\sum_{i=1}^{r}{X_i} = n$. Prove that the number of such vectors is $\left( \begin{array}{c} n+r-1\\ n\end{array}\right)$.
\begin{enumerate}
\item Imagine writing out a line of $n$ dots, and inserting $r-1$ vertical bars between the dots, with the possibility of bars being next to each other. Call the number of dots to the left of the left-most bar $X_1$, the number of dots between that first bar and the second $X_2$, and so on. Convince yourself that every arrangement of dots and bars corresponds to a valid value of $X$, and that every valid value of $X$ can be represented in this way.
\item Show that the number of arrangements of dots and bars is equal to the number of ways of choosing $n$ locations for dots out of $n+r-1$ places.
\item Show that $\left( \begin{array}{c} n+r-1\\ n\end{array}\right) = \left( \begin{array}{c} n+r-1\\ r-1\end{array}\right)$.
\item Use Stirling's approximation, $\log{n!} \approx n\log{n}$, to show that as $n$ grows with $r$ fixed, the size of the state space grows like $n^{r-1}$.
\end{enumerate}

\item \label{exercise:heterogeneous-transition-rates} {\em Randomly-varying transition rates across individuals and over-dispersion}

Suppose that each individual in compartment $i$ has its own, random transition rate $P_{ij}$ for moving to compartment $j$, with $\Expect{P_{ij}} = p_{ij}$ and $\Var{P_{ij}} = v_{ij}$. Show that $\Var{Y_{ij}(t+1)|X(t)} = X_i(p_{ij}(1-p_{ij}) + v_{ij})$.
\item \label{exercise:absorbing-states} Show the following for the SI model of \S \ref{sec:SI}.
\begin{enumerate}
\item $X_I = 0$ and $X_I = n$ are absorbing states.
\item There are no other absorbing states.
\item An absorbing state forms a closed, irreducible set of states. (This is a general fact about Markov chains.)
\item This chain has no other closed, irreducible sets.
\item With probability 1, either $X_I(t) \rightarrow 0$ or $X_I(t) \rightarrow n$.
\end{enumerate}

\item \label{exercise:restoring-force} For the SIS model, calculate $\Expect{X_I(t+1) - X_I(t)|X_I(t) = X_I^* + \epsilon}$ in terms of $n$, $\alpha$, $\rho$ and $\epsilon$, and show that (for small $\epsilon$) it has the opposite sign from $\epsilon$.

\item \label{exercise:constant-rates-and-exponentials}
\begin{enumerate}
\item Prove that if the probability of an event happening on each time step is a constant $p$, and successive time-steps are independent, the number of steps we need to wait for the event to happen has a geometric distribution. Find the parameter in terms of $p$.
\end{enumerate}
\end{enumerate}

\bibliography{locusts}
\bibliographystyle{crs}
\end{document}