Complex systems are ones with a large effective number of strongly interdependent variables. This excludes both low-dimensional systems and high-dimensional ones where the variables are either independent or so strongly coupled that only a few variables effectively determine all the rest. Since the 1980s, an interdisciplinary movement of physicists, mathematicians, economists, computer scientists, biologists, anthropologists and other scientists has explored techniques for modeling a broad range of such systems, along with their common features and interconnections. These techniques rely heavily on intensive, sophisticated computer simulations, and notions of information, search and adaptation feature prominently in the theories.

The field of complex systems can now point to a solid record of scientific accomplishment, improving our understanding of processes ranging from pattern formation in chemical oscillators and metabolic networks to ecological succession, Balinese agriculture, and the persistence of concentrated poverty in wealthy societies. Someone wishing to assimilate these results can find reasonable textbooks on the construction of such models, as well as on the mathematical foundations of the complex systems approach, and a range of excellent specialized monographs.

What is not available, either in books or in the journal literature, is any systematic treatment of the statistical analysis of these models: that is, how to fit, test, compare and otherwise evaluate these models in the light of data from the real world. Within complex systems, it is increasingly recognized that confronting models with data is crucial to further progress, but almost no one in the field has been trained in modern statistical methods, which have evolved considerably beyond fitting straight lines subject to independent additive Gaussian noise. Not so coincidentally, the period during which the field of complex systems developed was also the period during which statistical theory coalesced with machine learning to develop powerful methods for reliably inferring models with large numbers of variables which interact in complex, nonlinear ways. The reason this is not a coincidence is that the new statistical learning is also founded on the mathematical theories of information and search, and its application is also completely reliant on cheap, high-speed computing.

Put slightly differently, there are two essential components to statistical analysis: there must be a class of stochastic models, and inferential procedures for linking the models to data. The new statistical learning theory has developed a range of such procedures, as well as general principles for evaluating their reliability and performance. What complex systems can provide is, precisely, interesting stochastic models of important phenomena. Many of the main complex systems models fall under broad categories which are already familiar in statistics and machine learning (agent-based models can be seen, for instance, as interacting hidden Markov models), but with wrinkles and special features of intrinsic interest.
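To make the two components concrete, here is a minimal sketch in Python. It is only an illustration, not a method taken from the book: the two-state Markov chain, the function names `simulate_chain` and `fit_chain`, and the particular transition probabilities are all assumptions chosen for simplicity. The first function plays the role of the stochastic model, the second the role of the inferential procedure (here, maximum-likelihood estimation of the transition matrix from an observed path, conditional on the initial state).

```python
import random

def simulate_chain(transition, n_steps, seed=0, start=0):
    """Stochastic model: sample a path from a finite-state Markov chain."""
    rng = random.Random(seed)
    state = start
    path = [state]
    for _ in range(n_steps):
        # Draw the next state from the row belonging to the current state.
        state = rng.choices(range(len(transition)), weights=transition[state])[0]
        path.append(state)
    return path

def fit_chain(path, n_states):
    """Inferential procedure: maximum-likelihood transition matrix from one path."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(path, path[1:]):
        counts[a][b] += 1
    estimate = []
    for row in counts:
        total = sum(row)
        # A state that was never left gives no information; fall back to uniform.
        estimate.append([c / total if total else 1.0 / n_states for c in row])
    return estimate

# Hypothetical "true" model, used only to generate illustrative data.
true_transition = [[0.9, 0.1], [0.3, 0.7]]
path = simulate_chain(true_transition, n_steps=20000, seed=1)
estimate = fit_chain(path, n_states=2)
```

With a long enough path, `estimate` should lie close to `true_transition`. The models of interest in complex systems are far richer than this, but the division of labor between model and inference is the same.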

Complex systems models and statistical learning theory, then, are pretty much made for each other. The purpose of this book is to perform an introduction.

- Introduction
- General ideas of statistical learning and data-mining, including cross-validation and bootstrapping
- Information theory, hypothesis testing, large deviations principle
- Graphical models and conditional independence
- Geometric view of statistical inference, including maximum-likelihood estimation and the EM algorithm
- More advanced theory of statistical learning, emphasizing structural risk minimization and process-oriented evaluation
- Using simulations: Monte Carlo and Indirect Inference
- Power laws and other heavy-tailed distributions
- Time series analysis, prediction, state estimation
- Symbolic dynamics, discrete time series, and the construction of optimal nonlinear models
- Network models: structure
- Cellular automata
- Network models: dynamics
- Agent-based models
- General issues in evaluating complex systems models
- Complexity measures
- Appendix: guide to further reading
- Appendix: review of basics of probability, stochastic processes, and statistical procedures

Page made 2 December 2007; last updated 5 May 2008.