Stochastic Processes (Advanced Probability II), 36-754

Spring 2007

TuTh 9:00--10:20, in 232Q Baker Hall

Prof. Cosma Shalizi


Snapshot of a non-stationary spatiotemporal stochastic process (the Greenberg-Hastings model)

Stochastic processes are collections of interdependent random variables. This course is an advanced treatment of such random functions, with twin emphases on extending the limit theorems of probability from independent to dependent variables, and on generalizing dynamical systems from deterministic to random time evolution. Familiarity with measure-theoretic probability (at the level of 36-752) is essential, but the emphasis will be on developing a sound yet intuitive understanding of the material, rather than on analytic rigor.

The first part of the course will cover some foundational topics which belong in the toolkit of all mathematical scientists working with random processes: random functions; stationary processes; Markov processes and the stochastic behavior of deterministic dynamical systems (i.e., "chaos"); the Wiener process, the functional central limit theorem, and the elements of stochastic calculus. These will be followed by a selection of less common but highly useful topics: ergodic theory, which extends the classical limit laws to dependent variables; information theory, as it connects to statistical inference and to limiting distributions; and large deviations theory, which uses information theory to give rates of convergence in the limit laws.

Other topics may be covered in the class through extra lectures, depending on time and enthusiasm. Possibilities include: prediction theory, especially predictive state representations of non-Markovian processes; empirical process theory; random fields in space and time; and random graphs.

Prerequisites: As mentioned, measure-theoretic probability, at the level of 36-752, is essential. I will also presume some familiarity with basic stochastic processes, at the level of 36-703 ("Intermediate Probability"), though I will not assume those memories are very fresh.

Audience: The primary audience for the course is students of statistics and mathematicians interested in stochastic processes. I hope, however, that it will be useful to any mathematical scientist who uses probabilistic models; in particular, we will cover material that should help physicists (statistical mechanics, nonlinear dynamics); computer scientists (machine learning, information theory, networks); engineers (communication, control, signal processing); economists (evolutionary game theory, finance); population biologists, etc.

Grading: Two to four problems will be assigned every other week. These will be proofs, calculations, or simulation exercises, with an emphasis on the last. Students whose performance on homework is not adequate will have the opportunity to take an oral final exam in its place.

Homework will be due in my mailbox at midnight on the date stated, unless otherwise noted.

Office Hours: Thursdays at 3:30 in 229C Baker Hall.

Text: The primary text will be the lecture notes (see below). Kallenberg's Foundations of Modern Probability (2nd ed.) will be used as a reference and a source of problems. (It's on order at the CMU bookstore, but often cheaper at Powell's.) Students who took 36-752 last semester, when it used Ash and Doleans-Dade, will also find the last two chapters of that book helpful. Another good book is Fristedt and Gray's A Modern Approach to Probability Theory. Some material will also be drawn from books available online.

The notes from 36-752 for the fall of 2005 (PDF, 701k) are also useful.

Outline

Subject to change, to accommodate the interests and progress of the class.
Basics, 16--18 January
Definition of stochastic processes, examples, random functions. Finite-dimensional distributions (FDDs) of a process, consistency of a family of FDDs. Theorems of Daniell and Kolmogorov on extending consistent families to processes. Probability kernels and regular conditional probabilities, extending finite-dimensional distributions defined recursively through kernels to processes (the Ionescu Tulcea theorem).
One-Parameter Processes, Usually Functions of Time, 23--30 January
Examples of one-parameter processes. Shift-operator representations of one-parameter processes. Three kinds of stationarity, the relationship between strong stationarity and measure-preserving transformations (especially shifts). Reminders about filtrations and optional times; various sorts of waiting times; Kac's Recurrence Theorem. Kinds of continuity; versions of stochastic processes; difficulties of continuity; the notion of a separable random function. Existence of separable, cadlag and continuous modifications.
Markov Processes, 1--22 February
Markov processes and their transition-probability semi-groups. Markov processes as transformations of IID noise. Markov processes as operators on function spaces. Examples of Markov processes (Wiener process and the logistic map). Generators of homogeneous Markov processes, analogy with exponential functions. The strong Markov property and the martingale problem. Markov families. An example of a Markov process which is not strongly Markovian. Feller processes. Convergence in distribution of cadlag processes. Convergence of Feller processes, approximation of differential equations by Markov processes (and vice versa). Convergence of random walks, functional central limit theorem.
Stochastic Calculus, Diffusions, and Spectra, 27 February--27 March
Diffusions. Wiener measure; non-differentiability of almost all continuous curves. Heuristic approach to stochastic integrals (via Euler's method). Rigorous approach to stochastic integrals. Examples. Ito's formula for change of variables. Stochastic differential equations, existence and uniqueness of solutions. Physical Brownian motion: the Langevin equation, Ornstein-Uhlenbeck processes. More on SDEs: diffusions, forward (Fokker-Planck) and backward equations. White noise. Spectral analysis; how the white noise lost its color. Mean-square ergodicity. Small-noise limits for SDEs: convergence in probability to ODEs, and our first large-deviations calculations.
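For orientation, the Wiener-Khinchin link between the spectrum and the autocovariance, in one common normalisation (a weakly stationary process with integrable autocovariance \Gamma(\tau) = \mathrm{Cov}(X_t, X_{t+\tau}); the symbols here are ad hoc, not necessarily the notes' notation):

    S(\nu) = \int_{-\infty}^{\infty} e^{-i 2\pi\nu\tau} \, \Gamma(\tau) \, d\tau,
    \qquad
    \Gamma(\tau) = \int_{-\infty}^{\infty} e^{i 2\pi\nu\tau} \, S(\nu) \, d\nu.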
Ergodic Theory, 29 March--10 April
Introduction to ergodic properties and invariance. The almost-sure (Birkhoff) ergodic theorem. Metric transitivity. Examples of ergodic processes. Ergodic decompositions. More on ergodic decompositions. Ergodic components as minimal sufficient statistics. Mixing. Convergence in distribution and decay of correlations. Central limit theorem for strongly mixing sequences.
Special lecture by Leo Kontorovich: "Strong Mixing and Concentration of Measure"
Information Theory, 12--17 April
Introduction to information theory. Relations between Shannon entropy, relative entropy/Kullback-Leibler divergence, expected likelihood and Fisher information. Entropy rate. The asymptotic equipartition property, a.k.a. the Shannon-McMillan-Breiman theorem, a.k.a. the entropy ergodic theorem. Asymptotic likelihoods.
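As a rough pointer (standard statements, not a substitute for the lectures): for a stationary process the entropy rate is

    h = \lim_{n \to \infty} \frac{1}{n} H(X_1, \ldots, X_n),

and the Shannon-McMillan-Breiman theorem says that, for a stationary ergodic process on a finite alphabet,

    -\frac{1}{n} \log P(X_1, \ldots, X_n) \to h \quad \text{almost surely.}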
Large Deviations Theory, 24 April--3 May
Large deviations principles and rate functions; Varadhan's Lemma. Breeding LDPs: contraction principle, "exponential tilting", Bryc's Theorem, projective limits. IID large deviations: cumulant generating functions, the Legendre transform, the return of relative entropy; Cramér's theorem on large deviations of empirical means, Sanov's theorem on large deviations of empirical measures, process-level large deviations. Large deviations for Markov sequences via exponential-family densities. Large deviations for weakly-dependent sequences (Gärtner-Ellis theorem). Large deviations of stochastic differential equations in the small-noise limit (Freidlin-Wentzell theory). Large deviations in hypothesis testing and in nonlinear filtering.
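For orientation, the basic IID statement: if \Lambda(\lambda) = \log E[e^{\lambda X_1}] is the cumulant generating function (finite in a neighborhood of the origin), Cramér's theorem says, roughly, that

    P(\bar{X}_n \approx x) \asymp e^{-n I(x)}, \qquad I(x) = \sup_{\lambda} \left[ \lambda x - \Lambda(\lambda) \right],

the rate function I being the Legendre transform of \Lambda.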

Lecture Notes in PDF

Notes will be posted here before each lecture. The notes from last year are available. However, I will be revising them as we go along, and the division into lectures will differ somewhat, so don't rely on them as a substitute for coming to class!

Contents
Including a running list of all definitions, lemmas, theorems, corollaries, examples and exercises to date.
Chapter 1: Stochastic Processes and Random Functions (16 January)
Stochastic processes as indexed collections of random variables and as random functions. Sample paths and constraints on them. Examples.
Chapter 2: Building Infinite Processes from Finite-Dimensional Distributions (18 January)
Finite-dimensional distributions of a process. Theorems of Daniell and Kolmogorov on extending finite-dimensional distributions to infinite-dimensional ones.
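For reference, the consistency condition in its simplest form (writing \mu_{t_1,\ldots,t_n} for the joint distribution of (X_{t_1},\ldots,X_{t_n}) and \Xi for the state space; the notation here is ad hoc): marginalizing out a coordinate must recover the lower-dimensional distribution,

    \mu_{t_1,\ldots,t_n,t_{n+1}}(A_1 \times \cdots \times A_n \times \Xi) = \mu_{t_1,\ldots,t_n}(A_1 \times \cdots \times A_n),

together with invariance under simultaneously permuting the indices and the sets.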
Chapter 3: Building Infinite Processes by Recursive Conditioning (23 January)
Probability kernels and regular conditional probabilities. Theorem of Ionescu Tulcea on constructing processes from regular conditional distributions.
Chapter 4: One-Parameter Processes (23 January)
One-parameter processes; examples thereof. Representation of one-parameter processes in terms of shift operators.
Chapter 5: Stationarity and Dynamics (25 January)
Strong, weak, and conditional stationarity. Stationarity as shift-invariance. Measure-preserving transformations and stationary processes.
Chapter 6: Random Times and Recurrence (30 January)
Reminders about filtrations and stopping times. Waiting times of various sorts, especially recurrence times. Poincaré and Kac recurrence theorems. "The eternal return of the same" and its statistical applications.
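For orientation, Kac's theorem in its simplest form: if T preserves the probability measure \mu and is ergodic, and \mu(A) > 0, then the return time \tau_A(x) = \inf\{n \geq 1 : T^n x \in A\} satisfies

    E[\tau_A \mid A] = \frac{1}{\mu(A)},

so rarer events take correspondingly longer, on average, to recur.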
Chapter 7: Continuity (1 February)
Kinds of continuity for stochastic processes. Versions and modifications of stochastic processes. Benefits of continuous sample paths, and an example of the impossibility of deducing them from the finite-dimensional distributions alone. Separable random functions.
Chapter 8: More on Continuity (1 February)
Existence of separable modifications of a stochastic process (in detail). Idea of measurable modifications. Conditions for the existence of measurable, cadlag and continuous modifications.
Chapter 9: Markov Processes (6 February)
Definition and meaning of the Markov property. Transition probability kernels. Existence of Markov processes with specified transitions. Invariant distributions. Dependence of the Markov property on filtrations.
Chapter 10: Two Views of Markov Processes (8 February)
Markov sequences as transformations of noise; transducers. Markov processes as collections of operators: Markov operator semi-groups evolve the distribution of states, and their adjoint operators evolve the "observables", the bounded measurable functions of the state. Some functional-analytic facts about conjugate spaces and adjoint operators.
Chapter 11: Examples of Markov Processes (13 February)
The logistic map as an example of turning nonlinear, deterministic dynamical systems into linear Markov operators. The Wiener process as an example of finding the transition kernels and time-evolution operators. Generalization of the Wiener process example to other processes with stationary and independent increments, and connections to limiting distributions of sums of IID random variables.
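For reference (standard formulas, stated more loosely than in the notes, and with ad hoc notation): for the standard Wiener process the transition kernel and the corresponding operator acting on a bounded measurable observable f are

    \mu_t(x, B) = \int_B \frac{1}{\sqrt{2\pi t}} e^{-(y-x)^2/2t} \, dy,
    \qquad
    (K_t f)(x) = E[f(W_{s+t}) \mid W_s = x] = \int \frac{1}{\sqrt{2\pi t}} e^{-(y-x)^2/2t} f(y) \, dy.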
Chapter 12: Generators (15 February)
The generators of the semi-groups associated with Markov processes: analogy with exponential functions, how to find generators starting from semi-groups, some uses of generators for solving differential equations, Laplace transforms and resolvents. Hille-Yosida theorem on building semi-groups from generators.
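The guiding analogy, stated loosely (domains and technical conditions are exactly what the lecture is about): T_t = e^{tG}, in the sense that

    G f = \lim_{t \downarrow 0} \frac{T_t f - f}{t},
    \qquad
    R_\lambda f \equiv (\lambda I - G)^{-1} f = \int_0^\infty e^{-\lambda t} T_t f \, dt \quad (\lambda > 0),

the resolvent being the Laplace transform of the semi-group.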
Chapter 13: Strongly Markovian Processes and Martingale Problems (20 February)
The strong Markov property is being Markovian even at random times. An example of how a Markov process can fail to be strongly Markovian. The concept of a "martingale problem". Relationship between solutions of martingale problems and strong Markov processes.
Chapter 14: Feller Processes (22 February)
Clarifications on initial states and distributions of Markov processes; Markov families, the probability kernel from initial states to paths. Definition of Feller processes and its physical motivations; reformulation in terms of semi-groups; unique correspondence between Feller processes and their generators. Attributes of Feller processes: cadlag sample paths, strong Markov property, Dynkin's formula.
Chapter 15: Convergence of Feller Processes (27 February)
Weak convergence of stochastic processes; hints as to the Skorokhod topology on cadlag functions; necessary and sufficient, and merely sufficient, conditions for convergence in distribution of cadlag processes. Convergence in distribution of Feller processes. Convergence of discrete-time Markov processes on Feller processes. Convergence of Markov processes on ordinary differential equations.
Chapter 16: Convergence of Random Walks (27 February and 1 March)
The Wiener process as a Feller process. Continuous-time random walks. Convergence of random walks to the Wiener process via the Feller-process machinery; via direct use of the theorems on weak convergence.
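As a taste of the kind of simulation exercise that goes with this material, a minimal Python sketch of the rescaled random walk whose distributional limit is the Wiener process (the function name and parameter values here are illustrative, not taken from the notes):

    import numpy as np

    def scaled_random_walk(n, T=1.0, rng=None):
        # One path of W_n(t) = S_[nt] / sqrt(n) on [0, T]; Donsker's theorem says these
        # rescaled paths converge in distribution to the standard Wiener process.
        rng = rng or np.random.default_rng()
        steps = rng.choice([-1.0, 1.0], size=int(n * T))  # IID +/-1 increments, mean 0, variance 1
        path = np.concatenate(([0.0], np.cumsum(steps))) / np.sqrt(n)
        times = np.arange(len(path)) / n
        return times, path

    # Comparing, say, n = 100 and n = 10000 shows the rescaled walk filling in
    # toward Brownian-looking paths.
    t, w = scaled_random_walk(10000)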
Chapter 17: Diffusions and the Wiener Process (6 March)
Definition of diffusions. The Wiener process as the prototypical diffusion. Résumé of the Wiener process's properties. Wiener processes with respect to arbitrary filtrations. Gaussian processes. Wiener measure. Non-differentiability of almost-all continuous functions.
Chapter 18: Preview of Stochastic Integrals (8 March)
Why we want stochastic integrals. A heuristic approach via Euler's method (the Euler-Bernstein scheme).
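A minimal sketch of the Euler idea in code, applied for concreteness to an Ornstein-Uhlenbeck-type equation dX_t = -gamma X_t dt + sigma dW_t (the example equation, function name, and parameter values are mine, not the notes'); this is the heuristic discretisation, not the rigorous construction:

    import numpy as np

    def euler_maruyama_ou(x0=1.0, gamma=1.0, sigma=0.5, dt=1e-3, n_steps=10000, rng=None):
        # Euler-type discretisation of dX_t = -gamma * X_t dt + sigma dW_t:
        # X_{k+1} = X_k - gamma * X_k * dt + sigma * sqrt(dt) * Z_k, with Z_k ~ N(0, 1).
        rng = rng or np.random.default_rng()
        x = np.empty(n_steps + 1)
        x[0] = x0
        dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)  # independent Wiener increments
        for k in range(n_steps):
            x[k + 1] = x[k] - gamma * x[k] * dt + sigma * dW[k]
        return x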
Chapter 19: Stochastic Integrals and Stochastic Differential Equations (20--29 March)
Rigorous approach to stochastic integrals, after Ito. Ito integrals of "elementary" processes; extension to a wider class of integrands via approximation. Ito's isometry. Some simple but instructive examples. Ito processes. Ito's formula for change of variables. Stratonovich integrals (briefly). Representation of nice martingales as Ito integrals. Stochastic differential equations: existence and uniqueness of solutions. A more realistic model of Brownian motion, leading to a stochastic differential equation (the Langevin equation) and Ornstein-Uhlenbeck processes.
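For reference, the change-of-variables formula in its most-quoted one-dimensional form: if dX_t = a_t \, dt + b_t \, dW_t and f is twice continuously differentiable, then

    df(X_t) = f'(X_t) a_t \, dt + f'(X_t) b_t \, dW_t + \tfrac{1}{2} f''(X_t) b_t^2 \, dt,

the last term being the correction that distinguishes Ito calculus from the ordinary kind.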
Chapter 20: More on Stochastic Differential Equations (29 March, 3 and 5 April)
Solutions of SDEs are Feller diffusions (as they solve martingale problems). The Kolmogorov "forward" and "backward" equations, for the evolution of the probability density and the observables, respectively. Examples of the forward or Fokker-Planck equation and its solution.
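For orientation, in one dimension: if dX_t = a(X_t) \, dt + b(X_t) \, dW_t and X_t has density \rho(x, t), the forward (Fokker-Planck) equation reads

    \partial_t \rho = -\partial_x \left[ a(x) \rho \right] + \tfrac{1}{2} \partial_x^2 \left[ b^2(x) \rho \right].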
Chapter 21: A First Look at Small-Noise Limits of Stochastic Differential Equations (10 April)
SDEs defined by adding a small amount of white noise to ODEs. Solutions of the SDEs converge in distribution on the solutions of the ODE as the noise goes to zero (via Feller properties). An exponential upper bound on the probability of given magnitude of deviation between the solutions. Preview of coming attractions in large deviations theory.
Chapter 22: Spectral Analysis and Mean-Square Ergodicity (12--24 April)
"White noise" as a linear functional; its description as a generalized stochastic process (mean zero, uncorrelated, Gaussian) and equivalence to the Ito integral. Spectral representation of stochastic processes, especially weakly stationary ones, or, random Fourier transforms. The Wiener-Khinchin theorem linking the spectrum to the autocorrelation function. How the white noise lost its color. Our first ergodic theorem: convergence of time averages in mean square. A first proof assuming rapid decay of correlations. A stronger second proof based on the spectral representation.
Chapter 23: Ergodic Properties (26 April and 1 May)
Ideological remarks on ergodic theory. Dynamical systems and their invariants; connections to Markov processes. Ergodic properties of functions and ergodic limits. Asymptotic mean stationarity.
Chapter 24: Birkhoff's Ergodic Theorem (4 May)
The almost-sure or individual ergodic theorem of Birkhoff: in asymptotically mean-stationary systems, the time average of a well-behaved observable converges to its expectation, with probability 1.
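In the stationary-and-ergodic special case, the statement reduces to the familiar

    \frac{1}{n} \sum_{k=0}^{n-1} f(T^k x) \to E[f] \quad \text{a.s., for } f \in L_1;

without ergodicity the limit is instead a conditional expectation given the invariant sets.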
References
Confined to works explicitly cited to date.
The entire set of notes, including future chapters.

Homework Assignments

  1. Exercises 1.1, 3.1, 3.2, 4.1 and 4.2 from the notes. Due 2 February.
  2. Exercises 6.3, 9.5, 9.6, 9.7, 9.9, 10.4, 11.5, 11.6, 11.8, 11.10, 12.3. Due 2 March.
  3. A separate PDF file. Due 14 May.

An RSS feed for the notes