Stochastic Processes (Advanced Probability II), 36-754
TuTh 9:00--10:20, in 232Q Baker Hall
Stochastic processes are collections of interdependent random variables.
This course is an advanced treatment of such random functions, with twin
emphases on extending the limit theorems of probability from independent to
dependent variables, and on generalizing dynamical systems from deterministic
to random time evolution. Familiarity with measure-theoretic probability (at
the level of 36-752) is essential, but the emphasis will be on developing a
sound yet intuitive understanding of the material, rather than on analytic rigor.
The first part of the course will cover some foundational topics which
belong in the toolkit of all mathematical scientists working with random
processes: random functions; stationary processes; Markov processes and the
stochastic behavior of deterministic dynamical systems (i.e., "chaos"); the
Wiener process, the functional central limit theorem, and the elements of
stochastic calculus. These will be followed by a selection of less common but
highly useful topics: ergodic theory, which extends the classical limit laws to
dependent variables; information theory, as it connects to statistical
inference and to limiting distributions; and large deviations theory, which
uses information theory to give rates of convergence in the limit laws.
Other topics may be covered in the class through extra lectures, depending
on time and enthusiasm. Possibilities include: prediction theory, especially
predictive state representations of non-Markovian processes; empirical process
theory; random fields in space and time; and random graphs.
Prerequisites: As mentioned, measure-theoretic probability, at the
level of 36-752, is essential. I will also presume some familiarity with basic
stochastic processes, at the level of 36-703 ("Intermediate Probability"),
though I will not assume those memories are very fresh.
Audience: The primary audience for the course is students of
statistics, and mathematicians interested in stochastic processes. I hope,
however, that it will be useful to any mathematical scientist who uses
probabilistic models; in particular we will cover material which should help
physicists (statistical mechanics, nonlinear dynamics); computer scientists
(machine learning, information theory, networks); engineers (communication,
control, signal processing); economists (evolutionary game theory, finance);
population biologists, etc.
Grading: Two to four problems will be assigned every other week.
These will either be proofs, calculations, or simulation exercises, with an
emphasis on the last. Students whose performance on homework is not adequate
will have the opportunity to take an oral final exam in its place.
Homework will be due in my mailbox at midnight on the date stated, unless
other arrangements are made in advance.
Office Hours: Thursdays at 3:30 in 229C Baker Hall.
Text: The primary text will be the lecture notes (see below).
Kallenberg's Foundations of Modern Probability (2nd ed.) will be
used as a reference and a source of problems. (It's on order at the CMU
bookstore, but often cheaper at Powell's.)
Students who took 36-752 last semester, when it used Ash and Doleans-Dade, will
also find the last two chapters helpful. Another good book is Fristedt and
Gray's A Modern Approach to Probability Theory. Some material will also be
drawn from books available online. The notes from 36-752 for the fall of 2005
are also useful.
Syllabus: Subject to change, to accommodate the interests and progress of the class.
- Basics, 16--18 January
- Definition of stochastic processes, examples, random functions.
Finite-dimensional distributions (FDDs) of a process, consistency of a family
of FDDs. Theorems of Daniell and Kolmogorov on extending consistent families to
processes. Probability kernels and regular conditional probabilities,
extending finite-dimensional distributions defined recursively through kernels
to processes (the Ionescu Tulcea theorem).
- One-Parameter Processes, Usually Functions of Time, 23--30 January
- Examples of one-parameter processes. Shift-operator representations of
one-parameter processes. Three kinds of stationarity, the relationship between
strong stationarity and measure-preserving transformations (especially shifts).
Reminders about filtrations and optional times; various sorts of waiting times;
Kac's Recurrence Theorem. Kinds of continuity; versions of stochastic
processes; difficulties of continuity; the notion of a separable random
function. Existence of separable, cadlag and continuous modifications.
- Markov Processes, 1--22 February
- Markov processes and their transition-probability semi-groups. Markov
processes as transformations of IID noise. Markov processes as operators on
function spaces. Examples of Markov processes (Wiener process and the logistic
map). Generators of homogeneous Markov processes, analogy with exponential
functions. The strong Markov property and the martingale problem. Markov
families. An example of a Markov process which is not strongly
Markovian. Feller processes. Convergence in distribution of cadlag processes.
Convergence of Feller processes, approximation of differential equations by
Markov processes (and vice versa). Convergence of random walks, functional
central limit theorem.
- Stochastic Calculus, Diffusions, and Spectra, 27 February--27 March
- Diffusions. Wiener measure; non-differentiability of almost all continuous
curves. Heuristic approach to stochastic integrals (via Euler's method).
Rigorous approach to stochastic integrals. Examples. Ito's formula for change
of variables. Stochastic differential equations, existence and uniqueness of
solutions. Physical Brownian motion: the Langevin equation, Ornstein-Uhlenbeck
processes. More on SDEs: diffusions, forward (Fokker-Planck) and backward
equations. White noise. Spectral analysis; how the white noise lost its
color. Mean-square ergodicity. Small-noise limits for SDEs: convergence in
probability to ODEs, and our first large-deviations calculations.
- Ergodic Theory, 29 March--10 April
- Introduction to ergodic properties and invariance. The almost-sure
(Birkhoff) ergodic theorem. Metric transitivity. Examples of ergodic
processes. Ergodic decompositions. More on ergodic decompositions. Ergodic
components as minimal sufficient statistics. Mixing. Convergence in
distribution and decay of correlations. Central limit theorem for strongly
mixing sequences.
- Special lecture by Leo Kontorovich: "Strong Mixing and Concentration of
Measure"
- Information Theory, 12--17 April
- Introduction to information theory. Relations between Shannon entropy,
relative entropy/Kullback-Leibler divergence, expected likelihood and Fisher
information. Entropy rate. The asymptotic equipartition property, a.k.a. the
Shannon-McMillan-Breiman theorem, a.k.a. the entropy ergodic theorem.
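Since the homework emphasizes simulation, here is a minimal Python sketch of the entropy ergodic theorem (the two-state chain and its transition matrix are illustrative choices, not course material): along a long trajectory, -(1/n) log P(X_1, ..., X_n) settles on the entropy rate.

```python
import numpy as np

rng = np.random.default_rng(36754)

# Illustrative two-state Markov chain (the matrix is an arbitrary choice).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Stationary distribution: left eigenvector of P for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi /= pi.sum()

# Entropy rate in nats: sum_i pi_i * H(P[i, :]).
entropy_rate = -np.sum(pi[:, None] * P * np.log(P))

# Simulate a trajectory and compute -(1/n) log P(X_1, ..., X_n); by the
# chain rule this is (up to the initial term) the average of the
# per-step log transition probabilities.
n = 200_000
x = np.empty(n, dtype=int)
x[0] = 0
for t in range(1, n):
    x[t] = rng.random() < P[x[t - 1], 1]  # move to state 1 with prob P[x, 1]
log_lik = np.log(P[x[:-1], x[1:]]).sum()
print(-log_lik / n, entropy_rate)  # the two numbers should be close
```

The chain rule is what reduces the joint log-probability to a time average of a function of consecutive states, which is why the ergodic theorem applies.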
- Large Deviations Theory, 24 April--3 May
- Large deviations principles and rate functions; Varadhan's Lemma. Breeding
LDPs: contraction principle, "exponential tilting", Bryc's Theorem, projective
limits. IID large deviations: cumulant generating functions, the Legendre
transform, the return of relative entropy; Cramér's theorem on large deviations
of empirical means, Sanov's theorem on large deviations of empirical measures,
process-level large deviations. Large deviations for Markov sequences via
exponential-family densities. Large deviations for weakly-dependent sequences
(Gärtner-Ellis theorem). Large deviations of stochastic differential equations
in the small-noise limit (Freidlin-Wentzell theory). Large deviations in
hypothesis testing and in nonlinear filtering.
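The IID case above can be made concrete with a minimal Python check of Cramér's theorem for fair coin flips (the choice of n and threshold a is illustrative): the rate function is the relative entropy I(a) = a log(2a) + (1-a) log(2(1-a)), and the exact binomial tail already shows -(1/n) log P(S_n/n >= a) close to I(a) at moderate n.

```python
import math

# Cramer rate function for fair coin flips: I(a) is the relative entropy
# D(Bernoulli(a) || Bernoulli(1/2)) = a log(2a) + (1-a) log(2(1-a)).
def rate(a):
    return a * math.log(2 * a) + (1 - a) * math.log(2 * (1 - a))

n, a = 1000, 0.7
k0 = round(n * a)
# Exact tail probability P(S_n >= n*a) for S_n ~ Binomial(n, 1/2),
# computed with big-integer arithmetic to avoid floating-point underflow.
tail = sum(math.comb(n, k) for k in range(k0, n + 1))
p = tail / 2**n
print(-math.log(p) / n, rate(a))  # ~0.085 vs I(0.7) ~ 0.082
```

The remaining gap comes from the sub-exponential prefactor and shrinks like log(n)/n.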
Lecture Notes in PDF
Notes will be posted here before each lecture. The notes from last year
are available. However, I will be revising them as we go
along, and the division into lectures will differ somewhat, so you
shouldn't rely on them as a substitute for coming to class!
- Including a running list of all definitions, lemmas, theorems, corollaries,
examples and exercises to date.
- Chapter 1: Stochastic Processes and
Random Functions (16 January)
- Stochastic processes as indexed collections of random variables and as random functions. Sample paths and constraints on them. Examples.
- Chapter 2: Building Infinite Processes from Finite-Dimensional Distributions (18 January)
- Finite-dimensional distributions of a process. Theorems of Daniell and
Kolmogorov on extending finite-dimensional distributions to processes.
- Chapter 3: Building Infinite Processes by Recursive Conditioning (23 January)
- Probability kernels and regular conditional probabilities. Theorem
of Ionescu Tulcea on constructing processes from regular conditional
probabilities.
- Chapter 4: One-Parameter Processes (23 January)
- One-parameter processes; examples thereof. Representation of one-parameter
processes in terms of shift operators.
- Chapter 5: Stationarity and Dynamics (25 January)
- Strong, weak, and conditional stationarity. Stationarity as
shift-invariance. Measure-preserving transformations and stationary processes.
- Chapter 6: Random Times and Recurrence (30 January)
- Reminders about filtrations and stopping times. Waiting times of various
sorts, especially recurrence times. Poincaré and Kac recurrence
theorems. "The eternal return of the same" and its statistical implications.
- Chapter 7: Continuity (1 February)
- Kinds of continuity for stochastic processes. Versions and modifications
of stochastic processes. Benefits of continuous sample paths, and an example
of the impossibility of deducing them from the finite dimensional distributions
alone. Separable random functions.
- Chapter 8: More on Continuity (1 February)
- Existence of separable modifications of a stochastic process (in detail).
Idea of measurable modifications. Conditions for the existence of measurable,
cadlag and continuous modifications.
- Chapter 9: Markov Processes (6 February)
- Definition and meaning of the Markov property. Transition probability
kernels. Existence of Markov processes with specified transitions. Invariant
distributions. Dependence of the Markov property on filtrations.
- Chapter 10: Two Views of Markov Processes (8 February)
- Markov sequences as transformations of noise; transducers. Markov
processes as collections of operators: Markov operator semi-groups evolve the
distribution of states, and their adjoint operators evolve the "observables",
the bounded measurable functions of the state. Some functional-analytic facts
about conjugate spaces and adjoint operators.
- Chapter 11: Examples of Markov Processes (13 February)
- The logistic map as an example of turning nonlinear, deterministic
dynamical systems into linear Markov operators. The Wiener process as an
example of finding the transition kernels and time-evolution operators.
Generalization of the Wiener process example to other processes with stationary
and independent increments, and connections to limiting distributions of sums
of IID random variables.
- Chapter 12: Generators (15 February)
- The generators of the semi-groups associated with Markov processes: analogy
with exponential functions, how to find generators starting from semi-groups,
some uses of generators for solving differential equations, Laplace
transforms and resolvents. Hille-Yosida theorem on building semi-groups from
generators.
- Chapter 13: Strongly Markovian
Processes and Martingale Problems (20 February)
- The strong Markov property is being Markovian even at random times. An
example of how a Markov process can fail to be strongly Markovian. The concept
of a "martingale problem". Relationship between solutions of martingale
problems and strong Markov processes.
- Chapter 14: Feller Processes (22 February)
- Clarifications on initial states and distributions of Markov processes;
Markov families, the probability kernel from initial states to paths.
Definition of Feller processes and its physical motivations; reformulation in
terms of semi-groups; unique correspondence between Feller processes and their
generators. Attributes of Feller processes: cadlag sample paths, strong Markov
property, Dynkin's formula.
- Chapter 15: Convergence of Feller Processes (27 February)
- Weak convergence of stochastic processes; hints as to the Skorokhod
topology on cadlag functions; necessary and sufficient, and merely sufficient,
conditions for convergence in distribution of cadlag processes. Convergence
in distribution of Feller processes. Convergence of discrete-time Markov
processes on Feller processes. Convergence of Markov processes on ordinary
differential equations.
- Chapter 16: Convergence of Random Walks
(27 February and 1 March)
- The Wiener process as a Feller process. Continuous-time random walks.
Convergence of random walks to the Wiener process via the Feller-process
machinery; via direct use of the theorems on weak convergence.
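The convergence of random walks is easy to see numerically. This minimal Python sketch (NumPy assumed; the path counts are arbitrary) applies the Donsker rescaling W_n(t) = S_{[nt]}/sqrt(n) to simple random walks and checks the time-one marginal against N(0, 1).

```python
import numpy as np

rng = np.random.default_rng(1905)

# Donsker rescaling: for a simple random walk S_k with +/-1 steps,
# W_n(t) = S_[nt] / sqrt(n) converges in distribution to a Wiener process.
n_steps, n_paths = 1000, 5000
steps = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
endpoints = steps.sum(axis=1) / np.sqrt(n_steps)  # W_n(1) for each path

print(endpoints.mean(), endpoints.var())  # ~0 and ~1, the N(0,1) moments
print(np.mean(endpoints <= 1.0))          # ~Phi(1), about 0.84
```

Checking a single marginal is of course weaker than the functional theorem, but it is the part a two-line simulation can verify.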
- Chapter 17: Diffusions and the Wiener
Process (6 March)
- Definition of diffusions. The Wiener process as the prototypical
diffusion. Resume of the Wiener process's properties. Wiener processes with
respect to arbitrary filtrations. Gaussian processes. Wiener measure.
Non-differentiability of almost-all continuous functions.
- Chapter 18: Preview of Stochastic
Integrals (8 March)
- Why we want stochastic integrals. A heuristic approach via Euler's
method (the Euler-Bernstein scheme).
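The Euler heuristic can be tried directly. In this minimal Python sketch (NumPy assumed; the grid size is arbitrary), left-endpoint Riemann sums against a simulated Wiener path reproduce the Ito value of the integral of W dW, correction term and all.

```python
import numpy as np

rng = np.random.default_rng(1827)

# Left-endpoint (Euler) Riemann sums for the Ito integral of W dW on [0,1].
# Ito's formula gives the exact answer (W(1)^2 - 1)/2; the "-1" is the
# Ito correction an ordinary Riemann integral would not have.
n = 100_000
dW = rng.normal(0.0, np.sqrt(1.0 / n), size=n)  # Wiener increments
W = np.concatenate(([0.0], np.cumsum(dW)))      # path on the grid

euler_sum = np.sum(W[:-1] * dW)   # sum of W(t_i) * (W(t_{i+1}) - W(t_i))
ito_value = (W[-1] ** 2 - 1.0) / 2.0
print(euler_sum, ito_value)       # should nearly agree
```

Evaluating the integrand at the left endpoint is exactly what makes this the Ito (rather than Stratonovich) integral.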
- Chapter 19: Stochastic Integrals
and Stochastic Differential Equations (20--29 March)
- Rigorous approach to stochastic integrals, after Ito. Ito integrals of
"elementary" processes; extension to a wider class of integrands via
approximation. Ito's isometry. Some simple but instructive examples. Ito
processes. Ito's formula for change of variables. Stratonovich integrals
(briefly). Representation of nice martingales as Ito integrals. Stochastic
differential equations: existence and uniqueness of solutions. A more
realistic model of Brownian motion, leading to a stochastic differential
equation (the Langevin equation) and Ornstein-Uhlenbeck processes.
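A minimal Euler-Maruyama sketch of the Langevin equation (Python with NumPy; gamma, sigma, and the step size are illustrative choices) shows the Ornstein-Uhlenbeck process relaxing to its stationary Gaussian law N(0, sigma^2/(2*gamma)).

```python
import numpy as np

rng = np.random.default_rng(1930)

# Euler-Maruyama for the Langevin equation dX = -gamma*X dt + sigma dW.
# The stationary distribution of the Ornstein-Uhlenbeck process is
# N(0, sigma^2 / (2*gamma)); with these parameters that variance is 1.
gamma, sigma = 1.0, np.sqrt(2.0)
dt, n = 0.01, 500_000
x = np.empty(n)
x[0] = 0.0
noise = rng.normal(0.0, np.sqrt(dt), size=n - 1)
for t in range(n - 1):
    x[t + 1] = x[t] - gamma * x[t] * dt + sigma * noise[t]

burn = n // 10  # discard the transient before estimating moments
print(x[burn:].mean(), x[burn:].var())  # ~0 and ~ sigma^2/(2*gamma) = 1
```

The discretization itself biases the stationary variance by O(dt), which is why a small step size matters here.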
- Chapter 20: More on Stochastic
Differential Equations (29 March, 3 and 5 April)
- Solutions of SDEs are Feller diffusions (as they solve martingale
problems). The Kolmogorov "forward" and "backward" equations, for the
evolution of the probability density and the observables, respectively.
Examples of the forward or Fokker-Planck equation and its solution.
- Chapter 21: A First Look at Small-Noise
Limits of Stochastic Differential Equations (10 April)
- SDEs defined by adding a small amount of white noise to ODEs. Solutions of
the SDEs converge in distribution on the solutions of the ODE as the noise goes
to zero (via Feller properties). An exponential upper bound on the probability
of given magnitude of deviation between the solutions. Preview of coming
attractions in large deviations theory.
- Chapter 22: Spectral Analysis and
Mean-Square Ergodicity (12--24 April)
- "White noise" as a linear functional; its description as a generalized
stochastic process (mean zero, uncorrelated, Gaussian) and equivalence to the
Ito integral. Spectral representation of stochastic processes, especially
weakly stationary ones, or, random Fourier transforms. The Wiener-Khinchin
theorem linking the spectrum to the autocorrelation function. How the white
noise lost its color. Our first ergodic theorem: convergence of time averages
in mean square. A first proof assuming rapid decay of correlations. A
stronger second proof based on the spectral representation.
- Chapter 23: Ergodic Properties (26 April and 1 May)
- Ideological remarks on ergodic theory. Dynamical systems and their
invariants; connections to Markov processes. Ergodic properties of
functions and ergodic limits. Asymptotic mean stationarity.
- Chapter 24: Birkhoff's Ergodic
Theorem (4 May)
- The almost-sure or individual ergodic theorem of Birkhoff: in
asymptotically mean-stationary systems, the time average of a well-behaved
observable converges to its expectation, with probability 1.
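As a minimal numerical companion (Python; the observable and rotation angle are illustrative choices), an irrational rotation of the circle, which preserves Lebesgue measure and is ergodic, shows the theorem in action: the time average along a single orbit settles on the space average.

```python
import math

# Irrational rotation T(x) = (x + alpha) mod 1 preserves Lebesgue measure
# and is ergodic for irrational alpha, so Birkhoff's theorem says the time
# average of an integrable observable along one orbit converges to its
# space average. For f(x) = cos(2*pi*x) that space average is 0.
alpha = math.sqrt(2) - 1.0
x, total, n = 0.1, 0.0, 100_000
for _ in range(n):
    total += math.cos(2.0 * math.pi * x)
    x = (x + alpha) % 1.0
print(total / n)  # close to the space average 0
```

For this particular observable the partial sums are a geometric series, so the convergence is even faster (O(1/n)) than the theorem guarantees.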
- References, confined to works explicitly cited to date.
- The entire set of notes.
Homework
Exercises 1.1, 3.1, 3.2, 4.1 and 4.2 from the notes. Due 2 February.
- Exercises 6.3, 9.5, 9.6, 9.7, 9.9, 10.4, 11.5, 11.6, 11.8, 11.10, 12.3.
Due 2 March.
- A separate PDF file. Due 14 May.
An RSS feed for the notes is available.