10-702 Statistical Machine Learning, Spring 2007

Instructor: Larry Wasserman
Time: MW 1:30-2:50
Place: Wean Hall 4623
TA: Jingrui He
Office hours: Thursdays 10-12
Place: Wean Hall 8102
TA: Robin Sabhnani
Office hours: Thursdays 6:30-8:30pm
Place: Newell Simon Hall 3122
Course secretary: Sharon Cavlovich
Office: Wean Hall 5315

Introduction to Statistical Learning Theory (Bousquet, Boucheron and Lugosi)

Midterm: Monday during class time in the usual classroom

Practice Test

Course description
Statistical Machine Learning is a second graduate-level course in machine learning, assuming students have taken Machine Learning (10-701) and Intermediate Statistics (36-705). The term "statistical" in the title reflects the emphasis on statistical analysis and methodology, which is the predominant approach in modern machine learning.
The course combines methodology with theoretical foundations. It is intended for students who want to practice the "art" of designing good learning algorithms and also understand the "science" of analyzing an algorithm's statistical properties and performance guarantees. Theorems are presented together with practical aspects of methodology and intuition, to help students develop tools for selecting appropriate methods and approaches to problems in their own research. The course includes topics in statistical theory that are now becoming important for researchers in machine learning, including consistency, minimax estimation, and concentration of measure.

Prerequisites
Machine Learning 10-701 and Intermediate Statistics 36-705, or Probability and Statistics 36-725 and 36-726.
The syllabus includes information about assignments, exams and grading.
Zoubin's Lectures
Lecture 1 Lecture 2 Lecture 3 Lecture 4 Lecture 5

Lecture Notes
There is no required text for the course. Lecture notes will be regularly distributed (but not posted on the web). These are draft chapters and sections from a book in progress.
Comments, corrections, and other input on the drafts are highly encouraged.
Secondary References:
Chris Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer-Verlag, 2001.

Assignments
Assignments are due on Fridays at 3:00 p.m. Hand in each assignment at the course secretary's office in Wean Hall 4609.

Assignment 1: due Friday, Jan 25, 3:00 p.m.
Assignment 2: due Friday, Feb 8, 3:00 p.m.
Assignment 3: due Friday, Feb 22, 3:00 p.m.
Data sets: training data and test data
In R, type: load("training.data"); load("test.data")
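
The load step above can be sketched a little more defensively. This is only an illustration, and it assumes the two files have been downloaded into the current R working directory under the names training.data and test.data. The names of the objects stored inside the files are not given on this page, so the sketch prints whatever load() reports instead of assuming them.

```r
# Assumption: training.data and test.data sit in the current working
# directory (getwd() shows where R is looking).
for (f in c("training.data", "test.data")) {
  if (file.exists(f)) {
    # load() restores the saved objects into the workspace and
    # invisibly returns a character vector of their names.
    obj_names <- load(f)
    cat(f, "contains:", paste(obj_names, collapse = ", "), "\n")
  } else {
    cat(f, "not found in", getwd(), "\n")
  }
}
```

Once the object names are known, str() applied to each object shows its dimensions and column types.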

Assignment 4: due Friday, March 21, 3:00 p.m.

Assignment 5: due Friday, April 11, 3:00 p.m.

Assignment 6: due Friday, May 9, 3:00 p.m.

Solutions for Assignment 1:
Solutions for Assignment 2:
Solutions for Midterm:
Solutions for Assignment 3:
Solutions for Assignment 4:
Solutions for Assignment 5:

Code

Model Selection in R
Nonparametric Classification in R

The R language and code

Download R
R tutorial (Postscript)
R tutorial (pdf)
Introduction to R
A free R manual
R for Beginners by Emmanuel Paradis (Postscript)
R for Beginners by Emmanuel Paradis (pdf)
Handy R reference card (Postscript)
Handy R reference card (pdf)
How to manipulate matrices in R
Nonparametric local regression using locfit

PCA example

Data

UCI Machine Learning Repository

Papers

Topics
Topics will be chosen from the following basic outline, as announced in class.

Statistical Theory: Maximum likelihood, Bayes, minimax, Parametric versus Nonparametric Methods, Bayesian versus Non-Bayesian Approaches, classification, regression, density estimation.

Convexity and optimization: Convexity, conjugate functions, unconstrained and constrained optimization, KKT conditions.

Parametric Methods: Linear Regression, Model Selection, Generalized Linear Models, Mixture Models, Classification (linear, logistic, support vector machines), Graphical Models, Structured Prediction, Hidden Markov Models.

Sparsity: High Dimensional Data and Sparsity, Basis Pursuit and the Lasso Revisited, Sparsistency, Consistency, Persistency, Greedy Algorithms for Sparse Linear Regression, Sparsity in Nonparametric Regression, Sparsity in Graphical Models, Compressed Sensing.

Nonparametric Methods: Nonparametric Regression and Density Estimation, Nonparametric Classification, Boosting, Clustering and Dimension Reduction, PCA, Manifold Methods, Principal Curves, Spectral Methods, The Bootstrap and Subsampling, Nonparametric Bayes.

Advanced Theory: Concentration of Measure, Covering numbers, Learning theory, Risk Minimization, Tsybakov noise, minimax rates for classification and regression, surrogate loss functions, boosting, sparsistency, Minimax theory.

Kernel methods: Mercer kernels, reproducing kernel Hilbert spaces, relationship to nonparametric statistics, kernel classification, kernel PCA, kernel tests of independence.

Computation: The EM Algorithm, Simulation, Variational Methods, Regularization Path Algorithms, Graph Algorithms.

Other Learning Methods: Functional Data, Semi-Supervised Learning, Reinforcement Learning, Minimum Description Length, Online Learning, The PAC Model, Active Learning.

Course Calendar

Week of:		Mon	Wed	Friday
January	14	Stat Theory Review	Stat Theory Review
	21	No Class (MLK day)	Convexity/Optimization	Homework 1
	28	Linear Models	Model Selection
February	4	Model Selection	Linear Classification	Homework 2
	11	Mixtures	Graphical Models	Project Proposal
	18	Graphical Models	Nonparametric Regression	Homework 3
	25	Nonparametric Classification	Nonparametric Classification
March	3	EXAM	No Class
	10	No Class	No Class
	17	Advanced Theory	Advanced Theory	Homework 4
	24	Advanced Theory	Dimension Reduction
	31	Dimension Reduction	The Bootstrap	Progress report
April	7	Kernel Methods	Kernel Methods	Homework 5
	14	Nonparametric Bayes	Nonparametric Bayes
	21	Computation	Computation
	28	Computation	Last Class	Submit Project
May	5			Homework 6