Historical advances in machine learning

(history, philosophy, statistics, optimization, probability, and more)

Location/time (in person): GHC 5222 from 3pm to 6pm ET, every Wednesday.

This class may be recorded for internal class use only.

Instructor: Aaditya Ramdas, TA: Stephanie Milani

Course description: We will read (before class) and discuss (in class) a variety of historically important papers in ML (and to some extent AI). Not all of these were initially published in the ML/AI literature (e.g., Bellman in math, VC in probability, bandits in statistics, fuzzy sets in control, optimization work in OR), but they now play central roles in ML and/or AI.

Since “historical” is always ambiguous, we’re going to go with “presented/published before the instructor was born” as a definition (pre-1988). While the content of the paper will be the primary focus, we will also attempt to understand the research context in which the paper was written.

For example, what questions were other researchers asking at the time? Was the paper immediately recognized as a breakthrough, or did it take a long time? Do we view the contents of the paper today as “obvious in hindsight”, or is there still a lot of material in the paper that is nontrivial and even surprising or underappreciated? Who was the author? Were they already relatively well known when they wrote the paper, or was it the paper itself that made them famous? What else did these authors work on before/after the paper?

Target audience: PhD students who are interested in some subset of these topics.

Prerequisites: Basic graduate-level (or advanced undergraduate) training in machine learning, statistics, and probability. This is not an introductory course in machine learning and presumes that you have already completed a full course in ML (like 601, 701, 715, etc.). The course targets PhD students in any department with the appropriate background, but advanced master's or undergraduate students are also welcome.

This course aims to develop skills orthogonal to those of many other ML classes, such as reading papers, writing reviews, speaking, and discussion. The class will thus have a higher weekly reading, class participation, speaking, and writing load than most other ML classes, but a lower exam load.

Course relevance: For students pursuing (or intending to pursue) research in ML/AI, it is useful to know the roots of our field and to understand what makes big ideas big and how their authors communicated them. This will result in a deeper appreciation of our own field, and it may also help you understand certain things that are hard to teach, like how to pick and work on important problems and how to judge whether your solution is any good. You will be surprised by how rich some of the “old papers” are. Last, a lot is often lost in translation, and it is common wisdom to read the original authors themselves if possible; doing so will build the breadth and depth needed to process fundamental works from many subareas of ML.